This tutorial explains multiple ways to read a large CSV file in R.
The R code below has been tested on CSV files up to 6 GB in size.
Method I : Reading Large CSV Files in R with data.table
The fread() function from the data.table package is used for reading data from files quickly and efficiently. The following code uses the data.table package in R to read data from a CSV file into a data table called "mydata".
library(data.table)
mydata = fread("C:\\Users\\Deepanshu\\Documents\\Testing.csv", header = TRUE)
"C:\\Users\\Deepanshu\\Documents\\Testing.csv": This is the file path where the CSV file is located. You may need to modify this path to match the actual location of your file on your computer. Make sure to use either two backward slashes or forward slash in the file path.
header = TRUE: This argument specifies that the first row of the CSV file contains column names. Setting header = TRUE ensures that the function reads the first row as column names and uses them to name the columns in the mydata data table.
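For very large files, fread() can also load just the part of the data you need. Below is a minimal sketch assuming the same file path as above and hypothetical column names "id" and "amount"; the select and nrows arguments are standard fread() options for reading a subset of columns or rows.

library(data.table)

# Read only the columns you need (reduces memory usage)
mydata_subset = fread("C:\\Users\\Deepanshu\\Documents\\Testing.csv",
                      header = TRUE,
                      select = c("id", "amount"))  # hypothetical column names

# Preview the first 1000 rows before committing to the full file
mypreview = fread("C:\\Users\\Deepanshu\\Documents\\Testing.csv",
                  header = TRUE,
                  nrows = 1000)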
Method II : Reading Large CSV Files in R with bigmemory
The read.big.matrix() function is used to read data from a CSV file into a big.matrix object, and the as.matrix() function is used to coerce the big.matrix to a regular matrix object. The type = "integer" argument specifies that the data in the CSV file should be read as integers.
library(bigmemory)
y <- read.big.matrix("C:\\Users\\Deepanshu\\Documents\\Testing.csv", type = "integer", header = TRUE)

# coerce a big.matrix to a matrix
mydata = as.matrix(y)
mydata = as.matrix(y): This line converts the big.matrix object y into a regular matrix object mydata using the as.matrix() function. The data from the big.matrix is now stored in a standard R matrix, allowing you to perform various data manipulations and analyses using regular matrix operations and functions.
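If the file is too large to hold comfortably in RAM, bigmemory can also keep the data on disk as a file-backed big.matrix. Below is a minimal sketch assuming the same file path; backingfile and descriptorfile are standard read.big.matrix() arguments, and attach.big.matrix() lets you reattach the matrix in a later session without re-reading the CSV. The backing file names used here are assumptions for illustration.

library(bigmemory)

# Create a file-backed big.matrix so the data stays on disk
y <- read.big.matrix("C:\\Users\\Deepanshu\\Documents\\Testing.csv",
                     type = "integer",
                     header = TRUE,
                     backingfile = "Testing.bin",      # binary backing file (assumed name)
                     descriptorfile = "Testing.desc")  # descriptor file (assumed name)

# In a later R session, reattach the same data without re-reading the CSV
y2 <- attach.big.matrix("Testing.desc")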