Importing Data into R

Deepanshu Bhalla
This tutorial explains how to get external data into R. It describes how to load data from various sources such as CSV, text, Excel, SAS or SPSS files.

Loading data into the tool is one of the first steps of any project. If you have just started using R, you will soon need to read in data from other sources.
Read Data into R
1. Reading a comma-delimited text file (CSV)

If you don't have the names of the variables in the first row
mydata <- read.csv("c:/mydata.csv", header=FALSE)
Note : In R, write file paths with forward slashes (c:/mydata.csv) or doubled backslashes (c:\\mydata.csv). A single backslash is treated as an escape character.

Important Note : Large CSV files are usually read much faster with the fread function of the data.table package.
library(data.table)
mydata = fread("c:/mydata.csv")
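As a rough sketch of the fread call above, the snippet below writes a tiny CSV to a temporary file and reads it back. The file contents and the names csv_path and mydata_df are made up for illustration, and it assumes the data.table package is already installed.

```r
library(data.table)  # assumes install.packages("data.table") has been run once

# Hypothetical sample file, written to a temporary location
csv_path <- tempfile(fileext = ".csv")
writeLines(c("id,score", "1,10", "2,20"), csv_path)

# fread() detects the separator and the header row on its own
mydata <- fread(csv_path)

# fread() returns a data.table; convert it if a plain data frame is needed
mydata_df <- as.data.frame(mydata)
class(mydata_df)
```

The result of fread is a data.table, which also behaves as a data frame, so the conversion step is optional in most code.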
If you have the header row in the first row
mydata <- read.csv("c:/mydata.csv", header=TRUE)
If you want to set any value to a missing value
mydata <- read.csv("c:/mydata.csv", header=TRUE, na.strings=".")
In this case, we have set "." (without quotes) to a missing value

If you want to set multiple values to missing values
mydata <- read.csv("c:/mydata.csv", header=TRUE, na.strings=c("A", "B"))
In this case, we have set "A" and "B" (without quotes) to missing values
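To see the effect of na.strings without needing a file on disk, here is a minimal, self-contained sketch. The sample rows and the name csv_path are invented for the example.

```r
# Hypothetical sample file, written to a temporary location
csv_path <- tempfile(fileext = ".csv")
writeLines(c("id,score,grade",
             "1,10,A",
             "2,.,B",
             "3,30,C"), csv_path)

# header = TRUE takes column names from the first row;
# na.strings = "." turns every "." field into NA
mydata <- read.csv(csv_path, header = TRUE, na.strings = ".")

str(mydata)                # inspect the column types
sum(is.na(mydata$score))   # count the missing scores
```

Because the "." is converted to NA while reading, the score column comes in as numeric rather than as text.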


2. Reading a tab-delimited text file

If you don't have the names (headers) in the first row
mydata <- read.table("c:/mydata.txt")
Note : Write the file path with forward slashes (or doubled backslashes), not single backslashes

If you have the names (headers) in the first row
mydata <- read.table("c:/mydata.txt", header=TRUE)

If you want to set any value to a missing value
mydata <- read.table("c:/mydata.txt", header=TRUE, na.strings=".")
In this case, we have set "." (without quotes) to a missing value

If you want to set multiple values to missing values
mydata <- read.table("c:/mydata.txt", header=TRUE, na.strings=c("A", "B"))
In this case, we have set "A" and "B" (without quotes) to missing values
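One thing worth knowing: by default read.table splits fields on any whitespace, so values that contain spaces can be misread. The sketch below passes sep = "\t" so that only tabs separate fields. The sample rows and the name txt_path are assumptions made for illustration.

```r
# Hypothetical tab-delimited file, written to a temporary location
txt_path <- tempfile(fileext = ".txt")
writeLines(c("name\tcity\tage",
             "Ann\tNew York\t34",
             "Bob\tSan Jose\t."), txt_path)

# sep = "\t" splits only on tabs, so "New York" stays in one field;
# na.strings = "." turns the "." in the age column into NA
mydata <- read.table(txt_path, header = TRUE, sep = "\t", na.strings = ".")

mydata$city
```

If every value in the file is free of spaces, the calls without sep shown above behave the same way.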


3. Reading Excel File

A simple way to read an Excel file is to save it in CSV format and import it using the CSV method
mydata <- read.csv("c:/mydata.csv", header=TRUE)

Alternatively, read the Excel file directly with the readxl package.
Step 1 : Install the package once
install.packages("readxl")

Step 2 : Define path and sheet name in the code below
library(readxl)
read_excel("my-old-spreadsheet.xls")
read_excel("my-new-spreadsheet.xlsx")
# Specify sheet with a number or name
read_excel("my-spreadsheet.xls", sheet = "data")
read_excel("my-spreadsheet.xls", sheet = 2)
# If NAs are represented by something other than blank cells,
# set the na argument
read_excel("my-spreadsheet.xls", na = "NA")
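The calls above print the data without storing it. A small sketch of assigning the result follows. It uses the sample workbook that ships with readxl (located with readxl_example) so it runs without a spreadsheet of your own, and it assumes readxl is installed. The name xlsx_path is just for illustration.

```r
library(readxl)  # assumes install.packages("readxl") has been run once

# readxl_example() returns the path to a sample workbook bundled with the package
xlsx_path <- readxl_example("datasets.xlsx")

# List the sheet names before choosing one
excel_sheets(xlsx_path)

# Assign the result so it can be used later; read_excel() returns a tibble
mydata <- read_excel(xlsx_path, sheet = 1)
head(mydata)
```

With your own file, replace xlsx_path with the path to the spreadsheet and pick the sheet by name or number.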
4. Reading SAS File

Step 1 : Install the package once
install.packages("haven")
Step 2 : Define path in the code below
library("haven")
mydata <- read_sas("c:/mydata.sas7bdat")

5. Reading SPSS File

Step 1 : Install the package once
install.packages("haven")
Step 2 : Define path in the code below
library("haven")
mydata <- read_spss("c:/mydata.sav")
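For readers without an SPSS file at hand, the following sketch creates one with haven's write_sav and reads it back with read_spss. The data values and the name sav_path are invented, and it assumes the haven package is installed.

```r
library(haven)  # assumes install.packages("haven") has been run once

# Hypothetical data written out as an SPSS .sav file in a temporary location
sav_path <- tempfile(fileext = ".sav")
write_sav(data.frame(id = 1:3, score = c(10, 20, 30)), sav_path)

# read_spss() chooses the reader from the file extension (.sav here)
mydata <- read_spss(sav_path)

# haven returns a tibble; convert it if a plain data frame is preferred
mydata <- as.data.frame(mydata)
mydata
```

read_sas from the previous section is used the same way: pass the file path and assign the result.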

6. Load Data from R
load("mydata.RData")
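An .RData file is created with save, and load restores the saved objects under their original names. A minimal round trip, using a made-up data frame and a temporary path (rdata_path is a hypothetical name), might look like this:

```r
# Hypothetical data frame to save
mydata <- data.frame(id = 1:3, score = c(10, 20, 30))

# Save it to an .RData file in the session's temporary directory
rdata_path <- file.path(tempdir(), "mydata.RData")
save(mydata, file = rdata_path)

# Remove the object, then restore it from the file
rm(mydata)
loaded <- load(rdata_path)  # load() returns the names of the restored objects

loaded
head(mydata)
```

Note that load does not need an assignment to bring the data back. The object reappears in the workspace as mydata, and the return value only lists what was restored.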
4 Responses to "Importing Data into R"
  1. In the reading the SAS file section, library should not have apostrophe within.
  2. Please how to import a BSON file in R ?
  3. Very informative and well explained.Thank you.