Import Data to SAS |
You can enter your lines of data directly in your SAS program by using a DATALINES statement.
Let's start out by clarifying the main keywords associated with the following program.
The keywords are as follows:
- DATA - The DATA step always begins with a DATA statement. The purpose of the DATA statement is to tell SAS that you are creating a new data set i.e. outdata.
- INPUT - Use the INPUT statement to define the variables used in the data set.
- Dollar sign ($) - The dollar sign ($) is used to identify a variable as character type.
- DATALINES - The DATALINES statement is used to indicate that the lines following it contain the actual data.
- PROC PRINT - The PROC PRINT statement is used to print out the contents of the data set in the output window.
- RUN - The DATA step ends with a RUN statement.
DATA outdata; INPUT age gender $ dept obs1 obs2 obs3; DATALINES; 1 F 3 17 6 24 1 M 1 19 25 7 3 M 4 24 10 20 3 F 2 19 23 8 2 F 1 14 23 12 2 M 5 1 23 9 3 M 1 8 21 7 1 F 1 7 7 14 3 F 2 2 1 22 1 M 5 20 5 2 3 M 4 21 8 18 1 M 4 7 9 25 2 F 5 10 17 20 3 F 4 21 25 7 3 F 3 9 9 5 3 M 3 7 21 25 2 F 1 1 22 13 2 F 5 20 22 5 ; proc print; run;
You can also use CARDS instead of DATALINES. Both means the same. There is no difference between these two keywords. See the program below -
DATA outdata;
INPUT age gender $ dept obs1 obs2 obs3;
CARDS;
1 F 3 17 6 24
;
proc print;
run;
Reading Delimited Data
The default delimiter is blank. If you have a data file with other delimiters such as comma or tab you need to define the delimiter before defining the variables using INFILE and DLM = options.
Syntax : Infile 'file-description' dlm=','
- For tab delimiter, the syntax would be infile 'file-description' dlm='09'x
- For colon delimiter, the syntax would be infile 'file-description' dlm=':'
DATA outdata; INFILE Datalines dlm =","; INPUT age gender $ dept obs1 obs2 obs3; Datalines; 1,F,3,17,6,24 1,M,1,19,25,7 3,M,4,24,10,20 3,F,2,19,23,8 2,F,1,14,23,12 ; proc print; run;
Importing External Data into SAS
Method I : PROC IMPORT
PROC IMPORT is a SAS procedure to import external files into SAS. It automates importing process. You don't need to specify variable type and variable length to import an external file. It supports various formats such as excel file, csv, txt etc.
1. Importing an Excel File into SAS
The main keywords used in the following program are :
1. OUT - To specify name of a data set that SAS creates. In the program below, outdata is the data set saved in work library (temporary library)
2. DBMS - To specify the type of data to import.
3. REPLACE - To overwrite an existing SAS data set.
4. SHEET - To import a specific sheet from an excel workbook
5. GETNAMES - To include variable names from the first row of data.
PROC IMPORT DATAFILE= "c:\deepanshu\sampledata.xls" OUT= outdata DBMS=xls REPLACE; SHEET="Sheet1"; GETNAMES=YES; RUN;
If you are using SAS University Edition
You need to find shared folder which is generally available as
Folders > My Folders
In SAS University edition, file location would be
DATAFILE = "/folders/myfolders/sampledata.xls"
2. Importing a Tab-Delimited File into SAS
The program below is similar to the code of importing excel file. The only difference is DBMS = DLM and delimter = '09'x.
PROC IMPORT DATAFILE= "c:\deepanshu\sampledata.txt" OUT= outdata DBMS=dlm REPLACE; delimiter='09'x; GETNAMES=YES; RUN;
3. Importing a Comma-Delimited File with TXT extension
To get comma separated file with a txt extension into SAS, specify delimeter = ','
PROC IMPORT DATAFILE= "c:\deepanshu\sampledata.txt" OUT= outdata DBMS=dlm REPLACE; delimiter=','; GETNAMES=YES; RUN;
4. Importing a Comma-Delimited File with CSV extension
To get comma separated file into SAS, specify DBMS= CSV
PROC IMPORT DATAFILE= "c:\deepanshu\sampledata.txt" OUT= outdata DBMS=csv REPLACE; GETNAMES=YES; RUN;
5. Importing a Space-Delimited File
To extract a space delimited file, specify delimiter = '20'x
PROC IMPORT DATAFILE= "c:\deepanshu\sampledata.txt" OUT= outdata DBMS=dlm REPLACE; delimiter='20'x; GETNAMES=YES; RUN;
6. Importing a file containing multiple delimiter
If two or more delimiters, such as comma and tabs, quote them following delimiter = option
PROC IMPORT DATAFILE= "c:\deepanshu\sampledata.txt" OUT= outdata DBMS=dlm REPLACE; delimiter=','09'x '; GETNAMES=YES; RUN;
Method II : Get External File - INFILE
In SAS, there is one more method called INFILE to import an external file. It's a manual method of importing an external file as you need to specify variables and its types and length.
1. Reading a CSV File
INFILE statement - To specify path where data file is saved.
DSD - To set the default delimiter from a blank to comma.
FIRSTOBS=2 : To tell SAS that first row contains variable names and data values starts from second row.
data outdata; infile 'c:\users\deepanshu\documents\book1.csv' dsd firstobs=2; input id age gender $ dept $; run;
2. Reading a TAB Delimited File
We can use DLM='09'x to tell SAS that we are going to import a tab delimited file. The TRUNCOVER statement tells SAS to assign the raw data value to the variable even if the value is shorter than expected by the INPUT statement.
data outdata;How to handle an external file :
infile 'c:\deepanshu\dummydata.txt' DSD dlm='09'x truncover;
input employee :$30. DOJ :mmddyy8. state :$20.;
run;
Using a FILENAME statement to handle an external file.
FILENAME sample 'c:\deepanshu\sampledata.csv' ;
DATA outdata;
infile sample dsd;
INPUT age gender $ dept obs1 obs2 obs3;
run;
KEEP POSTING AS YOUR ARTICLES REALLY HELP TO GET GROUNDING ON BASICS.
ReplyDeleteThanks for your appreciation.
DeleteVery usefull! Appreciate!
Deletethank you so much :-)
ReplyDeleteClear and Crisp explanation! You made learning SAS easy....i thought it would be very tough to learn.
ReplyDeleteThanks a lots! :)
Thank you for your lovely words. Cheers!
DeleteThanks for providing these valuable topics
ReplyDeleteGlad you found it useful. Thanks!
DeleteThis site is really awesome, to learn and i can say compare to all sites this is best and very informative, after reading basic from this site, my most of doubts are cleared,
ReplyDeleteThank you sir, please keep posting.
Thank you for your lovely words. Cheers!
DeleteThis site is really awesome, to learn and i can say compare to all sites this is best and very informative, after reading basic from this site, my most of doubts are cleared,
ReplyDeleteThank you sir, please keep posting.
Hi, can you please elaborate more about truncover and DSD in infile. Urgent!! Also, please explain with examples.
ReplyDeleteHey, brilliant job done.
ReplyDeleteCould you please help me out with Proc import if i wish to import selected data ?
You can specify RANGE= option in PROC IMPORT. For eg. RANGE="Sheet1$B2:D10"
Deletethank you for giving wedsite
ReplyDeleteThank you for stopping by my website. Cheers!
DeleteI follow u on quora , a big fan of urs Excellent work sir :)
ReplyDeleteThank you Abhishek. Your compliment gave me motivation to work harder and much better than before. Cheers mate!
DeleteHi Deepanshu,
ReplyDeleteI have worked as a data analyst in an e-commerce company for 2 years. But the tool i used for analysing is Excel now i want to shift to SAS. What should i do in this career shift process. I have few more doubts in getting placed should i do certification? only then will i get a job, because i don have hands on experience in SAS i practised it by myself with the help of internet. What are the things can be done to get a real time experience and get placed soon in a good job
Pls do reply am in a crisis kinda situation in my career
Thanks,
Vivek
Deepanshu,
ReplyDeleteOne doubt, In "Importing a Comma-Delimited File with CSV extension". The raw file extension should be .csv in infile statement. Am I right? please suggest.
No, I've tried it using txt file with dlm=csv.
Delete
ReplyDeleteHi ,
I just ran the first program and getting below error. Please help me out.
/pbr/biconfig/940/Lev1/Web/WebAppServer/SASServer1_1/temp/SASStudioSessions/84920dae-c29e-4c52-96cc-085a498c522b/74e55a26-d843-46e2-a137-40f76f12a6fc.sas (No such file or directory)
sir if data contains commas in id variable and one comma in same name variable or different how to write code for dlm dsd; its getting blank in varibles
ReplyDeleteid name gender
101,kk f
102, hh, m
103, hhj m
but in 2nd obs its getting blank in gender how to solve the query sir
sir can u please show me how to get output for two variables with where statement
ReplyDeletewhere cylinders=4;
where length=180;
but its taking only cylinders not executing the data of length in cars dataset
pls solve the problem sir
To check for for both criteria :
Deletewhere cylinders=4 and length=180;
This will keep only the observations having cylinders=4 and length=180
With Multiple 'where' statements only the last where statement is executed.
Deleteso its wise to use the condition with AND OR conditions.
Hi sir this is clarity explanation and it is good & easy to understand.. and it is helpful in learning stage. I following the this website daily.. so thank you...
ReplyDeletehi,sir u gave us very good examples but sir,you are mentioned text files in the import statement but where are the files sir,would you provide that dummy files
ReplyDeleteThanks for your informative articles...it helps me understand sas concepts more easily...Keep posting! Thanks again
ReplyDeleteThe code mentioned for "importing a file with multiple delimiter" is not working. i am using SAS EG 7.1 64bit
ReplyDeleteI tried to import csv file using infile option. In my csv file I have 3 dates variables. When I m importing sas shows "invalid data for date_variable1"
ReplyDeletePlease help to resolve this
excellent tutorials. where is the sample data files?? reading any articles facing the same problem.
ReplyDeletegreat content
ReplyDeleteThanks for your notes..
ReplyDeleteThe content is well structured and of great help. Thank you for creating this.
ReplyDeleteso mush helpful.
ReplyDeletehi i'm trying to read delimited file by import proceudre please help me with the code
ReplyDeleteproc import
datafile="C:\Users\Madhu\Documents\SOURCE DATA RAR FILES\SOURCE DATA\DLM-TXT\data5.txt"
out = work.data5
dbms =dlm
replace
;
delimiter='09'x;
getnames=no;
run;
this is the data
101 kiran kumar 34 male Asp 05mg Asian
102 pavan 45 male Asp 10mg African
103 laxmi knath 56 female Asp 15mg Asian
i tried with datastep option and it read the data values efficiently but im unable to read the values by import procedure
is it tab delimited or space deliimited file?
ReplyDelete