Over the past few years, Python has become hugely popular as a programming language in the data science world. Many banks and pharmaceutical organisations have started using Python, and some of them are in a transition stage, migrating their SAS code libraries to Python.
Many large organisations have been using SAS since the early 2000s and have developed hundreds of SAS programs for tasks ranging from data extraction to model building and validation. Migrating that SAS code to any other programming language is therefore a marathon task. Migration can only be done in phases, so that day-to-day tasks are not disrupted by the development and testing of Python code. Because Python is open source, maintaining existing code can sometimes be difficult. Some SAS procedures are very robust and powerful, and their Python equivalents have not yet been implemented; replicating them may be doable, but there is no straightforward way for the average developer or analyst.
Do you wish to run both SAS and Python programs in the same environment (IDE)? If so, you are not the only one; many analysts have wanted the same. It is now possible via a Python package called saspy, developed by SAS. It lets you transfer data between a pandas DataFrame and a SAS dataset. Imagine you have data in a pandas DataFrame and want to run a SAS statistical procedure on it without switching between the SAS and Python environments.
Access to SAS Software for free
First and foremost, you need access to SAS, either via the cloud or via a server/desktop version of the software.
If you don't have SAS software, you don't need to worry. You can get it for free, without installation, via SAS OnDemand for Academics. It is available for free to everyone (not restricted to students or academicians). It includes access to all the commonly used SAS modules such as SAS/STAT, SAS/ETS, SAS SQL etc. You just need to register once, and it does not take more than 5 minutes.
The saspy Python package has the following dependencies:
- Python 3.4 or higher
- SAS 9.4 or higher
Steps to access SAS in Python (Jupyter)
Please follow the steps below to make SAS run in Jupyter Notebook.
To install the saspy package, run the following command in a Jupyter notebook cell.
!pip install saspy
The following program connects SAS OnDemand for Academics with Python.
import saspy
sas = saspy.SASsession(java='C:\\Program Files (x86)\\Java\\jre1.8.0_221\\bin\\java.exe',
                       iomhost=['odaws01-apse1.oda.sas.com','odaws02-apse1.oda.sas.com'],
                       iomport=8591,
                       encoding='utf-8')
sas
You need to make two changes in this step.
- It requires Java 7 or higher installed on your system. If Java is already installed, you will find it in the Program Files folder where your software is installed. Make sure to change the file location specified in the java= argument above.
- The host name of SAS OnDemand for Academics needs to be listed in the iomhost argument. The host name varies depending on your region. Open SAS OnDemand for Academics and check your region (it appears at the top right after you log in).
#US Home Region
iomhost = ['odaws01-usw2.oda.sas.com','odaws02-usw2.oda.sas.com','odaws03-usw2.oda.sas.com','odaws04-usw2.oda.sas.com']
#European Home Region
iomhost = ['odaws01-euw1.oda.sas.com','odaws02-euw1.oda.sas.com']
#Asia Pacific Home Region
iomhost = ['odaws01-apse1.oda.sas.com','odaws02-apse1.oda.sas.com']
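The two changes above can be wrapped in a small helper. The sketch below is only an illustration: build_session_kwargs and ODA_HOSTS are hypothetical names, not part of saspy. The host lists are the ones quoted above, and shutil.which is used to look for Java on the PATH when no explicit location is given.

```python
import shutil

# Host names copied from the region lists above; confirm your own region
# in SAS OnDemand for Academics before relying on them.
ODA_HOSTS = {
    "us": ["odaws01-usw2.oda.sas.com", "odaws02-usw2.oda.sas.com",
           "odaws03-usw2.oda.sas.com", "odaws04-usw2.oda.sas.com"],
    "europe": ["odaws01-euw1.oda.sas.com", "odaws02-euw1.oda.sas.com"],
    "asia_pacific": ["odaws01-apse1.oda.sas.com", "odaws02-apse1.oda.sas.com"],
}


def build_session_kwargs(region, java_path=None):
    """Return keyword arguments for saspy.SASsession (hypothetical helper)."""
    if region not in ODA_HOSTS:
        raise ValueError(f"Unknown region {region!r}; choose from {sorted(ODA_HOSTS)}")
    # Fall back to whatever 'java' is on the PATH if no location is given.
    java = java_path or shutil.which("java")
    if java is None:
        raise FileNotFoundError("Java not found; pass java_path explicitly")
    return {
        "java": java,
        "iomhost": list(ODA_HOSTS[region]),
        "iomport": 8591,
        "encoding": "utf-8",
    }


# Usage (needs saspy installed and a SAS OnDemand for Academics account):
# import saspy
# sas = saspy.SASsession(**build_session_kwargs("asia_pacific"))
```

Keeping the region lists in one dictionary means that only the region key changes when you move the notebook between accounts.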
When you run the connection program above, it asks for your SAS OnDemand for Academics username and password. Once you enter both, it shows a message like the one below.
Using SAS Config named: default
Please enter the IOM user id: deepanshu
Please enter the password for IOM user : ········
SAS Connection established. Subprocess id is 3608

Access Method = IOM
SAS Config name = default
SAS Config file = C:\Users\DELL\Anaconda3\lib\site-packages\saspy\sascfg.py
WORK Path = /saswork/SAS_work84BB00000491_odaws02-apse1.oda.sas.com/SAS_work0D4300000491_odaws02-apse1.oda.sas.com/
SAS Version = 9.04.01M6P11072018
SASPy Version = 3.6.4
Teach me SAS = False
Batch = False
Results = Pandas
SAS Session Encoding = utf-8
Python Encoding value = utf-8
SAS process Pid value = 1169
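Once the connection is established, you can run SAS code in a notebook cell using the %%SAS magic followed by the session name.
%%SAS sas
proc print data=sashelp.cars;
run;
It returns the printed sashelp.cars dataset as output. You can also run the same procedure with the code below. It is equivalent to the program above, just a different style of writing and executing SAS commands via saspy.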
sas.submitLST("proc print data=sashelp.cars; run;", method='listorlog')
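If you submit many similar steps through sas.submitLST, it can help to assemble the SAS code in Python first. The following is a minimal sketch: proc_print is a hypothetical helper, not a saspy function, and the name check assumes standard SAS naming rules (letters, digits and underscores, at most 8 characters for a libref and 32 for a dataset name).

```python
import re

# Standard SAS names: start with a letter or underscore, then letters,
# digits or underscores.
_SAS_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def proc_print(table, libref="work", obs=None):
    """Build a PROC PRINT step as a string (hypothetical helper)."""
    if not _SAS_NAME.fullmatch(libref) or len(libref) > 8:
        raise ValueError(f"Invalid libref: {libref!r}")
    if not _SAS_NAME.fullmatch(table) or len(table) > 32:
        raise ValueError(f"Invalid dataset name: {table!r}")
    # Optionally limit the number of printed rows with the OBS= option.
    option = f" (obs={int(obs)})" if obs is not None else ""
    return f"proc print data={libref}.{table}{option}; run;"


code = proc_print("cars", libref="sashelp", obs=5)
print(code)
# With an open session named sas, as created earlier:
# sas.submitLST(code, method='listorlog')
```

Rejecting unexpected characters in the names keeps a typo, or untrusted input, from turning into arbitrary SAS code.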
The df2sd function converts a pandas DataFrame to a SAS dataset.
import pandas as pd
pandasdf = pd.read_csv("deals.csv")
sasdf = sas.df2sd(pandasdf, 'sasdf')
sas.submitLST("proc print data=work.sasdf (obs=5);run;", method='listorlog')
The sd2df function converts a SAS dataset to a pandas DataFrame.
pandasdf2 = sas.sd2df(sasdf.table)
pandasdf2.head()
You can also summarise the pandas DataFrame using pandasdf2.describe().
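Column names that contain spaces or symbols can be awkward to reference in SAS code after a DataFrame has been sent over with df2sd. As a hedged illustration, the sketch below renames columns to conventional SAS variable names (letters, digits and underscores, not starting with a digit, at most 32 characters). sas_safe_name is a hypothetical helper and the column labels are made up; saspy may already handle unusual names for you, so treat this as optional tidying.

```python
import re


def sas_safe_name(name, max_len=32):
    """Turn an arbitrary column label into a conventional SAS variable name."""
    # Replace each run of characters outside letters, digits and underscore.
    cleaned = re.sub(r"[^A-Za-z0-9_]+", "_", str(name))
    # SAS names cannot start with a digit (and must not be empty).
    if not cleaned or cleaned[0].isdigit():
        cleaned = "_" + cleaned
    return cleaned[:max_len]


columns = ["Deal Size ($)", "2019 sales", "region"]  # made-up labels
renamed = [sas_safe_name(c) for c in columns]
print(renamed)
# With pandas, the same mapping could be applied before df2sd, e.g.
# pandasdf = pandasdf.rename(columns=sas_safe_name)
```

Two different labels can map to the same cleaned name, so check the result for duplicates before transferring the data.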
How to run SAS in Google Colab
The step-by-step instructions above are mainly designed for running Python in Jupyter Notebook, which is the most commonly used interface for Python. Recently, Google Colab has become a go-to tool for data science for several reasons: it supports version control, saves notebooks in Google Drive, lets you work from anywhere, supports GPUs, etc. In simple words, it runs in the cloud, so you don't need to install Python or popular Python packages. Sharing code with your coworkers is also easy and effective via Colab. Java is already installed on Colab. You just need to specify the file location /usr/bin/java for the java argument in the connection program shown above.
import saspy
sas = saspy.SASsession(java='/usr/bin/java',
                       iomhost=['odaws01-apse1.oda.sas.com','odaws02-apse1.oda.sas.com'],
                       iomport=8591,
                       encoding='utf-8')
sas
Make sure to set iomhost according to your region, as described in the connection step above. The %%SAS sas magic does not work in Colab, so use sas.submitLST( ) instead, as shown below.
sas.submitLST("proc print data=sashelp.cars; run;", method='listorlog')
You can read external data from the /content/ location in Google Colab.
import pandas as pd
pandasdf = pd.read_csv("/content/sample_data/california_housing_train.csv")
sasdf = sas.df2sd(pandasdf, 'sasdf')
sas.submitLST("proc print data=work.sasdf (obs=5);run;", method='listorlog')
How to run saspy with SAS Enterprise Guide
The idea is to connect to the remote workspace server that SAS Enterprise Guide (EG) uses. You need the host name and port of the workspace server. Your EG login credentials can be used for authentication. See the syntax below and use it in saspy.SASsession( ), which is shown above in the first section of this article.
# Unix client and Unix IOM server NEW 2.1.6 - with load balanced object spawners
iomlinux = {'java'      : '/usr/bin/java',
            'iomhost'   : ['linux.grid1.iom.host','linux.grid2.iom.host','linux.grid3.iom.host','linux.grid4.iom.host'],
            'iomport'   : 8591,
            'appserver' : 'SASApp Prod - Workspace Server'
            }

# Unix client and Windows IOM server
iomwin = {'java'      : '/usr/bin/java',
          'iomhost'   : 'windows.iom.host',
          'iomport'   : 8591,
          'appserver' : 'SASApp Test - Workspace Server'
          }

# Windows client and Unix IOM server
winiomlinux = {'java'    : 'java',
               'iomhost' : 'linux.iom.host',
               'iomport' : 8591,
               }

# Windows client and Windows IOM server
winiomwin = {'java'    : 'java',
             'iomhost' : 'windows.iom.host',
             'iomport' : 8591,
             }

# Windows client and with IWA to Remote IOM server
winiomIWA = {'java'    : 'java',
             'iomhost' : 'some.iom.host',
             'iomport' : 8591,
             'sspi'    : True
             }
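One way to use such a dictionary, sketched below, is to unpack it into saspy.SASsession( ) as keyword arguments, in the same way the java, iomhost and iomport arguments were passed earlier. The host name is the placeholder from the listing above, and connect is a hypothetical helper. Whether every key is accepted as a keyword argument by your saspy version is an assumption you should verify.

```python
# Placeholder host name and port taken from the listing above;
# replace them with the values of your own workspace server.
winiomlinux = {
    "java": "java",
    "iomhost": "linux.iom.host",
    "iomport": 8591,
}


def connect(config):
    """Open a saspy session from a config dictionary (hypothetical helper)."""
    # Imported lazily so the dictionary can be inspected without saspy installed.
    import saspy
    return saspy.SASsession(**config)


# Usage (prompts for your EG user name and password):
# sas = connect(winiomlinux)
```

Keeping one dictionary per server makes it easy to switch between, say, a test and a production workspace server by changing a single argument.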