Two ways to score validation data in proc logistic

This article explains two ways to score a validation dataset with PROC LOGISTIC in SAS. In simple terms, scoring means using a model you have already fitted to make predictions for new data.

1. SCORE Option in PROC LOGISTIC

The SCORE option in PROC LOGISTIC is used to score new observations using a fitted logistic regression model. In other words, it applies coefficients of the model to new data to calculate predicted probabilities for those new observations.
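
As a rough sketch of what "applying the coefficients" means for the model used in this article: assuming age_flag and bmi_flag enter as numeric predictors (no CLASS statement is used), the predicted probability for a new observation has the form below. The symbols \hat{\beta}_0, \hat{\beta}_1 and \hat{\beta}_2 are notation introduced here for the estimated intercept and slopes; they do not appear in the original code.

```latex
\[
\hat{p} \;=\; \frac{1}{1 + \exp\!\bigl(-(\hat{\beta}_0 + \hat{\beta}_1\,\mathrm{age\_flag} + \hat{\beta}_2\,\mathrm{bmi\_flag})\bigr)}
\]
```

The result is a probability between 0 and 1, not an odds ratio. Note that for a binary response PROC LOGISTIC by default models the probability of the lower ordered response value (Sbp_flag = 0 if the flag is coded 0/1) unless the EVENT= or DESCENDING option is specified.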

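/* Fit the model on the training data and score the validation data in the same step */
proc logistic data=training;
   model Sbp_flag = age_flag bmi_flag / lackfit ctable pprob=0.5;
   output out=test p=ppred;               /* predicted probabilities for the training data */
   score data=validation out=Logit_File;  /* predicted probabilities for the validation data */
run;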
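
To see what the SCORE statement produced, it can help to inspect the output data set. The sketch below assumes the step above has already run and created Logit_File. The scored data set keeps the validation variables and adds predicted probabilities named P_<level> (for example P_0 and P_1, assuming Sbp_flag is coded 0/1) and the predicted level I_Sbp_flag. If Sbp_flag is present in the validation data, the observed level is also stored as F_Sbp_flag.

```sas
/* Sketch: list the variables in the scored validation data set.      */
/* Assumes Logit_File was created by the SCORE statement above.       */
proc contents data=Logit_File;
run;

/* Print the first 10 scored observations to check the P_ columns.    */
proc print data=Logit_File(obs=10);
run;
```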

2. OUTMODEL / INMODEL Option in PROC LOGISTIC

The OUTMODEL= option names the SAS data set in which PROC LOGISTIC saves the information about the fitted model. You can later pass this data set to the INMODEL= option to score new data without refitting the model.

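/* Step 1: fit the model and save it to a data set with OUTMODEL= */
proc logistic data=training outmodel=model;
   model Sbp_flag = age_flag bmi_flag / lackfit ctable pprob=0.5;
   output out=test p=ppred;
run;

/* Step 2: read the saved model with INMODEL= and score the validation data */
proc logistic inmodel=model;
   score data=validation out=valid;
run;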
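
Both approaches use the same fitted model, so they should give the same predictions. One way to check this is to compare the two scored data sets. This is a minimal sketch that assumes Logit_File (from the SCORE option example) and valid (from the INMODEL= example) were both created from the same training and validation data.

```sas
/* Sketch: compare the scored output of the two approaches.            */
/* Assumes Logit_File and valid both exist in the WORK library.        */
proc compare base=Logit_File compare=valid;
run;
```

If the comparison reports no unequal values, either method can be used. The OUTMODEL= / INMODEL= route is mainly convenient when the scoring has to happen later or in a separate program.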

About Author:
Deepanshu Bhalla

Deepanshu founded ListenData with a simple objective - Make analytics easy to understand and follow. He has over 10 years of experience in data science. During his tenure, he worked with global clients in various domains like Banking, Insurance, Private Equity, Telecom and HR.


6 Responses to "Two ways to score validation data in proc logistic"

Anonymous, May 13, 2015 at 4:47 PM
Pls, when is the best time to split a data set into training and validation: at the beginning, after forming the modeling data set, or after cleaning the data (missing value imputation and outlier treatment)?
izyk nieti, May 15, 2015 at 2:18 AM
Pls, when is the best time to split a data set into training and validation: at the beginning, after forming the modeling data set, or after cleaning the data (missing value imputation and outlier treatment)?
Anonymous, July 24, 2015 at 5:25 AM
I split the data after cleaning it, after missing value imputation but before outlier treatment. I do outlier treatment during variable transformation, after the initial run of PROC LOGISTIC.
Anonymous, April 12, 2017 at 7:02 AM
Split the data into training & modeling after cleaning, removing missing values and outliers, and transformation. After that we run the PROC LOGISTIC model.
Unknown, August 28, 2017 at 2:39 AM
The predicted value we get from that, is that the odds ratio?
Anonymous, March 1, 2019 at 4:06 AM
May I know where I can get your sample training data?