Two ways to score validation data in proc logistic

Deepanshu Bhalla

This article explains two ways to score a validation dataset in PROC LOGISTIC in SAS. In simple words, scoring means using a model you have already trained to make predictions for new data.

1. SCORE Option in PROC LOGISTIC

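The SCORE statement in PROC LOGISTIC scores new observations using a fitted logistic regression model. In other words, it applies the estimated coefficients of the model to new data to calculate predicted probabilities for those observations. The data set named in OUT= contains the new observations plus one predicted probability per response level (for a 0/1 response, these are typically named P_0 and P_1) and the predicted response level.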
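To make "applies the coefficients" concrete, the calculation for the two-predictor model used in this article can be sketched as follows. The symbols for the estimated intercept and slopes are notation introduced here for illustration, not names taken from SAS output, and the formula assumes age_flag and bmi_flag enter the model as numeric variables.

```latex
\hat{p} = \frac{1}{1 + \exp\!\left(-\left(\hat{\beta}_0 + \hat{\beta}_1\,\mathrm{age\_flag} + \hat{\beta}_2\,\mathrm{bmi\_flag}\right)\right)}
```

Here the result is the predicted probability of the response level being modeled. It always lies between 0 and 1, so it is a probability rather than an odds ratio.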

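/* Fit the model on the training data and score the validation data in the same step */
proc logistic data=training;
   model Sbp_flag = age_flag bmi_flag / lackfit ctable pprob=0.5;
   output out=test p=ppred;                /* predicted probabilities for the training data */
   score data=validation out=Logit_File;   /* predicted probabilities for the validation data */
run;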
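After the step above runs, one way to inspect the scored data is sketched below. It assumes that Sbp_flag is coded 0/1 and is present in the validation data set, in which case the scored data set should contain P_0 and P_1 (predicted probabilities), I_Sbp_flag (predicted level) and F_Sbp_flag (observed level). If your response is coded differently, the variable names will differ.

```sas
/* Look at the first few scored validation observations */
proc print data=Logit_File(obs=10);
run;

/* Cross-tabulate observed (F_) against predicted (I_) response levels.
   F_Sbp_flag exists only if Sbp_flag is present in the validation data. */
proc freq data=Logit_File;
   tables F_Sbp_flag * I_Sbp_flag / nopercent norow nocol;
run;
```

The cross-tabulation is a simple confusion matrix for the validation data.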

2. OUTMODEL / INMODEL Option in PROC LOGISTIC

With the OUTMODEL= option, you specify the name of a SAS data set that stores the information about the fitted model. A later PROC LOGISTIC step reads this data set through the INMODEL= option and uses it to score new data without refitting the model.

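/* Step 1: fit the model and save it in the data set MODEL */
proc logistic data=training outmodel=model;
   model Sbp_flag = age_flag bmi_flag / lackfit ctable pprob=0.5;
   output out=test p=ppred;
run;

/* Step 2: read the saved model and score the validation data (no MODEL statement is needed) */
proc logistic inmodel=model;
   score data=validation out=valid;
run;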
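Because both approaches apply the same fitted model to the same validation data, their scored outputs would be expected to agree. A minimal check, assuming both examples in this article were run in the same SAS session so that Logit_File and valid both exist, might look like this:

```sas
/* Compare the scored data sets produced by the two approaches */
proc compare base=Logit_File compare=valid;
run;
```

If the report lists unequal values, a reasonable first step is to confirm that both runs used the same training and validation data.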
About Author:
Deepanshu Bhalla

Deepanshu founded ListenData with a simple objective: make analytics easy to understand and follow. He has over 10 years of experience in data science. During his tenure, he worked with global clients in various domains like Banking, Insurance, Private Equity, Telecom and HR.

Responses to "Two ways to score validation data in proc logistic"
  1. Please, when is the best time to split a data set into training and validation: at the beginning, right after forming the modeling data set, or after cleaning the data (missing value imputation and outlier treatment)?

  2. I split the data after cleaning it: after missing value imputation but before outlier treatment. I do outlier treatment during variable transformation, after an initial run of PROC LOGISTIC.

  3. Split the data into training and modeling after cleaning, removing missing values and outliers, and transformation. After that, we run the PROC LOGISTIC model.

  4. Is the predicted value we get from that the odds ratio?

  5. May I know where I can get your sample training data?