There are three ways to calculate the optimal probability cut-off:

- Youden's J Index
- Minimize the Euclidean distance of (sensitivity, specificity) from the point (1,1)
- Profit Maximization / Cost Minimization

**Youden's J index** is used to select the optimal predicted probability cut-off. It is the maximum vertical distance between the ROC curve and the diagonal (chance) line. The idea is to maximize the difference between the true positive rate (sensitivity) and the false positive rate (1 - specificity).

**Youden Index Formula**

J = Sensitivity - (1 - Specificity) = Sensitivity + Specificity - 1

The optimal probability cutoff is the one where J is maximum.

**Euclidean Distance Formula**

D = Sqrt((1 - Sensitivity)^2 + (1 - Specificity)^2)

The optimal probability cutoff is the one where D is minimum.
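To make the two formulas concrete, here is a minimal sketch that evaluates J and D at three candidate cutoffs. The data set name `example`, the variable names, and the sensitivity/specificity values are made up for illustration; they do not come from any real model.

```sas
/* Hypothetical operating points: cutoff, sensitivity, specificity */
data example;
    input cutoff sensit spec;
    /* Youden's J: larger is better */
    J = sensit + spec - 1;
    /* Distance from the ideal point (sensitivity = 1, specificity = 1): smaller is better */
    D = sqrt((1 - sensit)**2 + (1 - spec)**2);
    datalines;
0.30 0.90 0.55
0.50 0.80 0.75
0.70 0.55 0.90
;
run;

proc print data=example noobs;
run;
```

With these made-up numbers, the 0.50 cutoff has both the largest J (0.80 + 0.75 - 1 = 0.55) and the smallest D, so both criteria pick the same cutoff here. On real data the two criteria can disagree.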

**SAS Code**

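/* Fit the model and save the ROC coordinates for every candidate cutoff */

proc logistic data = test descending;

model y = x1 x2 / outroc=rocstats;

run;

/* Compute Youden's J and the distance D at each cutoff */

data check;

set rocstats;

_SPECIF_ = (1 - _1MSPEC_);

J = _SENSIT_ + _SPECIF_ - 1;

D = sqrt((1-_SENSIT_)**2 + (1-_SPECIF_)**2);

run;

/* Cutoff with the maximum J */

proc sql noprint;

create table cutoff as

select _PROB_ , J

from check

having J = max(J);

quit;

/* Cutoff with the minimum D */

proc sql noprint;

create table cutoff1 as

select _PROB_ , D

from check

having D = min(D);

quit;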
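The third approach, profit maximization / cost minimization, can be sketched in the same way. The example below is only an illustration under stated assumptions: the unit costs `cost_fp` and `cost_fn`, the data set `cost_check`, and the table `cutoff2` are hypothetical names and values, and the code assumes the `rocstats` data set created by the OUTROC= option above, which also stores the false positive and false negative counts at each cutoff in `_FALPOS_` and `_FALNEG_`.

```sas
/* Hypothetical unit costs, for illustration only */
%let cost_fp = 1;   /* assumed cost of one false positive */
%let cost_fn = 5;   /* assumed cost of one false negative */

/* Total misclassification cost at each candidate cutoff */
data cost_check;
    set rocstats;
    total_cost = &cost_fp * _FALPOS_ + &cost_fn * _FALNEG_;
run;

/* Cutoff with the minimum total cost */
proc sql noprint;
    create table cutoff2 as
    select _PROB_, total_cost
    from cost_check
    having total_cost = min(total_cost);
quit;
```

Replace the two assumed costs with values that reflect the actual business problem. To maximize profit instead, compute a profit figure from the true positive and true negative counts (`_POS_` and `_NEG_`) and select the maximum.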

How can I generate a confusion matrix using SAS code?