Calculating Concordant, Discordant and Tied Pairs

This tutorial explains, step by step and with an example, how to calculate concordance, discordance and the c statistic (AUC). Statistical packages such as SAS, SPSS and R can report these model fit measures when you run a logistic regression. However, it is important to know how these model performance metrics are calculated. Knowing the calculation behind them also gives you an edge over your peers when your predictive model needs calibration or refitting.

Understanding Concordance and AUC

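Download the SAS data file from the UCLA website. The code below reads it as file.binary, with admit as the dependent variable and gre, gpa and rank as predictors.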

Steps to calculate concordance / discordance and AUC
  1. Calculate the predicted probability for each observation from the logistic regression model.
  2. Split the data into two datasets. The first contains the observations whose actual value of the dependent variable is 1 (events), along with their predicted probabilities. The second contains the observations whose actual value is 0 (non-events), along with their predicted probabilities.
  3. Compare each predicted probability in the first dataset with each predicted probability in the second dataset.
  4. Total number of pairs to compare = x * y
       x: Number of observations in the first dataset (actual value 1)
       y: Number of observations in the second dataset (actual value 0)

    This step is a cartesian product (cross join) of events and non-events. For example, with 100 events and 1,000 non-events, it creates 100,000 (100 * 1,000) pairs for comparison.

  5. A pair is concordant if the event (actual value 1) has a higher predicted probability than the non-event (actual value 0).
  6. A pair is discordant if the event has a lower predicted probability than the non-event.
  7. A pair is tied if the event and the non-event have the same predicted probability.
  8. The final values are calculated using the formulas below. Each is a proportion between 0 and 1; multiply by 100 to express it as a percentage.
       Percent Concordant = (Number of concordant pairs) / (Total number of pairs)
       Percent Discordant = (Number of discordant pairs) / (Total number of pairs)
       Percent Tied = (Number of tied pairs) / (Total number of pairs)
       Area under curve (c statistic) = Percent Concordant + 0.5 * Percent Tied
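
To make the steps concrete, here is a small sketch on made-up data: two events and three non-events with hypothetical predicted probabilities. The dataset and column names (toy, toy_pairs, pct_concordant and so on) are illustrative assumptions and are not part of the UCLA file.

```sas
/* Toy data: hypothetical predicted probabilities, NOT from the UCLA file */
data toy;
    input admit pred;
    datalines;
1 0.8
1 0.4
0 0.6
0 0.4
0 0.2
;
run;

proc sql;
    /* Cross join events with non-events and flag each pair (1 = true, 0 = false) */
    create table toy_pairs as
    select a.pred as pred1,
           b.pred as pred0,
           (a.pred > b.pred) as concordant,
           (a.pred < b.pred) as discordant,
           (a.pred = b.pred) as tied
    from toy(where=(admit = 1)) as a
         cross join
         toy(where=(admit = 0)) as b;

    /* Apply the formulas: proportion of each pair type and the c statistic */
    select count(*)                            as total_pairs,
           mean(concordant)                    as pct_concordant,
           mean(discordant)                    as pct_discordant,
           mean(tied)                          as pct_tied,
           mean(concordant) + 0.5 * mean(tied) as c_statistic
    from toy_pairs;
quit;
```

With these five observations there are 2 * 3 = 6 pairs: 4 concordant, 1 discordant (0.4 vs 0.6) and 1 tied (0.4 vs 0.4). So Percent Concordant = 4/6, Percent Discordant = 1/6, Percent Tied = 1/6, and the c statistic = 4/6 + 0.5 * 1/6 = 0.75.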

Interpretation of Concordant, Discordant and Tied Percent

Percent Concordant : Percentage of pairs where the observation with the desired outcome (event) has a higher predicted probability than the observation without the outcome (non-event).

Percent Discordant : Percentage of pairs where the observation with the desired outcome (event) has a lower predicted probability than the observation without the outcome (non-event).

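Percent Tied : Percentage of pairs where the observation with the desired outcome (event) has the same predicted probability as the observation without the outcome (non-event).

c statistic (AUC) : The c statistic is also called the area under the ROC curve (AUC). It is calculated by adding Percent Concordant and 0.5 times Percent Tied. It can be read as the probability that a randomly chosen event gets a higher predicted probability than a randomly chosen non-event, with ties counted as half. A value of 0.5 means the model ranks no better than chance, and 1 means every event is ranked above every non-event.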

In general, higher percentages of concordant pairs and lower percentages of discordant and tied pairs indicate a more desirable model.


SAS Code for Concordant / Discordant / AUC :

The code below calculates these performance metrics in SAS. The program carries out each step explained above.

    /* Create a library reference. The data file is stored in this directory. */
    libname file "C:\Users\Deepanshu\Downloads";

    /* Run logistic regression and save the estimated probabilities in the dataset "estprob" as the variable "pred" */
    proc logistic data=file.binary descending;
        class rank / param=ref;
        model admit = gre gpa rank;
        output out=estprob p=pred;
    run;

    /* Divide the data into two datasets: events and non-events */
    data event nonevent;
        set estprob;
        if admit = 1 then output event;
        else if admit = 0 then output nonevent;
    run;

    /* Cartesian product of events and non-events */
    proc sql noprint;
        create table pairs as
        select a.admit as admit1, b.admit as admit0,
               a.pred as pred1, b.pred as pred0
        from event a cross join nonevent b;
    quit;

    /* Flag each pair as concordant, discordant or tied */
    data pairs;
        set pairs;
        concordant = 0;
        discordant = 0;
        tied = 0;
        if pred1 > pred0 then concordant = 1;
        else if pred1 < pred0 then discordant = 1;
        else tied = 1;
    run;

    /* Final result: the mean of each 0/1 flag is the proportion of pairs of that type */
    proc means data=pairs mean;
        var concordant discordant tied;
    run;
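
Proc Means reports the three proportions but not the c statistic itself. The following is a minimal sketch, assuming the pairs dataset and its concordant, discordant and tied flags from the program above (the output column names are illustrative), that applies the formula Percent Concordant + 0.5 * Percent Tied:

```sas
/* Assumes the "pairs" dataset built above, with 0/1 flags
   concordant, discordant and tied for every event/non-event pair */
proc sql;
    select count(*)                            as total_pairs,
           mean(concordant)                    as pct_concordant,
           mean(discordant)                    as pct_discordant,
           mean(tied)                          as pct_tied,
           mean(concordant) + 0.5 * mean(tied) as c_statistic
    from pairs;
quit;
```

The result can be compared with the c value in the "Association of Predicted Probabilities and Observed Responses" table that Proc Logistic prints. Small differences are possible, because Proc Logistic may group predicted probabilities into bins before counting pairs (see the BINWIDTH= option of the MODEL statement).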

About Author:

Deepanshu founded ListenData with a simple objective - Make analytics easy to understand and follow. He has close to 7 years of experience in data science and predictive modeling. During his tenure, he has worked with global clients in various domains like retail and commercial banking, Telecom, HR and Automotive.



15 Responses to "Calculating Concordant, Discordant and Tied Pairs"

  1. Very precise and clear explanation of concordance and discordance. Also the code helps in better understanding of the phenomenon. Thanks.

    Replies
    1. Thank you for your appreciation. Cheers!

  2. Neat explanations, really helpful for understanding these definitions. Thanks!

  3. Very clear explanation, thank you :)

  4. Thanks for the post! Shouldn't it be proc logistic with descending option? as we are treating 1s as events and 0 as nonevents

    Replies
    1. Corrected! Thanks for pointing it out.

  5. First time I understood concordance and discordance. Thanks

  6. For a good model what should be the concordance?

    Replies
    1. Concordance Percent should be 80 or above.

  7. Very informative, clear, and to the point

  8. Very good explanation and informative. Thanks Buddy keep sharing

  9. Can you please give the calculation of concordance and discordance in Excel format, with an example that makes the calculation easy to understand?

  10. The above codes are very useful. Any suggestions for weighted data?

  11. Hello, I want to know, what to do in cases where tied percentage is high, say 20%. How to reduce tied percentage?

