A Complete Guide to Credit Risk Modelling

This article explains the basic concepts and methodologies of credit risk modelling and why it matters to financial institutions. Statistics and machine learning play an important role in solving credit risk problems, which is why predictive modelers and data scientists have become so important to banks. Within the analytics divisions of banks, credit risk modelling is one of the highest-paid jobs.

What is Credit Risk?

In simple words, it is the risk that a borrower does not repay a loan, a credit card balance or any other type of credit. Sometimes customers pay some installments but do not repay the full amount, which includes the principal plus interest. For example, suppose you took a personal loan of USD 100,000 for 10 years at a 9% interest rate. You paid the first few installments to the bank but stopped paying afterwards. If the remaining unpaid installments are worth USD 30,000, that amount is a loss to the bank.

Credit risk is not restricted to retail customers; it also covers small, medium and large corporate houses. You might have heard in the news that loans to Kingfisher Company became non-performing assets (NPAs), which means the company was not able to pay its dues. High NPAs lead to huge financial losses for the bank, which in turn lead to lower interest rates on deposits. Honest borrowers with a good credit history (credit score) end up suffering as well. Hence it is essential that banks hold sufficient capital to protect depositors from these risks.


Why is Credit Risk Important?

Do you remember, or are you aware of, the 2008 recession? In the US, home mortgage loans were given to customers with low creditworthiness (individuals with poor credit scores). A poor credit score indicates that a borrower is highly likely to default on a loan, which makes them risky customers for the bank. To compensate for the risk, banks charged a higher interest rate than the standard rate. Banks funded these loans by selling them to investors on the secondary market. The loans were pooled and repackaged into securities such as collateralized debt obligations (CDOs), a legal financial process known as securitization. During 2004-2007, these CDOs were considered low-risk financial instruments (highly rated).

As these home loan borrowers had a high chance of default, many of them started defaulting on their loans and banks started seizing (foreclosing on) their property. The real estate bubble burst and home prices declined sharply. Many financial institutions around the world had invested in these funds, and their losses resulted in a recession. Banks, investors and re-insurers faced huge financial losses, and many financial and non-financial firms went bankrupt. Non-financial firms were hit badly either because of their investments in these funds or because of very low demand and purchasing activity in the economy. In simple words, people had little or no money to spend, which led many organisations to halt production. That in turn led to huge job losses. The US Government bailed out many big corporate houses during the recession. You may now understand why credit risk is so important: the whole economy can be in danger if current and future credit losses are not identified or estimated properly.

Basel Regulations

A committee was set up in 1974 by the central bank governors of the G10 countries. Its aim is to ensure that banks hold enough minimum capital to give back depositors' funds. Its members meet regularly to discuss banking supervisory matters at the Bank for International Settlements (BIS) in Basel, Switzerland. The committee was expanded in 2009 to 27 jurisdictions, including Brazil, Canada, Germany, Australia, Argentina, China, France, India, Saudi Arabia, the Netherlands, Russia, Hong Kong, Japan, Italy, Korea, Mexico, Singapore, Spain, Luxembourg, Turkey, Switzerland, Sweden, South Africa, the United Kingdom, the United States, Indonesia and Belgium.

Basel I

The Basel I accord was the first official pact, introduced in 1988. It focused on credit risk and introduced the idea of the capital adequacy ratio, which is also known as the Capital to Risk Assets Ratio. It is the ratio of a bank's capital to its risk-weighted assets. Banks needed to maintain a ratio of at least 8%, which means capital should be at least 8 percent of the risk-weighted assets. Capital is the sum of Tier 1 and Tier 2 capital.

  1. Tier 1 capital : Primary funding source of the bank. It includes shareholders' equity and retained earnings
  2. Tier 2 capital : Subordinated loans, revaluation reserves, undisclosed reserves and general provisions

In Basel I, fixed risk weights were set based on the type of exposure: 50% for mortgages and 100% for non-mortgage exposures (such as credit cards, overdrafts, auto loans and personal finance). See the example below.
Mortgage : $5,000
Risk Weight : 50%
Risk Weighted Assets : $2,500 (Mortgage * Risk Weight)
Minimum Capital Required : $200 (8% * Risk Weighted Assets)
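The arithmetic in the mortgage example can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this article, and the 8% ratio and 50% risk weight are the Basel I figures quoted above.

```python
def risk_weighted_assets(exposure, risk_weight):
    """Exposure amount multiplied by its risk weight (0.50 means 50%)."""
    return exposure * risk_weight


def minimum_capital(exposure, risk_weight, capital_ratio=0.08):
    """Capital needed to hold capital_ratio of the risk-weighted assets."""
    return capital_ratio * risk_weighted_assets(exposure, risk_weight)


# Mortgage example from the text: $5,000 exposure at a 50% risk weight.
mortgage_rwa = risk_weighted_assets(5_000, 0.50)
mortgage_capital = minimum_capital(5_000, 0.50)
print(f"RWA: ${mortgage_rwa:,.0f}, minimum capital: ${mortgage_capital:,.0f}")
```

The same two functions work for any other exposure once its risk weight is known.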

Basel II

The Basel II accord was introduced in June 2004 to address the limitations of Basel I. For example, Basel I focused only on credit risk, whereas Basel II covers not only credit risk but also operational and market risk. Operational risk includes fraud and system failures. Market risk includes equity, currency and commodity risk.

In Basel II, there are three ways to estimate credit risk:

  • Standardized Approach
  • Foundation Internal Rating Based (IRB) approach
  • Advanced Internal Rating Based (IRB) Approach
Standardized Approach
For corporate exposures, banks rely on ratings from certified credit rating agencies (CRAs) such as S&P and Moody's to quantify the capital required for credit risk. The risk weight is 20% for highly rated exposures and goes up to 150% for low-rated exposures. For retail, the risk weight is 35% for mortgage exposures and 75% for non-mortgage exposures (no rating from a credit rating agency is required for retail).
Corporate Exposure : $500,000
Credit Assessment : AAA
Risk Weight : 20%
Risk Weighted Assets : $100,000
Minimum Capital Required : $8,000
Internal Ratings Based (IRB) Approach
It has four credit risk components :
  • Probability of Default (PD)
  • Exposure at Default (EAD)
  • Loss given Default (LGD)
  • Effective Maturity (M)
Probability of Default (PD)
Probability of default is the likelihood that a borrower will default on debt (credit card, mortgage or non-mortgage loan) over a one-year period. In simple words, it is the expected probability that a customer fails to repay the loan. It is expressed as a percentage between 0% and 100%. The higher the probability, the higher the chance of default.
Exposure at Default (EAD)
It is the amount we expect to be outstanding in the case of default, i.e. the amount that the borrower owes the bank at the time of default.
Loss given Default (LGD)
It is the share of the outstanding amount that we expect to lose when the borrower defaults, expressed as a proportion of the total exposure. It is calculated as (1 - Recovery Rate).
LGD = (EAD - PV(recovery) + PV(cost)) / EAD
PV(recovery) = Present value of recoveries, discounted back to the time of default.
PV(cost) = Present value of recovery (workout) costs, discounted back to the time of default. Costs reduce the net recovery, so they add to the loss.
Expected Loss
Expected Loss is calculated as (PD * LGD * EAD).
Suppose someone takes a $100,000 home loan from a bank to purchase a flat. At the time of default, the loan has an outstanding balance of $70,000. The bank forecloses on the flat and sells it for $60,000. EAD is $70,000, and LGD is ($70,000 - $60,000) / $70,000, i.e. 14.3%.
A separate example of the expected loss calculation:
Probability of Default : 2%
Exposure at Default : $20,000
Loss Given Default : 20%
Expected Loss : $80 (2% * 20% * $20,000)
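The two calculations above (LGD from a recovery, and expected loss from PD, LGD and EAD) can be reproduced with a short Python sketch. The function names are illustrative only. One assumption is made explicit in the code: recovery costs are treated as reducing the net recovery, so they increase the loss.

```python
def loss_given_default(ead, pv_recovery, pv_cost=0.0):
    """LGD as a fraction of EAD.

    Assumption: workout costs reduce the net recovery, so they add to the loss.
    """
    if ead <= 0:
        raise ValueError("EAD must be positive")
    return (ead - pv_recovery + pv_cost) / ead


def expected_loss(pd, lgd, ead):
    """Expected loss = PD * LGD * EAD."""
    return pd * lgd * ead


# Home loan example: $70,000 outstanding, flat sold for $60,000, no costs.
lgd_home_loan = loss_given_default(70_000, 60_000)
print(f"LGD: {lgd_home_loan:.1%}")

# Expected loss example: PD 2%, LGD 20%, EAD $20,000.
el = expected_loss(0.02, 0.20, 20_000)
print(f"Expected loss: ${el:,.0f}")
```

For simplicity the recovery is not discounted here; in practice PV(recovery) and PV(cost) would be discounted back to the default date first.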
Foundation and Advanced IRB Approach
There are two types of Internal Rating Based (IRB) approaches which are Foundation IRB and Advanced IRB.
Foundation IRB
PD is estimated internally by the bank while LGD and EAD are prescribed by regulator.
Advanced IRB
PD, LGD, and EAD can be estimated internally by the bank itself.
Effective Maturity (M)
It is a duration that reflects standard bank practice. For Foundation IRB, the effective maturity is fixed at 2.5 years (the exception is repo-style transactions, where it is 6 months). For Advanced IRB, M is the greater of 1 year or the effective maturity of the specific instrument.

Basel III

The Basel III accord has recently become effective, starting in 2019. In some countries, central banks fixed December 2019 as the deadline to meet capital requirements under the Basel III norms. Basel III incorporates several risk measures to counter issues that were identified and highlighted in the 2008 financial crisis. It emphasises revised capital standards (such as leverage ratios), stress testing and tangible equity capital, which is the component with the greatest loss-absorbing capacity.

The concept of building internal models and using external ratings to estimate PD, LGD and EAD remains the same as in Basel II. However, Basel III changes the minimum capital requirements, as shown in the table below (RWA = risk-weighted assets).
Requirement | Basel II | Basel III
Common Equity Tier 1 capital ratio (shareholders' equity + retained earnings) | 2% * RWA | 4.5% * RWA
Tier 1 capital ratio | 4% * RWA | 6% * RWA
Tier 2 capital ratio | 4% * RWA | 2% * RWA
Capital conservation buffer (common equity) | - | 2.5% * RWA
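To make the comparison concrete, here is a small Python sketch that turns the ratios in the table into capital amounts for a given level of risk-weighted assets. The dictionary keys, the function name and the RWA figure of $100,000 are illustrative assumptions; the percentages are the ones listed above.

```python
# Minimum ratios from the table above, as fractions of RWA.
BASEL_II_RATIOS = {
    "common_equity_tier_1": 0.02,
    "tier_1": 0.04,
    "tier_2": 0.04,
}
BASEL_III_RATIOS = {
    "common_equity_tier_1": 0.045,
    "tier_1": 0.06,
    "tier_2": 0.02,
    "capital_conservation_buffer": 0.025,
}


def minimum_capital_amounts(rwa, ratios):
    """Multiply each minimum ratio by the risk-weighted assets."""
    return {name: ratio * rwa for name, ratio in ratios.items()}


rwa = 100_000  # hypothetical risk-weighted assets
basel_ii_amounts = minimum_capital_amounts(rwa, BASEL_II_RATIOS)
basel_iii_amounts = minimum_capital_amounts(rwa, BASEL_III_RATIOS)
for name, amount in basel_iii_amounts.items():
    print(f"{name}: ${amount:,.0f} (Basel II: ${basel_ii_amounts.get(name, 0.0):,.0f})")
```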


IFRS 9

IFRS 9 is an International Financial Reporting Standard dealing with accounting for financial instruments. It replaces IAS 39 Financial Instruments, which was based on the incurred loss model, whereas IFRS 9 uses the expected loss model, which also covers future losses.
In IFRS 9, the idea is to recognize a 12-month loss allowance at initial recognition and a lifetime loss allowance when credit risk increases significantly.
As per IFRS 9, there are three stages of credit risk:
  1. Stage 1 - Credit risk has not increased significantly since initial recognition, which indicates low credit risk at the reporting date
  2. Stage 2 - Credit risk has increased significantly since initial recognition
  3. Stage 3 - The financial asset is credit-impaired at the reporting date (a permanent reduction in its value)

How is IFRS 9 different from Basel III?

They are different, but both require building PD, LGD and EAD models. The differences are summarised below.
Parameters | Basel III | IFRS 9
Objective | Expected + Unexpected Loss | Expected Loss
PD | One-year PD | 12-month PD for stage 1 assets, lifetime PD for stage 2 and 3 assets
Rating Philosophy | TTC rating philosophy | PIT rating philosophy
LGD | Downturn LGD (both direct + indirect costs) | Best estimate LGD (only direct costs)
EAD | Downturn EAD | Best estimate EAD
Expected Loss / Expected Credit Loss (ECL) | EL = PD * LGD * EAD | EL = PD * PV of cash shortfalls
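The staging rule for PD can be illustrated with a deliberately simplified Python sketch. It assumes that the cash shortfall can be approximated by LGD * EAD and it ignores discounting, so it is not a full IFRS 9 calculation; the function name and the input numbers are hypothetical.

```python
def expected_credit_loss(stage, pd_12m, pd_lifetime, lgd, ead):
    """Simplified ECL: stage 1 uses the 12-month PD, stages 2 and 3 the lifetime PD.

    Assumption: cash shortfalls are approximated by LGD * EAD, with no discounting.
    """
    if stage not in (1, 2, 3):
        raise ValueError("stage must be 1, 2 or 3")
    pd_used = pd_12m if stage == 1 else pd_lifetime
    return pd_used * lgd * ead


# Hypothetical loan: 12-month PD 3%, lifetime PD 11.5%, LGD 20%, EAD $20,000.
ecl_stage_1 = expected_credit_loss(1, 0.03, 0.115, 0.20, 20_000)
ecl_stage_2 = expected_credit_loss(2, 0.03, 0.115, 0.20, 20_000)
print(f"Stage 1 ECL: ${ecl_stage_1:,.0f}, Stage 2 ECL: ${ecl_stage_2:,.0f}")
```

The jump in the allowance when a loan moves from stage 1 to stage 2 comes entirely from switching to the lifetime PD.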

What is Credit Risk Modelling?

Credit risk modeling refers to data-driven risk models that calculate the chance that a borrower defaults on a loan (or credit card). If a borrower fails to repay, the models also estimate how much he/she owes at the time of default and how much of that outstanding amount the lender would lose. In other words, we need to build probability of default, loss given default and exposure at default models, as per the advanced IRB approach under Basel norms.

Probability of Default Modeling

In this section, we cover the various steps and methods related to PD modeling.

Define Dependent Variable

Binary variable having values 1 and 0. 1 refers to bad customers and 0 refers to good customers.

Bad Customers : Customers who defaulted in payment. 'Default' means that one or more of the following scenarios have taken place.

  • Payment is more than 90 days past due. In some countries, the threshold is 120 or 180 days.
  • Borrower has filed for bankruptcy
  • Loan is partially or fully written off

Indeterminates or rollovers : These customers fall into one of two categories:

  • Payment was 30 to 60 days past due but was paid after that. They are regular late payers.
  • Inactive accounts

All the other customers are good customers.

Indeterminates should not be included in model development, as they would reduce the model's ability to discriminate between good and bad customers. Note, however, that these customers are included at the time of scoring.

We consider 12 months as the performance window for flagging defaults, which means that a customer who defaults at any time in the next 12 months is flagged as 'Bad'.
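The good / bad / indeterminate definition above can be sketched as a small labelling function. The function and argument names are hypothetical. Two assumptions are made for illustration: the bad threshold is taken as 90+ days past due within the performance window, and any account between 30 days past due and the bad threshold is treated as indeterminate.

```python
def label_customer(max_days_past_due, bankrupt=False, written_off=False,
                   inactive=False, bad_dpd=90, indeterminate_dpd=30):
    """Classify a customer using the worst status seen in the performance window."""
    if bankrupt or written_off or max_days_past_due >= bad_dpd:
        return "bad"
    if inactive or max_days_past_due >= indeterminate_dpd:
        return "indeterminate"
    return "good"


def target_flag(label):
    """Dependent variable: 1 for bad, 0 for good, None for indeterminates (excluded)."""
    return {"bad": 1, "good": 0}.get(label)


# Hypothetical customers: worst days past due in the next 12 months.
worst_dpd = {"A": 0, "B": 45, "C": 120}
labels = {cust: label_customer(dpd) for cust, dpd in worst_dpd.items()}
development_sample = {cust: target_flag(lbl) for cust, lbl in labels.items()
                      if target_flag(lbl) is not None}
print(labels)
print(development_sample)
```

Customer B is dropped from the development sample but would still receive a score once the model is in use.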

Methodologies for Estimating PD

There are two main methodologies for estimating Probability of Default.
  1. Judgmental Method
  2. Statistical Method
Judgmental Method
It relies on the knowledge of experienced credit professionals. It is generally based on the five Cs of the applicant and the loan.
  • Character : Check the credit history of the borrower. If there is no credit history, the bank can ask for referees whom it can contact to learn about the borrower's reputation.
  • Capital : Calculate the difference between the borrower's assets (e.g., car, house, etc.) and liabilities (e.g., renting expenses, etc.)
  • Collateral : Value of the collateral (security) provided in case the borrower fails to repay
  • Capacity : Assess the borrower's ability to pay principal plus interest by checking job status, income etc.
  • Conditions : Internal and external factors (e.g. economic recession, war, natural calamities etc.)
Judgmental methods have largely given way to statistical methods, which are more popular these days. But they are still widely used when historical data is not available (especially for new credit products).
Statistical Method
In today's world, nobody has time to wait 1-2 months to learn the status of a loan application. Many borrowers also apply for loans through the bank's website. Hence banks need to make real-time credit decisions to remain competitive in the digital world. The advantage of a statistical method is that it produces a mathematical equation, which makes credit decisions automated and faster.

A statistical method is also less exposed to bias and to dishonest or fraudulent conduct by a loan approval officer or manager.

It also tends to be more accurate, as statistical and machine learning models consider hundreds of data points to identify defaulters.

Data Sources for PD Modeling

  • Demographic Data : Applicant's age, income, employment status, marital status, no. of years at current address, no. of years at job, postal code
  • Existing Relationship : Tenure, number of products, payment performance, previous claims
  • Credit Bureau Variables : Default or Delinquency history, Bureau score, Amount of credits, Inquiries etc.

Steps of PD Modeling

  • Data Preparation
  • Variable Selection
  • Model Development
  • Model Validation
  • Calibration
  • Independent Validation
  • Supervisory Approval
  • Model Implementation : Roll out to users
  • Periodic Monitoring
  • Post Implementation Validation : Backtesting and Benchmarking
  • Model Refinement (if any issue)

Statistical Techniques used for Model Development

  • Logistic Regression is the most widely used technique for estimation of PD
  • Survival Analysis is generally used to compute lifetime PD (required for IFRS 9)
  • Random Forest
  • Gradient Boosting
  • Markov chain Modeling
  • Neural Network
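As an illustration of the first technique in the list, here is a minimal logistic regression PD model written in plain Python so that it runs without extra libraries. It is a teaching sketch, not how a production model is built: the single predictor (credit utilisation), the toy data and the gradient descent settings are all assumptions made for this example.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, learning_rate=0.5, epochs=2000):
    """Fit PD = sigmoid(b0 + b1 * x) by batch gradient descent on the log-loss."""
    b0, b1 = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad0 = grad1 = 0.0
        for x, y in zip(xs, ys):
            error = sigmoid(b0 + b1 * x) - y  # predicted PD minus actual flag
            grad0 += error
            grad1 += error * x
        b0 -= learning_rate * grad0 / n
        b1 -= learning_rate * grad1 / n
    return b0, b1


# Hypothetical toy data: credit utilisation (0 to 1) and default flag (1 = bad).
utilisation = [0.05, 0.10, 0.20, 0.30, 0.40, 0.55, 0.60, 0.75, 0.85, 0.95]
defaulted = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]

b0, b1 = fit_logistic(utilisation, defaulted)
pd_low = sigmoid(b0 + b1 * 0.10)
pd_high = sigmoid(b0 + b1 * 0.90)
print(f"PD at 10% utilisation: {pd_low:.1%}, at 90% utilisation: {pd_high:.1%}")
```

The fitted coefficient on utilisation is positive, so the predicted PD rises with utilisation. Real PD models use many predictors and a statistical package rather than a hand-written optimiser.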

Model Performance in PD Model

There are two main levels of performance testing:
  1. Discrimination : Ability to differentiate between good (non-defaulters) and bad (defaulters) customers
  2. Calibration : Check whether the actual default rate is close to predicted PD values
Statistical Tests for Model Performance
Discrimination : Area under Curve, Gini coefficient, KS Statistics
Calibration : Hosmer and Lemeshow Test, Binomial Test
Check out this link for detailed explanation : Model Performance Simplified
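The three discrimination measures listed above can be computed directly from predicted PDs and actual default flags. The following plain-Python sketch uses pairwise comparisons for AUC, the identity Gini = 2 * AUC - 1, and the maximum gap between the cumulative distributions of goods and bads for KS. The function names and the toy data are assumptions made for illustration.

```python
def auc(pds, flags):
    """Share of (bad, good) pairs where the bad has the higher PD; ties count half."""
    bads = [p for p, y in zip(pds, flags) if y == 1]
    goods = [p for p, y in zip(pds, flags) if y == 0]
    wins = 0.0
    for b in bads:
        for g in goods:
            if b > g:
                wins += 1.0
            elif b == g:
                wins += 0.5
    return wins / (len(bads) * len(goods))


def gini(pds, flags):
    return 2 * auc(pds, flags) - 1


def ks_statistic(pds, flags):
    """Largest gap between the cumulative shares of goods and bads by PD cutoff."""
    n_bad = sum(flags)
    n_good = len(flags) - n_bad
    best = 0.0
    for cutoff in sorted(set(pds)):
        cum_bad = sum(1 for p, y in zip(pds, flags) if y == 1 and p <= cutoff) / n_bad
        cum_good = sum(1 for p, y in zip(pds, flags) if y == 0 and p <= cutoff) / n_good
        best = max(best, abs(cum_good - cum_bad))
    return best


# Hypothetical predicted PDs and observed default flags (1 = bad).
predicted_pd = [0.05, 0.10, 0.20, 0.35, 0.40, 0.60, 0.70, 0.90]
default_flag = [0, 0, 0, 1, 0, 1, 0, 1]

print(f"AUC  = {auc(predicted_pd, default_flag):.2f}")
print(f"Gini = {gini(predicted_pd, default_flag):.2f}")
print(f"KS   = {ks_statistic(predicted_pd, default_flag):.2f}")
```

The pairwise AUC is quadratic in the sample size, which is fine for a small illustration but too slow for a full portfolio.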

Rating Philosophy

It refers to the time horizon for which ratings measure credit risk and how much they are influenced by cyclic effects.
Point in time (PIT) PD
  • It evaluates the chance of default at that point in time. It considers both current macro-economic factors and the risk attributes of the borrower.
  • Since it captures current macro-economic factors, PIT PD moves up as macro-economic conditions deteriorate and moves down as they improve.
  • It focuses on reporting date
  • IFRS 9 requires PDs to be Point in time
Through the cycle (TTC) PD
  • It predicts the average default rate over an economic cycle, ignoring short-run changes to a customer's PD, so it closely resembles the long-term average default rate.
  • Grade assigned is not dependent on current macro-economic factors
  • It focuses on long-run average PD
  • Basel III requires PDs to be Through the cycle

In general, a hybrid model (considering both PIT and TTC) is used.


Credit Scoring and Scorecard

A Probability of Default model is used to score each customer to assess his/her likelihood of default. When you go to a bank for a loan, they check your credit score. This credit score can be built internally by the bank, or the bank can use the score of a credit bureau.

Credit bureaus collect individuals' credit information from various banks and sell it in the form of a credit report. They also release credit scores. In the US, the FICO score is a very popular credit score, ranging between 300 and 850. In India, the CIBIL score is used for the same purpose and lies between 300 and 900.

Types of Scorecards
1. Application Scorecard : It applies to new (first-time) customers applying for a loan or credit card. It estimates the probability of default at the time the applicant applies. The example below shows how it works.
How a scorecard works
Suppose the cutoff for granting a loan = 350

Profile of a New Customer
Age : 30
Gender : Male
Salary : 15000

Total Points = (100 + 85 + 120) = 305
Decision : Refuse Loan (total points are below the cutoff)
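The scorecard example can be expressed as a lookup of points per attribute followed by a comparison with the cutoff. In the Python sketch below, the bands and the points for every other band are hypothetical; only the three point values that apply to this applicant (100, 85 and 120, assumed to correspond to age, gender and salary in that order) and the cutoff of 350 come from the example.

```python
# Hypothetical points tables: (lower bound inclusive, upper bound exclusive, points).
AGE_BANDS = [(18, 25, 70), (25, 35, 100), (35, 120, 130)]
SALARY_BANDS = [(0, 10_000, 90), (10_000, 30_000, 120), (30_000, float("inf"), 150)]
GENDER_POINTS = {"Male": 85, "Female": 90}


def band_points(value, bands):
    """Return the points of the band that contains the value."""
    for lower, upper, points in bands:
        if lower <= value < upper:
            return points
    raise ValueError(f"value {value} is outside all bands")


def score_applicant(applicant):
    return (band_points(applicant["age"], AGE_BANDS)
            + GENDER_POINTS[applicant["gender"]]
            + band_points(applicant["salary"], SALARY_BANDS))


def decide(total_points, cutoff=350):
    return "Grant Loan" if total_points >= cutoff else "Refuse Loan"


applicant = {"age": 30, "gender": "Male", "salary": 15_000}
total_points = score_applicant(applicant)
decision = decide(total_points)
print(total_points, decision)
```

In a real scorecard the points per band are derived from the fitted PD model rather than set by hand.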
Data required for application scorecard
We use the customer's application or demographic data along with credit bureau data. There is no observation window for historical data, as these are new customers. The definition of Bad is the same, i.e. 90+ days past due. The performance window is generally 12 to 24 months from account opening.

An application scorecard is mainly used for the following tasks:

  • To determine whether or not to approve a customer for a loan.
  • To assist in 'due diligence'. For example, an applicant scoring very high can be approved outright and one scoring very low can be declined outright, without asking for further information.

2. Behavior Scorecard : It applies to existing customers, to assess whether a customer will default on loan payments. The performance window is generally 6 to 18 months.

A behavior scorecard is mainly used for the following tasks:

  • To set credit limit i.e. increase or decrease credit limit
  • Debt provisioning and profit scoring.
  • Renewals

Difference between Application and Behavior Scorecard
An application scorecard is applied to new customers (generally with less than 1 year on the books), whereas a behavior scorecard is applied to existing customers (more than 1 year). For an application scorecard, we don't require well-calibrated default probabilities, but calibrated default probabilities are required for a behavior scorecard as per Basel norms. The two scorecards also differ in how they are used, as explained in their respective sections above.
Collections Scoring
It predicts the probability that a loan already late for a given number of days will be late for another given number of days. Such models are typically built for performance windows of one month.
Desertion Scoring
It predicts the probability a borrower will apply for a new loan once the current loan is paid off.

Important Terminologies related to Credit Risk

Stressed PD vs. Unstressed PD
Stressed PD: A stressed PD depends on the risk attributes of the borrower but is not highly affected by macroeconomic factors, as adverse economic conditions are already factored into it.

Unstressed PD: An unstressed PD depends on both current macroeconomic factors and the risk attributes of the borrower. It moves up or down depending on economic conditions.

Downturn LGD and EAD
Under Basel II and III, financial institutions need to estimate downturn LGD and EAD, where 'downturn' means adverse economic conditions. To do so, select the month with the highest default rate, take a window of two consecutive quarters (6 months) on both sides of that month as the downturn period, and then take the maximum EAD and LGD observed within it as the downturn estimates. This is required because LGD and EAD can be affected by downturn economic conditions.
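The downturn procedure described above can be sketched in Python as follows. The function names and the 24 months of default rates and LGDs are made-up illustrations; the same helper would be applied to a monthly EAD series in exactly the same way.

```python
def downturn_window(default_rates, half_window=6):
    """Indices (inclusive) of the downturn period around the peak default month."""
    peak = max(range(len(default_rates)), key=lambda i: default_rates[i])
    start = max(0, peak - half_window)
    end = min(len(default_rates) - 1, peak + half_window)
    return start, end


def downturn_estimate(default_rates, values, half_window=6):
    """Maximum of a monthly series (LGD or EAD) within the downturn period."""
    start, end = downturn_window(default_rates, half_window)
    return max(values[start:end + 1])


# Hypothetical monthly history (24 months).
monthly_default_rate = [
    0.010, 0.011, 0.012, 0.012, 0.013, 0.015,
    0.018, 0.022, 0.027, 0.031, 0.036, 0.041,
    0.045, 0.040, 0.034, 0.029, 0.024, 0.020,
    0.017, 0.015, 0.013, 0.012, 0.011, 0.010,
]
monthly_lgd = [
    0.30, 0.31, 0.30, 0.32, 0.31, 0.33,
    0.35, 0.37, 0.40, 0.42, 0.45, 0.47,
    0.50, 0.52, 0.51, 0.48, 0.44, 0.41,
    0.38, 0.36, 0.34, 0.33, 0.32, 0.31,
]

window = downturn_window(monthly_default_rate)
downturn_lgd = downturn_estimate(monthly_default_rate, monthly_lgd)
print(f"Downturn period: months {window[0]} to {window[1]}, downturn LGD: {downturn_lgd:.0%}")
```

If the peak month sits near the start or end of the history, the window is simply cut off at the available data.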
Conditional PD
It is the probability that a borrower defaults during the second year given that it did not default during the first year. To calculate conditional PD, we need the probability of not defaulting by the end of year 1 (P0) and the unconditional probability of defaulting during the second year (P1). If P0 = 0.5 and P1 = 0.1, then the conditional PD, i.e. Prob(default | survival), is 0.1/0.5 = 20%.

Lifetime PD vs 12 month PD

As per IFRS 9, we require two types of PDs for calculating expected credit losses (ECL).
  • 12-month PDs for stage 1 assets - Chances of default within the next 12 months
  • Lifetime PDs for stage 2 and 3 assets - Chances of default over the remaining life of the financial instrument.
Suppose the 12-month PD is 3%, which means the first-year survival rate is 97% (1 - PD), and the 2nd and 3rd year conditional PDs are 4% and 5%.
The 1st year cumulative survival rate (CSR) is the same as the first year survival rate (SR).
2nd year cumulative survival rate = 1st year CSR * SR of 2nd year = 97% * 96% = 93% (rounded)
3rd year cumulative survival rate = 2nd year CSR * SR of 3rd year = 93% * 95% = 88% (rounded)
Lifetime PD = 1 - 88% = 12%
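The cumulative survival calculation generalises to any number of years. The Python sketch below (function names are illustrative) multiplies the yearly survival rates and reports the lifetime PD without intermediate rounding, which is why it prints 11.5% where the hand calculation, rounded at each step, gives 12%.

```python
def cumulative_survival(conditional_pds):
    """Cumulative survival rate after each year, given yearly conditional PDs."""
    rates = []
    survival = 1.0
    for pd_year in conditional_pds:
        survival *= 1.0 - pd_year
        rates.append(survival)
    return rates


def lifetime_pd(conditional_pds):
    """Probability of defaulting at any point over the remaining life."""
    return 1.0 - cumulative_survival(conditional_pds)[-1]


yearly_pds = [0.03, 0.04, 0.05]  # 12-month PD, then 2nd and 3rd year conditional PDs
csr = cumulative_survival(yearly_pds)
print(", ".join(f"{rate:.2%}" for rate in csr))
print(f"Lifetime PD: {lifetime_pd(yearly_pds):.1%}")
```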

Macroeconomic factors to consider to estimate ECL

Unemployment rate
Index of Industrial Production
Interest rate
Inflation rate
House price index
Exchange rate

Software Used in Risk Analytics

Let's split this section into two parts -

1. Data Extraction
Most of the data is stored in relational databases (SQL Server, Teradata). Analysts need expert-level knowledge of SQL to extract and manipulate data. The data is not saved in a single SQL table or database: to extract the relevant fields, you need to select from multiple tables and join them on matching key(s). During this process, you also need to apply business rules (such as excluding certain types of customers or accounts). Transaction tables generally sit in a mainframe environment, so basic knowledge of mainframe and UNIX is useful. However, mainframe and UNIX are not primary skills that banks look for in a risk analyst (they are good to have); developers are generally hired for this work.
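A typical extraction step (join two tables on a key and apply a business rule) can be illustrated with Python's built-in sqlite3 module and an in-memory database. The table names, column names and the rule of excluding staff accounts are hypothetical; in a bank the same SQL pattern would run against SQL Server or Teradata.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, age INTEGER, is_staff INTEGER);
CREATE TABLE accounts (account_id INTEGER PRIMARY KEY, customer_id INTEGER,
                       balance REAL, days_past_due INTEGER);
INSERT INTO customers VALUES (1, 30, 0), (2, 45, 1), (3, 52, 0);
INSERT INTO accounts VALUES (10, 1, 2500.0, 0), (11, 2, 900.0, 35), (12, 3, 7200.0, 95);
""")

# Join on the matching key and apply a business rule (exclude staff accounts).
rows = conn.execute("""
    SELECT c.customer_id, c.age, a.balance, a.days_past_due
    FROM customers AS c
    JOIN accounts AS a ON a.customer_id = c.customer_id
    WHERE c.is_staff = 0
    ORDER BY c.customer_id
""").fetchall()
conn.close()

for row in rows:
    print(row)
```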

2. Model Building
SAS is the most widely used software in risk analytics. Despite the huge popularity of R and Python these days, more than 90% of banks and other financial institutions still use SAS. Banks have also started exploring R and Python, and are building (or have already built) code libraries (repositories) in R and Python for credit risk projects.

SAS can be easily integrated with relational databases and mainframe. Many companies execute both data extraction and model building steps in SAS environment only.

End Note
Hopefully you now have a fair idea of how predictive modeling is used in the credit risk domain and what the key credit risk parameters are. In risk analytics, domain knowledge is more important than technical or statistical knowledge, and this article is meant to help fill that gap. Please provide your feedback in the comment box below.
About Author:

Deepanshu founded ListenData with a simple objective - Make analytics easy to understand and follow. He has over 10 years of experience in data science. During his tenure, he has worked with global clients in various domains like Banking, Insurance, Private Equity, Telecom and Human Resource.

17 Responses to "A Complete Guide to Credit Risk Modelling"
  1. Hi Deepanshu really very informative for beginner's like me....
    Can you please example of how behavior score card can be used to set credit Limit

    1. A behaviour score is generated from the customer's history (transactions, delinquency, over-limit or past-due events, loan defaults, credit card limit utilisation). If the behaviour score is good, he/she is a good customer, so the bank can use behaviour score ranges to decide on credit limit increases.

  2. Hi Deepanshu
    could you explain the risk weight and how will they set the threshold

  3. Amazing ! Im working in credit risk reporting and I haven't yet come across such a concise and clear theoretical background

  4. Very useful content..I have been working in banking sector last 5 years, but still it clarified few concept for me..kudos to you for enlighten people.

  5. In credit risk we "snapshot" and "vintage" are commonly used. Is there any difference between snapshot and vintage or are these used interchangeably?

  6. Good article, can you please provide pd, lgd models procedure end to end

  7. Thanks! The info is very well organized, informative and easy to follow!

  8. Informative and easy to understand

  9. Could you reflect on how to convert a facility level TTC PD to PIT PD?
    Let's say facility is of 5 year maturity. From a given TTC PD, X % how do we arrive at yearly break of PIT PD?

  10. It's truly a guide. Thanks so much.

  11. Pls. When was this article published?
    I need to reference in my work. Thank you.

  12. Good article but the title is misleading - a better title would be "Very Preliminary Introduction to Credit Risk Modelling".

  13. Thnkyou for sharingg!! its really helpful

