sofiatil/credit-risk-classifier

Credit Risk Classifier: Default Prediction & Binary Outcome

📌 Project Overview

This repository contains a machine learning project that analyzes the "Default of Credit Card Clients" dataset to predict whether a bank customer will fail to pay their debt next month.

In credit risk modeling, optimizing for overall accuracy alone can leave a financial institution exposed to large losses. This project takes a business-first approach: catch as many potential defaulters as possible (maximizing Recall, i.e. minimizing Type II errors) while keeping overall accuracy at an acceptable level.


📊 Dataset Information

The project analyzes data from 30,000 credit card clients.

  • Features Used: Demographic data (Age, Education, Marriage), Credit Limits (LIMIT_BAL), Billing Amounts (BILL_AMT), and historical repayment statuses (PAY_X).
  • Target Classification: Class 0 (Safe/No Default) and Class 1 (Default). The risk event of interest, Default (1), is treated as the positive class.

🛠️ Key Technical Insights & Pipeline

1. The Critical Impact of Feature Scaling

A key experiment compared how scale-sensitive algorithms and tree-based algorithms behave on unscaled vs. scaled data:

  • Scale-Sensitive Models:
    • Logistic Regression: With unscaled data, it failed entirely to detect defaults (F1-Score: 0.0) because the large scale of LIMIT_BAL overwhelmed behavioral features like PAY_0. Logistic Regression is not distance-based, but its solver and regularization are sensitive to feature magnitudes. Standardizing the data allowed the model to converge, achieving an F1-Score of 0.36.
    • K-Nearest Neighbors (KNN): The F1-Score rose from 0.24 to 0.42 after scaling, a 75% relative improvement. Scaling prevented the distance calculations from being dominated by the credit limit at the expense of age and payment history.
  • Tree-Based Models (Scale-Invariant): Random Forest performance remained virtually identical across unscaled (F1: 0.4714) and scaled data (F1: 0.4717), consistent with the fact that trees split on the ordering of feature values rather than their absolute scale.
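
The distance-domination effect described for KNN can be illustrated with a minimal, standard-library sketch. The client rows below are hypothetical values chosen for illustration (they are not taken from the dataset), and `standardize` and `squared_distance_shares` are illustrative helper names, not functions from the project code.

```python
import statistics

# Hypothetical clients as (LIMIT_BAL, AGE, PAY_0); illustrative values only.
clients = [
    (20000.0, 24.0, 2.0),
    (120000.0, 26.0, -1.0),
    (90000.0, 34.0, 0.0),
    (50000.0, 37.0, 0.0),
    (500000.0, 29.0, 0.0),
]


def standardize(rows):
    """Z-score each column: subtract the column mean, divide by its std."""
    columns = list(zip(*rows))
    means = [statistics.fmean(col) for col in columns]
    stds = [statistics.pstdev(col) for col in columns]
    return [
        tuple((value - m) / s for value, m, s in zip(row, means, stds))
        for row in rows
    ]


def squared_distance_shares(a, b):
    """Fraction of the squared Euclidean distance contributed by each feature."""
    squared = [(x - y) ** 2 for x, y in zip(a, b)]
    total = sum(squared)
    return [s / total for s in squared]


# Compare the first two clients before and after standardization.
raw_shares = squared_distance_shares(clients[0], clients[1])
scaled = standardize(clients)
scaled_shares = squared_distance_shares(scaled[0], scaled[1])

print("raw shares    (LIMIT_BAL, AGE, PAY_0):", [round(s, 4) for s in raw_shares])
print("scaled shares (LIMIT_BAL, AGE, PAY_0):", [round(s, 4) for s in scaled_shares])
```

For this pair, LIMIT_BAL accounts for practically all of the raw squared distance, while after standardization the difference in PAY_0 contributes the largest share.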

2. Feature Correlation & Mitigating Noise

An analysis of feature relationships revealed strong multicollinearity among the billing variables (BILL_AMT1 to BILL_AMT6), which form a high-redundancy block in the correlation matrix. This violates the conditional independence assumption of simpler models like Naive Bayes. To stabilize modeling, the feature space was narrowed to the strongest behavioral predictors: the historical repayment status variables (PAY_X).
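
As a rough sketch of how such a redundancy block can be detected, the snippet below computes pairwise Pearson correlations and lists pairs above a cutoff. The column values are hypothetical, the 0.9 cutoff is an assumption, and `pearson` and `redundant_pairs` are illustrative helpers rather than the project's own code.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def redundant_pairs(columns, threshold=0.9):
    """Return (name_a, name_b, r) for column pairs with |r| >= threshold."""
    names = list(columns)
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson(columns[a], columns[b])
            if abs(r) >= threshold:
                pairs.append((a, b, round(r, 3)))
    return pairs


# Hypothetical sample values for three columns; illustrative only.
bill_amt1 = [3913.0, 2682.0, 29239.0, 46990.0, 8617.0, 64400.0]
bill_amt2 = [3102.0, 1725.0, 14027.0, 48233.0, 5670.0, 57069.0]
pay_0 = [2.0, -1.0, 0.0, 0.0, -1.0, 0.0]

columns = {"BILL_AMT1": bill_amt1, "BILL_AMT2": bill_amt2, "PAY_0": pay_0}
print(redundant_pairs(columns))
```

With these sample values only the two billing columns are flagged as a redundant pair; the repayment status column is nearly uncorrelated with both.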


📈 Model Comparison & Selection

Optimization focused heavily on two contrasting models: Gaussian Naive Bayes (for its high baseline sensitivity) and Random Forest (for its structural robustness).

  • Why Naive Bayes was Rejected: While it achieved a high baseline recall of 0.63, its overall accuracy was only 71% because it falsely flagged too many safe clients. Threshold testing also showed it was too inflexible for this data: with the decision threshold lowered to 0.3, recall stayed around 41–44%, suggesting the model lacked the capacity to capture subtle risk profiles.
  • Why Random Forest was Selected: Random Forest showed stronger ranking power, with an initial ROC-AUC of 0.786 (compared to 0.727 for Naive Bayes). That made its predicted probabilities a better basis for threshold tuning, deliberately trading some overall accuracy for bank safety.
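
For readers unfamiliar with the ROC-AUC figures quoted above: the metric equals the probability that a randomly chosen defaulter receives a higher score than a randomly chosen non-defaulter, with ties counted as half. The sketch below computes it by direct pairwise comparison; `roc_auc` is an illustrative helper and the labels and scores are toy values, not model output from this project.

```python
def roc_auc(y_true, scores):
    """ROC-AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(y_true, scores) if y == 1]
    negatives = [s for y, s in zip(y_true, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("ROC-AUC needs at least one sample of each class")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy example: 1 = default, 0 = safe; scores are predicted default probabilities.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(toy_labels, toy_scores))  # 3 of 4 pairs ranked correctly -> 0.75
```

Because the metric depends only on how scores are ranked, it is unaffected by the choice of decision threshold, which is why it is useful for comparing models before any threshold tuning.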

🎛️ Business Threshold Optimization

Classifiers typically default to a decision threshold of 0.5, which tends to favor overall accuracy but can be dangerous in risk management. To align with the business goal of minimizing costly Type II errors (predicting a client is safe when they actually default), several decision thresholds were compared:

  • Default Threshold (0.50): Good overall accuracy (78.5%), but risky: it missed 43.5% of all defaults (Recall: 56.5%).
  • Aggressive Safety Threshold (0.30): Caught most defaults (Recall: 87.5%), but generated excessive false alarms, cutting overall accuracy to 55%.
  • Optimal Business Threshold (0.40): Selected as the operating point. It captured more than two thirds of defaulting customers (Recall: 69.0%) while keeping overall accuracy at 72.1%.
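
The recall/accuracy trade-off across thresholds can be sketched as follows. The labels and probabilities are toy values chosen for illustration, `metrics_at_threshold` is a hypothetical helper, and treating a probability equal to the threshold as a positive prediction is an assumption of this sketch.

```python
def metrics_at_threshold(y_true, probabilities, threshold):
    """Return (recall, accuracy) when predicting 1 for probability >= threshold."""
    tp = fp = tn = fn = 0
    for y, p in zip(y_true, probabilities):
        predicted = 1 if p >= threshold else 0
        if predicted == 1 and y == 1:
            tp += 1
        elif predicted == 1 and y == 0:
            fp += 1
        elif predicted == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = (tp + tn) / len(y_true)
    return recall, accuracy


# Toy data: 1 = default, 0 = safe; probabilities are predicted default risk.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
probabilities = [0.85, 0.55, 0.42, 0.31, 0.45, 0.41, 0.35, 0.33, 0.10, 0.05]

for threshold in (0.5, 0.4, 0.3):
    recall, accuracy = metrics_at_threshold(y_true, probabilities, threshold)
    print(f"threshold={threshold:.2f}  recall={recall:.2f}  accuracy={accuracy:.2f}")
```

On this toy data, lowering the threshold raises recall at each step while accuracy falls, which mirrors the direction of the trade-off reported above (the magnitudes are not comparable).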

Final Model Performance (Threshold @ 0.40)

  • Total Samples Evaluated: 6,000 clients
  • True Negatives (Correctly identified as Safe): 3,403
  • True Positives (Correctly caught Defaulting): 924
  • False Negatives (Type II Errors - Missed Defaults): 415
  • False Positives (Type I Errors - False Alarms): 1,258
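
The headline metrics can be re-derived from the four confusion-matrix counts listed above. The sketch below is plain arithmetic on those counts; precision and F1 at this threshold are not reported in the text and are computed here only as a derived illustration.

```python
# Confusion-matrix counts as reported for the 0.40 threshold.
tn, fp, fn, tp = 3403, 1258, 415, 924

recall = tp / (tp + fn)                      # share of actual defaults caught
accuracy = (tp + tn) / (tn + fp + fn + tp)   # share of all predictions correct
precision = tp / (tp + fp)                   # share of flagged clients who default
f1 = 2 * tp / (2 * tp + fp + fn)             # harmonic mean of precision and recall

print(f"recall={recall:.3f} accuracy={accuracy:.3f} "
      f"precision={precision:.3f} f1={f1:.3f}")
```

The recall and accuracy obtained this way round to the 69.0% and 72.1% quoted for the selected threshold.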

🛡️ Validation & Stability

To confirm these results were stable and not the byproduct of a fortunate train/test split, 5-fold cross-validation was performed.

  • Average Cross-Validated AUC: 0.7779
  • Standard Deviation: 0.0046

The low standard deviation across folds suggests that the tuned Random Forest classifier is stable and does not depend heavily on any particular split of this dataset.
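
The mechanics of k-fold cross-validation can be sketched with the standard library alone. This is an illustration of the technique, not the project's code: `k_fold_indices` and `cross_validate` are hypothetical helpers, the split is a plain shuffled (non-stratified) one, and `score_fold` stands in for whatever routine trains a model on the training indices and returns its AUC on the held-out fold.

```python
import random
import statistics


def k_fold_indices(n_samples, n_folds=5, seed=0):
    """Yield (train_indices, test_indices) for a shuffled k-fold split."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal shuffled indices round-robin so fold sizes differ by at most one.
    folds = [indices[i::n_folds] for i in range(n_folds)]
    for i, test_idx in enumerate(folds):
        train_idx = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train_idx, test_idx


def cross_validate(score_fold, n_samples, n_folds=5, seed=0):
    """Return (mean, population std) of score_fold(train, test) over all folds."""
    scores = [
        score_fold(train_idx, test_idx)
        for train_idx, test_idx in k_fold_indices(n_samples, n_folds, seed)
    ]
    return statistics.fmean(scores), statistics.pstdev(scores)


# Placeholder scorer so the sketch runs end to end; a real scorer would fit a
# model on train_idx and evaluate ROC-AUC on test_idx.
def placeholder_scorer(train_idx, test_idx):
    return len(test_idx) / (len(train_idx) + len(test_idx))


mean_score, std_score = cross_validate(placeholder_scorer, n_samples=30000)
print(mean_score, std_score)
```

Every sample lands in exactly one test fold, so each fold score is computed on data the model did not see during that fold's training. The population standard deviation is used here to match NumPy's default `std` behavior; a sample standard deviation would be slightly larger.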

About

In this assignment, I analyzed the "Default of Credit Card Clients" dataset to build a model capable of predicting whether a customer will fail to pay their debt next month. My primary goal was not only to achieve high accuracy but also to solve the specific business problem.
