A data analytics project that studies loan default behaviour across different borrower profiles. Using SQL, Power BI, and Excel/CSV, this project analyzes which customer segments are more likely to default and builds an interactive dashboard to support better credit risk decisions.
The objective of this project is to understand patterns behind loan defaults and identify key risk factors such as income level, loan amount, credit-to-income ratio, employment status, and demographics. The dashboard helps credit teams quickly see which applicants are high risk and how defaults are distributed across segments.
Lending institutions need to balance growth with risk. Approving too many risky loans increases default losses, while being too strict reduces revenue and customer acquisition. This project answers questions like:
- Which borrower groups have the highest default rates?
- How do income, loan amount, and EMI-to-income ratio affect default risk?
- Are certain occupations, age groups, or regions more likely to default?
- What profile characteristics are common among good vs. bad borrowers?
- SQL — data extraction, joins, aggregations, risk calculations
- Power BI — interactive risk dashboard and visual storytelling
- DAX — measures for default rate, approval rate, and risk scores
- Excel / CSV — data storage and initial cleaning
The dataset contains historical loan application records with fields such as:
Application_IDCustomer_IDAgeGenderIncomeLoan_AmountTenureEMIorInstallmentCredit_to_Income_RatioEmployment_TypeMarital_StatusRegionDefault_Flag(0 = No Default, 1 = Default)
A detailed description of the dataset columns can be found in dataset/data_description.md (add or rename this file if needed).
loan-default-risk-analysis/
├── README.md
├── dataset/
│ ├── borrower_profiles.csv
│ ├── loan_applications.csv
│ └── data_description.md
├── dashboard/
│ └── Loan_Default_Risk_Dashboard.pbix
├── queries/
│ └── loan_default_analysis.sql
└── screenshots/
├── overview.png
├── risk_by_segment.png
├── income_vs_default.png
└── region_default_rate.png
Key SQL analyses performed in this project include:
- Overall default rate across all loans
- Default rate by income band, age group, and employment type
- Relationship between
credit_to_income_ratioand default - Default rate by loan amount and tenure
- Region-wise distribution of good vs. bad loans
All important queries are stored in queries/loan_default_analysis.sql.
The Power BI dashboard provides:
- KPI cards for total applications, total approved loans, default rate, and high‑risk share
- Overview page showing good vs. bad loans and key summary metrics
- Risk by segment: default rate by income band, age group, employment type, and marital status
- Income vs. default: charts linking income, EMI burden, and default behaviour
- Region-wise risk map or bar chart showing where default is more common
- Slicers to filter by time period, region, employment type, and income band
Example insights you can refine based on your data:
- Borrowers with high credit-to-income ratio show significantly higher default rates.
- Certain income bands and age groups are more prone to default than others.
- Self‑employed or unstable employment categories have a higher share of bad loans.
- Some regions show consistently higher default rates, indicating location-based risk.
Update these bullets with your real numbers once you review your dashboard.
This analysis and dashboard can help lenders:
- Identify risky applicant profiles before loan approval
- Design better credit policies and eligibility rules
- Adjust interest rates or credit limits based on risk level
- Focus collections efforts on segments with high probability of default
- Support more transparent, data-driven lending decisions
- Download the dataset from the
datasetfolder. - Open the
Loan_Default_Risk_Dashboard.pbixfile from thedashboardfolder in Power BI Desktop. - If needed, update the data source paths to point to your local CSV files.
- Refresh the data and explore the visuals using the available filters.
- Data cleaning and preparation for risk modelling
- SQL joins, aggregations, and calculated fields for risk metrics
- Building interactive dashboards in Power BI
- Creating DAX measures for default rate and risk segmentation
- Communicating credit risk insights to non-technical stakeholders
Subhankar Das
Aspiring Data Analyst | SQL | Power BI | Risk Analytics
- GitHub: subhankar-das18
- LinkedIn: www.linkedin.com/in/subhankar-das-the-analyst
Feedback and suggestions are welcome through issues or pull requests.



