Ayrie741/README.md

ZHEMIN_XIE

Hi, I’m Zhemin Xie (Ayrie) 👋

🎓 BSc Data Science & AI @ Leiden University | 📍 Leiden (open to relocate)


📝 About me

  • 🛠️ Tech stack: Python · SQL · R · scikit-learn · pandas · NumPy · statsmodels · Power BI · Tableau · Excel · Git
  • 📊 Direction: Data Cleaning & Visualization, Machine Learning Modeling, Risk Analytics, Business Analytics, Data Engineering

💼 Education

  • BSc Data Science & Artificial Intelligence
    Leiden University, NL | Sep 2023 – Jun 2026
  • MBA in Professional Accounting
    Rutgers University, USA | May 2019 – Oct 2020
  • BA Business Administration
    Beijing Normal–Hong Kong Baptist University, CN | Sep 2014 – Jun 2018

🏆 Projects

Purchase Prediction Model

🐍 Python · pandas · NumPy · scikit-learn

  • Exploratory analysis and visualization of datasets
  • RFM feature engineering and an end-to-end scikit-learn Pipeline (StandardScaler → OneHotEncoder → LogisticRegression)
  • Delivered as a Jupyter Notebook demo
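
A minimal sketch of how the StandardScaler → OneHotEncoder → LogisticRegression pipeline could be wired up. The column names (`recency_days`, `frequency`, `monetary`, `channel`) and the synthetic data are assumptions for illustration only, and routing numeric and categorical columns through a `ColumnTransformer` is one common arrangement, not necessarily the one used in the notebook.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical toy data: three RFM columns plus one categorical column.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "recency_days": rng.integers(1, 365, n),
    "frequency": rng.integers(1, 30, n),
    "monetary": rng.gamma(2.0, 50.0, n),
    "channel": rng.choice(["web", "app", "store"], n),
})
# Synthetic label: recent, frequent customers are more likely to purchase.
logit = 0.1 * df["frequency"] - 0.01 * df["recency_days"]
y = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)

# Scale numeric RFM features, one-hot encode the categorical feature.
preprocess = ColumnTransformer([
    ("num", StandardScaler(), ["recency_days", "frequency", "monetary"]),
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["channel"]),
])
model = Pipeline([
    ("preprocess", preprocess),
    ("clf", LogisticRegression(max_iter=1000)),
])
model.fit(df, y)
proba = model.predict_proba(df)[:, 1]  # predicted purchase probability
```

Keeping preprocessing inside the pipeline means the scaler and encoder are fitted only on the data passed to `fit`, which avoids leaking test-set statistics during cross-validation.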

Bank Customer Churn Prediction

🐍 Python · pandas · scikit-learn · SHAP · Matplotlib · Seaborn

  • Built a leakage-aware customer churn prediction pipeline for bank customer retention analysis
  • Diagnosed unrealistic near-perfect model performance caused by a complaint-related feature and rebuilt the model under a more realistic setting
  • Compared Logistic Regression, Random Forest, and Gradient Boosting models using precision, recall, F1, ROC-AUC, and PR-AUC
  • Applied threshold tuning, cost-sensitive analysis, and customer segmentation to translate model outputs into retention strategy insights
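
As an illustration of the threshold tuning and cost-sensitive analysis mentioned above, the sketch below scans candidate thresholds and picks the one with the lowest total misclassification cost. The function names, the cost values (a missed churner costing 5x a wasted retention offer), and the toy scores are all assumptions, not figures from the project.

```python
import numpy as np

def expected_cost(y_true, proba, threshold, cost_fn, cost_fp):
    """Total cost of false negatives and false positives at one threshold."""
    pred = proba >= threshold
    fn = np.sum((y_true == 1) & ~pred)  # churners we failed to flag
    fp = np.sum((y_true == 0) & pred)   # loyal customers flagged by mistake
    return cost_fn * fn + cost_fp * fp

def best_threshold(y_true, proba, cost_fn=5.0, cost_fp=1.0):
    """Scan thresholds from 0.05 to 0.95 and return the cheapest one."""
    thresholds = np.linspace(0.05, 0.95, 19)
    costs = [expected_cost(y_true, proba, t, cost_fn, cost_fp) for t in thresholds]
    i = int(np.argmin(costs))
    return float(thresholds[i]), float(costs[i])

# Hypothetical labels (1 = churned) and model scores.
y_true = np.array([0, 0, 0, 0, 1, 0, 1, 1])
proba = np.array([0.12, 0.22, 0.31, 0.37, 0.42, 0.61, 0.72, 0.91])
threshold, cost = best_threshold(y_true, proba)
default_cost = expected_cost(y_true, proba, 0.5, 5.0, 1.0)
```

On this toy data the cost-minimizing threshold sits below the default 0.5, because missing a churner is assumed to be more expensive than contacting a loyal customer.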

Bank Card Fraud Detection Analysis

🐍 Python · pandas · NumPy · Power BI

  • Preprocessed credit card transaction data
  • Trained and evaluated anomaly detection models
  • Built visual dashboards to show anomaly trends
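
The project description does not name a specific detector, so the following is only a sketch of what anomaly detection on transaction features can look like. It uses scikit-learn's `IsolationForest` as a stand-in, with synthetic two-dimensional data and an assumed contamination rate of 2%.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

# Synthetic stand-in for transaction features: 500 typical points
# plus 10 injected outliers far from the bulk of the data.
rng = np.random.default_rng(42)
normal = rng.normal(loc=0.0, scale=1.0, size=(500, 2))
outliers = rng.uniform(low=6.0, high=10.0, size=(10, 2))
X = np.vstack([normal, outliers])

# contamination is an assumed prior on the share of anomalies.
detector = IsolationForest(n_estimators=100, contamination=0.02, random_state=0)
detector.fit(X)

scores = detector.score_samples(X)  # lower score = more anomalous
flags = detector.predict(X)         # -1 = anomaly, 1 = normal
```

The per-transaction `scores` are the kind of output that could feed a dashboard of anomaly trends over time.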

Credit Risk Scorecard for Loan Default Risk

🐍 Python · pandas · NumPy · scikit-learn · statsmodels · Matplotlib · Seaborn

  • Built an interpretable credit risk scorecard using LendingClub loan data with manually implemented WOE, IV, PSI, and score scaling
  • Designed a leakage-aware modeling pipeline by removing post-origination variables and using time-based train / validation / test splitting
  • Trained a logistic regression scorecard with 0.663 test ROC-AUC, 0.412 PR-AUC, and 0.234 KS on out-of-time test data
  • Converted predicted default probabilities into credit scores and risk bands, showing around 5x bad-rate difference between highest-risk and lowest-risk bands
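
For readers unfamiliar with scorecard terms, here is a hedged sketch of WOE, IV, and score scaling. It assumes the convention WOE = ln(share of goods / share of bads), additive smoothing of 0.5 per bin, and illustrative scaling parameters (600 points at 50:1 good-to-bad odds, 20 points to double the odds). None of these values come from the project itself.

```python
import numpy as np
import pandas as pd

def woe_iv(bins, bad, eps=0.5):
    """Weight of Evidence per bin and total Information Value.

    bins: bin label per loan; bad: 1 for default, 0 otherwise.
    eps is an assumed smoothing constant that avoids log(0) in empty cells.
    """
    df = pd.DataFrame({"bin": bins, "bad": bad})
    grouped = df.groupby("bin")["bad"].agg(bad="sum", total="count")
    grouped["good"] = grouped["total"] - grouped["bad"]
    dist_good = (grouped["good"] + eps) / (grouped["good"] + eps).sum()
    dist_bad = (grouped["bad"] + eps) / (grouped["bad"] + eps).sum()
    grouped["woe"] = np.log(dist_good / dist_bad)
    iv = float(((dist_good - dist_bad) * grouped["woe"]).sum())
    return grouped, iv

def probability_to_score(p_default, base_score=600.0, base_odds=50.0, pdo=20.0):
    """Map a default probability to a score; higher score = lower risk."""
    factor = pdo / np.log(2)
    offset = base_score - factor * np.log(base_odds)
    odds_good = (1 - p_default) / p_default
    return offset + factor * np.log(odds_good)

# Toy example: bin A is mostly good loans, bin B mostly bad.
bins = ["A"] * 4 + ["B"] * 4
bad = [0, 0, 0, 1, 1, 1, 1, 0]
table, iv = woe_iv(bins, bad)
score_at_base = probability_to_score(1 / 51)    # odds of 50:1
score_doubled = probability_to_score(1 / 101)   # odds of 100:1
```

Under these assumptions, doubling the good-to-bad odds adds exactly `pdo` points, which is what makes score bands easy to interpret.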

Retail Demand Forecasting with Hierarchical Time Series

🐍 Python · pandas · NumPy · Darts · scikit-learn · Matplotlib

  • Built a 561-component hierarchical retail forecasting pipeline across total, store, item, and store-item sales levels
  • Compared Seasonal Naive and Linear Regression models, reducing Total MAPE from 34.98% to 6.80%
  • Applied forecast reconciliation, with Top-Down improving Store-item MAPE from 20.05% to 15.50%
  • Identified high-risk store-item demand segments to support inventory replenishment analysis
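
The project used Darts, but the core ideas can be sketched in plain NumPy. The snippet below shows a seasonal naive forecast, MAPE, and one simple top-down variant that splits a total forecast by each series' historical share. The function names and toy numbers are assumptions, and the proportion rule is only one of several top-down schemes.

```python
import numpy as np

def mape(actual, forecast):
    """Mean absolute percentage error, in percent (actuals must be nonzero)."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    return float(np.mean(np.abs((actual - forecast) / actual)) * 100)

def seasonal_naive(history, season_length, horizon):
    """Repeat the last observed season to cover the forecast horizon."""
    history = np.asarray(history, dtype=float)
    last_season = history[-season_length:]
    reps = int(np.ceil(horizon / season_length))
    return np.tile(last_season, reps)[:horizon]

def top_down(total_forecast, bottom_history):
    """Split a total forecast across bottom series by historical share.

    bottom_history has shape (n_series, n_timesteps).
    """
    bottom_history = np.asarray(bottom_history, dtype=float)
    shares = bottom_history.sum(axis=1) / bottom_history.sum()
    return np.outer(shares, np.asarray(total_forecast, dtype=float))

forecast = seasonal_naive([10, 20, 30, 10, 20, 30], season_length=3, horizon=4)
error = mape([100, 200], [110, 180])
reconciled = top_down([8, 12], [[1, 1], [3, 3]])
```

By construction the reconciled bottom-level forecasts sum back to the total at every time step, which is the coherence property reconciliation is meant to guarantee.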

Profit-Aware Price Optimization with Demand Modeling

🐍 Python · pandas · NumPy · scikit-learn · Matplotlib · Jupyter

  • Built a profit-aware pricing optimization workflow using 2,800 historical sales records
  • Compared Ridge, Random Forest, and Gradient Boosting demand models, selecting Random Forest with around 25.60% SMAPE
  • Recommended product-level prices under observed price ranges and a ±15% price-change guardrail
  • Identified major pricing opportunities for Carretera, Paseo, and Velo based on predicted profit uplift
  • Added segment-level pricing, discount simulation, robust optimization, and A/B test rollout planning
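
A minimal sketch of price recommendation under the observed price range and the ±15% guardrail. The linear `demand` function is a hypothetical stand-in for the fitted demand model, and the prices, unit cost, and grid size are illustrative assumptions.

```python
import numpy as np

def recommend_price(current_price, unit_cost, demand_fn,
                    observed_min, observed_max,
                    max_change=0.15, n_grid=61):
    """Grid-search the profit-maximizing price inside both constraints."""
    # Candidate prices must stay within the observed range AND
    # within +/- max_change of the current price.
    low = max(observed_min, current_price * (1 - max_change))
    high = min(observed_max, current_price * (1 + max_change))
    candidates = np.linspace(low, high, n_grid)
    profit = (candidates - unit_cost) * demand_fn(candidates)
    best = int(np.argmax(profit))
    return float(candidates[best]), float(profit[best])

def demand(price):
    """Hypothetical linear demand curve, floored at zero units."""
    return np.maximum(0.0, 1000.0 - 40.0 * price)

price, profit = recommend_price(
    current_price=10.0, unit_cost=4.0, demand_fn=demand,
    observed_min=8.0, observed_max=20.0,
)
```

With this toy demand curve the unconstrained optimum would be 14.5, so the recommendation lands on the upper guardrail instead. That is the intended behavior: the guardrail keeps recommendations close to prices the model has actually seen.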

📫 Contacts

Thanks for visiting! 🚀

Pinned

  1. Bank-card-fraud-detection-analysis (Public)

  2. Prediction-of-purchasing-behavior (Public, Jupyter Notebook)