feat(karthik_model): neural network training on NBA CSV data by Kravi001 · Pull Request #13 · JonathanPLev/TransformerPredictionModel

Kravi001 · 2026-02-10T08:34:34Z

What I changed

Added karthik_model.py in src/mini_nn/ to train a neural network on NBA CSV data.
Loads data from Data/PlayerStatistics.csv, preprocesses features, handles NaNs, and manages class imbalance.
Trains a model to predict whether a player scores ≥20 points and saves predictions (gitignored).

Why I changed it

Provides a fully working neural network pipeline for NBA CSV data.
Allows generating predictions without connecting to a database.

How did I test it

Ran the script locally on the full dataset (1,655,736 rows, 19 features).
Observed class imbalance: 0 → 0.8699, 1 → 0.1301; positive class weight = 6.69.
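The reported positive class weight is consistent with the common convention of dividing the negative-class count by the positive-class count. The sketch below is an illustration only: the PR text does not say how `karthik_model.py` computes the weight, and the helper name `positive_class_weight` is hypothetical.

```python
def positive_class_weight(labels):
    """Return n_negative / n_positive for a sequence of 0/1 labels.

    Hypothetical helper: one common way to weight the positive class
    in a binary loss. Not confirmed to be what the PR's script does.
    """
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0:
        raise ValueError("no positive examples; weight is undefined")
    return n_neg / n_pos


# The same ratio computed from the class fractions reported above.
weight_from_fractions = 0.8699 / 0.1301
print(round(weight_from_fractions, 2))  # 6.69
```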
Training metrics:
- Epoch 10 | Train Loss: 0.1343 | Val F1: 0.9847
- Epoch 20 | Train Loss: 0.0942 | Val F1: 0.9400
- Epoch 30 | Train Loss: 0.0690 | Val F1: 0.9919
- Epoch 40 | Train Loss: 0.0624 | Val F1: 0.9946
- Epoch 50 | Train Loss: 0.0418 | Val F1: 0.9990
- Epoch 60 | Train Loss: 0.0330 | Val F1: 0.9997
- Epoch 70 | Train Loss: 0.0264 | Val F1: 1.0000
Early stopping triggered at epoch 72.
Test results:
- Best Epoch: 57
- Test Accuracy: 0.9999
- Test F1 Score: 0.9997
- Test ROC-AUC: 1.0000
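The gap between the best epoch (57) and the stopping epoch (72) would match patience-based early stopping with a patience of 15, although the PR does not state the patience or which validation metric is monitored. A minimal sketch, assuming a higher-is-better score and a hypothetical `EarlyStopping` class:

```python
class EarlyStopping:
    """Stop training after `patience` consecutive epochs without improvement.

    Hypothetical sketch: assumes a higher-is-better validation score.
    The metric and patience used in the PR are not stated.
    """

    def __init__(self, patience):
        self.patience = patience
        self.best_score = float("-inf")
        self.best_epoch = None
        self.epochs_without_improvement = 0

    def step(self, epoch, score):
        """Record one validation score; return True when training should stop."""
        if score > self.best_score:
            # Strict improvement: remember this epoch and reset the counter.
            self.best_score = score
            self.best_epoch = epoch
            self.epochs_without_improvement = 0
        else:
            self.epochs_without_improvement += 1
        return self.epochs_without_improvement >= self.patience
```

In a training loop, `step` would be called once per epoch after validation, and the weights saved at `best_epoch` restored before evaluating on the test set.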

JonathanPLev

Can you add your results? I'm curious to know how well this model did.

Edit: sorry, I see them now. How are you getting near-100% test accuracy? I think you have data leakage or overfitting.

JonathanPLev · 2026-02-10T17:44:00Z

+    # -------------------------
+    X_temp, X_test, y_temp, y_test = train_test_split(X, y, test_size=0.15, random_state=42)
+    X_train, X_val, y_train, y_val = train_test_split(X_temp, y_temp, test_size=0.176, random_state=42)
+


You definitely have data leakage here because you are using a random split.
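One alternative to a random split is a chronological one, where every validation game is later than every training game and every test game is later than both. The sketch below uses plain Python and assumes each row carries an ISO-formatted date under a hypothetical `game_date` key; the actual column name in `Data/PlayerStatistics.csv` is not shown in this PR.

```python
def chronological_split(rows, date_key="game_date", val_frac=0.15, test_frac=0.15):
    """Split rows into train/val/test by date, with no date shared between sets.

    `rows` is a list of dicts. `date_key` is a hypothetical column holding
    ISO dates (YYYY-MM-DD), which sort correctly as strings. The fractions
    apply to the number of distinct dates, not to the number of rows.
    """
    dates = sorted({row[date_key] for row in rows})
    n = len(dates)
    n_test = max(1, round(n * test_frac))
    n_val = max(1, round(n * val_frac))
    if n_test + n_val >= n:
        raise ValueError("not enough distinct dates for a three-way split")

    val_start = dates[n - n_test - n_val]   # first validation date
    test_start = dates[n - n_test]          # first test date

    train = [r for r in rows if r[date_key] < val_start]
    val = [r for r in rows if val_start <= r[date_key] < test_start]
    test = [r for r in rows if r[date_key] >= test_start]
    return train, val, test
```

Splitting on distinct dates keeps all rows from one game day in the same set, so the model is never evaluated on a day it has partially seen.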

JonathanPLev · 2026-02-10T17:46:06Z

+    # -------------------------
+    # SPLIT
+    # -------------------------
+    X_temp, X_test, y_temp, y_test = train_test_split(X, y, test_size=0.15, random_state=42)


You're also not dropping any of the features that leak the outcome: you aren't removing stats such as points, or anything else from the current game.
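One way to address this is to build features only from games played before the one being predicted, and to use the current game's points solely for the label. The sketch below is a plain-Python illustration; the keys `player_id`, `game_date`, and `points`, the helper name `build_examples`, and the five-game window are all assumptions, not names taken from the PR.

```python
from collections import defaultdict, deque


def build_examples(rows, window=5):
    """Return (features, label) pairs that use only prior-game information.

    Hypothetical sketch: `rows` is a list of dicts with assumed keys
    `player_id`, `game_date` (ISO string), and `points`.
    """
    history = defaultdict(lambda: deque(maxlen=window))
    examples = []
    for row in sorted(rows, key=lambda r: (r["player_id"], r["game_date"])):
        past = history[row["player_id"]]
        features = {
            "player_id": row["player_id"],
            "game_date": row["game_date"],
            # Mean points over up to `window` earlier games; None for a debut.
            "prior_points_avg": sum(past) / len(past) if past else None,
        }
        # The current game's points appear only in the label.
        label = int(row["points"] >= 20)
        examples.append((features, label))
        # Update history after the features are built, so the current game
        # never feeds its own prediction.
        past.append(row["points"])
    return examples
```

The same pattern extends to other box-score stats: each one is either dropped or replaced by a lagged or rolling version computed from earlier games.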

Kravi001 requested a review from JonathanPLev as a code owner February 10, 2026 08:34

feat(mini_nn): add karthik_model.py for NBA CSV neural network

1cbbb6d

Kravi001 force-pushed the karthik-neural-networks branch from 22dc60f to 1cbbb6d Compare February 10, 2026 08:36

JonathanPLev changed the title ~~Add karthik_model.py: neural network training on NBA CSV data~~ feat(karthik_model): neural network training on NBA CSV data Feb 10, 2026

JonathanPLev requested changes Feb 10, 2026


JonathanPLev reviewed Feb 10, 2026


JonathanPLev requested changes Feb 10, 2026


Kravi001 wants to merge 1 commit into main from karthik-neural-networks

3 participants
