GitHub - dkittelson/aurelius

Datasets

Elliptic Dataset

Phase 1: Data Preprocessing & Graph Construction

builder.py:

Reads CSVs
Assigns each transaction an index (0, 1, 2, ...)
Creates the feature tensor x for every transaction node, with shape [203k nodes, 166 features]
Creates edge_index tensor of shape [2, num_edges] -- "node A connects to node B"
Creates labels y -- 0 for licit, 1 for illicit, -1 for unknown
Creates temporal masks --> timesteps 1-34 (train), 35-42 (val), 43-49 (test)
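
The steps above can be sketched in plain Python. This is a minimal illustration, not the code in builder.py: the toy rows, the raw label encoding ("1" illicit, "2" licit, "unknown"), and the helper name split_of are assumptions, and timestep 42 is assumed to belong to validation.

```python
# Hypothetical toy rows; the real pipeline reads these from the Elliptic CSVs.
tx_ids = [1001, 1002, 1003, 1004]          # raw transaction ids
timesteps = [1, 34, 40, 45]                # one timestep per transaction
raw_labels = ["2", "1", "unknown", "2"]    # assumed raw class encoding
raw_edges = [(1001, 1002), (1002, 1003), (1003, 1004)]

# Assign each transaction a contiguous index: 0, 1, 2, ...
index_of = {tx: i for i, tx in enumerate(tx_ids)}

# edge_index in [2, num_edges] layout: row 0 holds sources, row 1 holds targets.
edge_index = [
    [index_of[a] for a, _ in raw_edges],
    [index_of[b] for _, b in raw_edges],
]

# Labels y: 0 for licit, 1 for illicit, -1 for unknown.
LABEL_MAP = {"2": 0, "1": 1, "unknown": -1}
y = [LABEL_MAP[r] for r in raw_labels]


def split_of(timestep):
    """Map a timestep to its split (assumes 42 is validation, 43+ is test)."""
    if timestep <= 34:
        return "train"
    if timestep <= 42:
        return "val"
    return "test"


# Boolean masks, one entry per node.
train_mask = [split_of(t) == "train" for t in timesteps]
val_mask = [split_of(t) == "val" for t in timesteps]
test_mask = [split_of(t) == "test" for t in timesteps]
```

Splitting by timestep rather than at random keeps later transactions out of training, so validation and test scores reflect performance on unseen future data.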

features.py:

PageRank: How important is a node based on its connections --> hub accounts used for layering have high PageRank
Degree centrality: How many direct connections a node has --> smurfing (many small deposits into one account) shows up as high degree
Betweenness centrality: How often a node sits on the shortest paths between others --> layering accounts lie on the path between dirty source and clean destination
Closeness centrality: How quickly a node can reach all others --> highly connected laundering hubs have short paths to everyone
Local clustering coefficient: Are a node's neighbors connected to each other? --> legit businesses cluster
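
Two of these features are simple enough to compute by hand. The sketch below does so on a tiny undirected toy graph. The graph and function names are illustrative assumptions, and features.py may well use a graph library instead.

```python
from itertools import combinations

# Hypothetical toy graph: a triangle (0, 1, 2) with a tail node 3 hanging off node 2.
edges = [(0, 1), (0, 2), (1, 2), (2, 3)]
nodes = sorted({n for edge in edges for n in edge})

# Undirected adjacency sets.
neighbors = {n: set() for n in nodes}
for a, b in edges:
    neighbors[a].add(b)
    neighbors[b].add(a)


def degree_centrality(n):
    """Fraction of the other nodes that n is directly connected to."""
    return len(neighbors[n]) / (len(nodes) - 1)


def clustering_coefficient(n):
    """Fraction of pairs of n's neighbors that are themselves connected."""
    k = len(neighbors[n])
    if k < 2:
        return 0.0
    linked = sum(1 for u, v in combinations(neighbors[n], 2) if v in neighbors[u])
    return linked / (k * (k - 1) / 2)


for n in nodes:
    print(n, degree_centrality(n), clustering_coefficient(n))
```

Node 2 touches every other node, so it has the highest degree centrality, but only one of its three neighbor pairs is connected, so its clustering coefficient is lower than that of nodes 0 and 1.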

Phase 2: GNN Model & Training

gnn_model.py:

Attention Heads: Each layer runs 4 independent "attention heads" --> each learns a different way to weigh a node's neighbors
Jumping Knowledge: Concatenate outputs from all 3 layers (layer 1: 1-hop neighbors; layer 2: 2-hop neighbors; layer 3: 3-hop neighbors) --> combines local and long-range signals
Residual Connections: Each layer adds its input back to its output --> helps prevent gradients from vanishing
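
A rough way to see how residual connections and jumping knowledge fit together is the sketch below. It is not the model in gnn_model.py: uniform mean aggregation stands in for the learned multi-head attention, there are no trainable weights, and all names are hypothetical.

```python
def mean_aggregate(h, neighbors):
    """Average each node's neighbor vectors (uniform weights, not learned attention)."""
    out = []
    for n in range(len(h)):
        nbrs = neighbors[n] or [n]  # isolated nodes fall back to themselves
        dim = len(h[n])
        out.append([sum(h[m][d] for m in nbrs) / len(nbrs) for d in range(dim)])
    return out


def layer(h, neighbors):
    """One message-passing layer with a residual connection."""
    agg = mean_aggregate(h, neighbors)
    # Residual: add the layer input back to the aggregated output.
    return [[a + x for a, x in zip(agg_row, h_row)] for agg_row, h_row in zip(agg, h)]


def encode(x, neighbors, num_layers=3):
    """Run the layers and concatenate every layer's output per node."""
    h, per_layer = x, []
    for _ in range(num_layers):
        h = layer(h, neighbors)
        per_layer.append(h)
    # Jumping knowledge (concatenation): embedding width = num_layers * layer width.
    return [sum((out[n] for out in per_layer), []) for n in range(len(x))]


# Hypothetical path graph 0 - 1 - 2 with a signal only on node 0.
x = [[1.0], [0.0], [0.0]]
neighbors = {0: [1], 1: [0, 2], 2: [1]}
emb = encode(x, neighbors)
print(emb[2])
```

In this toy run, node 2's first entry stays at zero because node 0 is two hops away, and the signal only appears from the second layer on. The width arithmetic is also worth noting: three concatenated layers of 128 dimensions each would give the 384-dim embedding mentioned for classifier.py, though the per-layer width is an assumption here.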

classifier.py:

Takes the 384-dim GNN embedding per node and concatenates it with the original 166 features, giving a 550-dim tabular row
XGBoost learns feature splits the GNN can't learn (sharp dollar thresholds, exchange rate ratios)
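
The row construction is plain concatenation, as the following sketch shows. The function name build_row and the placeholder values are assumptions, and the XGBoost training call itself is left out.

```python
GNN_DIM = 384   # embedding width per node
RAW_DIM = 166   # original transaction features per node


def build_row(embedding, raw_features):
    """Concatenate one node's GNN embedding with its original features."""
    if len(embedding) != GNN_DIM or len(raw_features) != RAW_DIM:
        raise ValueError("unexpected input width")
    return list(embedding) + list(raw_features)


# Placeholder values for a single node.
row = build_row([0.0] * GNN_DIM, [1.0] * RAW_DIM)
print(len(row))  # 384 + 166 = 550
```

Stacking one such row per labeled node gives the tabular matrix a gradient-boosted tree model can train on. Keeping the raw columns next to the embedding lets the trees split directly on an original feature value.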

train.py:

Each training step loads 2048 seed nodes, then samples up to 15 neighbors per node at hop 1, 10 at hop 2, 5 at hop 3
Stops early if there is no improvement for 20 epochs
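
The sampling and stopping rules can be sketched without any deep learning library. The code below is an illustrative assumption rather than the contents of train.py: sample_subgraph and EarlyStopper are hypothetical names, and "improvement" is taken to mean a higher validation score.

```python
import random

BATCH_SIZE = 2048      # seed nodes per training step
FANOUT = [15, 10, 5]   # max neighbors sampled per node at hops 1, 2, 3
PATIENCE = 20          # epochs without improvement before stopping


def sample_subgraph(seeds, neighbors, fanout=FANOUT, rng=random):
    """Return the set of nodes reached by hop-by-hop neighbor sampling."""
    visited = set(seeds)
    frontier = list(seeds)
    for k in fanout:
        next_frontier = []
        for node in frontier:
            nbrs = neighbors.get(node, [])
            # Keep all neighbors if there are at most k, otherwise sample k of them.
            picked = nbrs if len(nbrs) <= k else rng.sample(nbrs, k)
            next_frontier.extend(picked)
        visited.update(next_frontier)
        frontier = next_frontier
    return visited


class EarlyStopper:
    """Signal a stop after `patience` consecutive epochs without a new best score."""

    def __init__(self, patience=PATIENCE):
        self.patience = patience
        self.best = float("-inf")
        self.bad_epochs = 0

    def should_stop(self, score):
        if score > self.best:
            self.best, self.bad_epochs = score, 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


# Hypothetical star graph: node 0 has 30 neighbors, so hop 1 keeps only 15 of them.
star = {0: list(range(1, 31))}
print(len(sample_subgraph([0], star)))
```

Capping the fan-out bounds the size of each mini-batch subgraph at roughly 2048 * (1 + 15 + 15*10 + 15*10*5) nodes in the worst case, which keeps memory use predictable even around very high-degree hubs.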

