Thanks for your interest in contributing. GraphNetz is a research-grade benchmarking framework, so the bar for new code is correctness, statistical honesty, and clarity — in that order.
- No silent baselines. Every cell in `BENCHMARK_TASKS` must have a real held-out metric (test accuracy, test AUC, validation MAE). Self-supervised losses are not benchmark metrics; use `train_dgi` / `DGIWrapper` as a pre-training utility instead.
- Statistics first. New evaluation paths must thread through the multi-seed pipeline so the report still produces per-cell CIs, Holm-corrected pairwise tests, and Friedman–Nemenyi diagrams without bespoke code.
- Determinism. Seed every RNG (`torch`, `numpy`, Python `random`). A run with the same seed list and software stack must reproduce bit-for-bit on the same hardware.
- Small, focused PRs. One loader, one model, or one bug per PR. Keep unrelated reformatting out.
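The determinism rule above can be sketched as a single seeding helper. This is a minimal sketch, not GraphNetz's own code: `seed_everything` is a hypothetical name, and the optional imports are an assumption made so the snippet also runs where NumPy or PyTorch is missing.

```python
import random


def seed_everything(seed: int) -> None:
    """Seed the Python, NumPy, and PyTorch RNGs from one integer (hypothetical helper)."""
    random.seed(seed)

    # NumPy and PyTorch are imported lazily so the sketch still runs
    # in an environment where they are not installed.
    try:
        import numpy as np
    except ImportError:
        np = None
    if np is not None:
        np.random.seed(seed)

    try:
        import torch
    except ImportError:
        torch = None
    if torch is not None:
        torch.manual_seed(seed)           # default CPU generator
        torch.cuda.manual_seed_all(seed)  # silently ignored without CUDA
        # Raise an error on ops that have no deterministic implementation.
        torch.use_deterministic_algorithms(True)
```

Calling such a helper once at the start of each run, with each entry of the seed list in turn, keeps all three RNGs aligned. Strict determinism can make some PyTorch ops raise, which is usually the point: the failure surfaces in development rather than as irreproducible numbers in the report.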
    git clone https://github.com/quant-sci/graphnetz
    cd graphnetz
    uv sync --group dev    # install + dev dependencies
    uv run pytest          # smoke tests
    uv run ruff check      # lint
    PYTHONPATH=src uv run python examples/experiment.py    # regenerate paper figures

- Pick the right category module under `src/graphnetz/datasets/` (or open an issue if a new category is needed; the taxonomy is intentionally small).
- Add a thin loader function that returns a PyG dataset. Keep it stateless, one network per call. Examples in `social.py` and `biology.py`.
- Register it in `LOADER_REGISTRY` (in `src/graphnetz/datasets/__init__.py`) under each task it can serve. A single loader may appear under multiple tasks (Cora is both `node_cls` and `link_pred`).
- If the loader is appropriate for the curated benchmark, add a `Task(...)` to `BENCHMARK_TASKS` in `src/graphnetz/benchmark.py` and pick an epoch budget that converges on a laptop.
- Add a one-line entry in `tests/test_smoke.py` so the loader is exercised in CI.
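The loader and registration steps above might look roughly like the sketch below. The layout of `LOADER_REGISTRY` is an assumption here (a dict keyed by task name, then by dataset name); check `src/graphnetz/datasets/__init__.py` for the real structure. The loader name `load_cora` and the default `root` path are hypothetical as well.

```python
from typing import Any, Callable

# Hypothetical stand-in for graphnetz.datasets.LOADER_REGISTRY.
# Assumed layout: task name -> dataset name -> loader callable.
LOADER_REGISTRY: dict[str, dict[str, Callable[..., Any]]] = {
    "node_cls": {},
    "link_pred": {},
}


def load_cora(root: str = "data/Planetoid"):
    """Return the Cora citation network as a PyG dataset."""
    # Imported inside the function so this sketch can be imported
    # without PyTorch Geometric installed.
    from torch_geometric.datasets import Planetoid

    return Planetoid(root=root, name="Cora")


# One stateless loader, registered under every task it can serve.
for task in ("node_cls", "link_pred"):
    LOADER_REGISTRY[task]["cora"] = load_cora
```

Registering the same callable under both tasks, rather than writing two loaders, keeps the dataset definition in one place.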
The benchmark dispatches by task, not by model name, so models declare which tasks they support.

    import torch

    from graphnetz import register_model


    @register_model(task_type={"node_cls", "graph_cls"})
    class MyGNN(torch.nn.Module):
        def __init__(self, in_channels, hidden_channels, out_channels):
            ...

The default factory calls `cls(in_channels, hidden_channels, out_channels)`.
For non-standard signatures, pass a `factory=` callable to `register_model`.

If the model is a node-level encoder that should also work as a
graph-classifier, regressor, or link-predictor, prefer wrapping it with
`graphnetz.benchmark._multi_task_factory` rather than maintaining a separate
implementation per task.
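To illustrate the `factory=` escape hatch for non-standard signatures, here is a self-contained toy version of the pattern. The `register_model` below is a stand-in written for this example, not GraphNetz's implementation; `MODEL_REGISTRY`, `DeepGNN`, and `make_deep_gnn` are hypothetical names, and it is assumed that the benchmark calls the factory with the same three arguments as the default.

```python
from typing import Callable, Optional

# Hypothetical stand-in for the real registry; GraphNetz's internals may differ.
MODEL_REGISTRY: dict[str, dict] = {}


def register_model(task_type: set[str], factory: Optional[Callable] = None):
    """Toy decorator mirroring the documented task_type= / factory= keywords."""

    def decorator(cls):
        MODEL_REGISTRY[cls.__name__] = {
            "tasks": set(task_type),
            # Default factory: call the class with the three standard arguments.
            "factory": factory or (lambda i, h, o: cls(i, h, o)),
        }
        return cls

    return decorator


def make_deep_gnn(in_channels, hidden_channels, out_channels):
    # The extra constructor argument is fixed here, so the caller can keep
    # using factory(in_channels, hidden_channels, out_channels).
    return DeepGNN(in_channels, hidden_channels, out_channels, num_layers=4)


@register_model(task_type={"node_cls"}, factory=make_deep_gnn)
class DeepGNN:  # a real model would subclass torch.nn.Module
    def __init__(self, in_channels, hidden_channels, out_channels, num_layers):
        self.dims = (in_channels, hidden_channels, out_channels)
        self.num_layers = num_layers


model = MODEL_REGISTRY["DeepGNN"]["factory"](16, 32, 7)
```

The design point is that the benchmark only ever sees a three-argument callable, so any extra hyperparameters stay inside the factory.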
Adding a new task type (e.g. `node_reg`, `temporal`) is a four-step change:
- Append it to `TASK_TYPES` in `benchmark.py`.
- Add a training routine in `training.py` returning a per-epoch metric dict.
- Add an adapter in `models/_adapters.py` if node-level encoders should plug into the new task via the multi-task factory.
- Extend `_run_task` in `benchmark.py` with the dispatch branch.

Document the new task in the README's Task table.
Statistical tests stay in `BenchmarkReport` (`benchmark.py`). New tests should:
- Operate on the per-seed `final_metrics()` tensor, not on training loss.
- Return a structured object (DataFrame / dict) and provide a LaTeX export method.
- Use the closed-form null distributions in `scipy.stats` rather than bootstrap simulations, unless the paired-by-seed structure makes the bootstrap clearly preferable.
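As an example of the shape such a helper could take, the sketch below applies the Holm step-down correction to a dict of pairwise p-values and returns a dict keyed the same way. It is not the code in `BenchmarkReport`: `holm_adjust` is a hypothetical name, the p-values are made up for illustration, and in practice they would come from a paired test on per-seed metrics (for instance `scipy.stats.ttest_rel`).

```python
def holm_adjust(p_values: dict[str, float]) -> dict[str, float]:
    """Return Holm step-down adjusted p-values, keyed like the input."""
    # Sort comparisons from smallest to largest raw p-value.
    ordered = sorted(p_values.items(), key=lambda item: item[1])
    m = len(ordered)

    adjusted: dict[str, float] = {}
    running_max = 0.0
    for rank, (name, p) in enumerate(ordered):
        # The k-th smallest p-value (k = rank + 1) is scaled by (m - k + 1).
        candidate = min(1.0, (m - rank) * p)
        # Enforce monotonicity: an adjusted value never drops below an earlier one.
        running_max = max(running_max, candidate)
        adjusted[name] = running_max
    return adjusted


# Made-up p-values for three pairwise model comparisons (illustration only).
raw = {"GCN vs GAT": 0.01, "GCN vs SAGE": 0.04, "GAT vs SAGE": 0.03}
adjusted = holm_adjust(raw)
```

Returning a plain dict keeps the result easy to convert into a DataFrame and from there into a LaTeX table.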
- Python 3.10+; type hints on every public function.
- `ruff` is the source of truth for lint; PRs must be `ruff` clean.
- No comments that explain what the code does, only why a non-obvious choice was made.
- Docstrings on public symbols only. One-line summary, optional body, no multi-paragraph essays.
- Tests under `tests/`. Smoke tests are fine for new loaders; full coverage is required for new statistical helpers.
    PYTHONPATH=src uv run python paper/experiment.py
    latexmk -pdf paper/main.tex

If you change `BENCHMARK_TASKS` or any default in `benchmark.py`, please also
re-run `paper/experiment.py` so the cached histories
(`paper/_cache_*.pkl`), figures, and tables stay in sync with the prose.
Please include:
- Minimal reproducer (`python -c "..."` is best).
- `python --version`, `pip freeze | grep -E "torch|geometric|graphnetz"`.
- The full traceback, not just the last line.
Security-sensitive issues: open a private issue or email the maintainer
listed in `pyproject.toml` instead of opening a public issue.
Be specific, and attack the code, not the person. The maintainers reserve the right to close threads that drift away from that standard.