Conversation
Implements lembas_network(), lembas_ligands(), lembas_tfs(), lembas_annotation() and lembas_datasets() to fetch the macrophage and ligand screen datasets used in Nilsson et al. 2022 (Nat Commun). Macrophage files are pulled from Zenodo (record 10815391); ligand screen files from the Lauffenburger-Lab/LEMBAS GitHub repo. Also registers both datasets in datasets.yaml and adds the LEMBAS section to api.rst and datasets.rst. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Implements the full LEMBAS-RNN method (MML activation, steady-state convergence, uniform regularisation) alongside ridge and mean-response baselines. Adds lembas_format_network to utils, wires the new methods module, updates API and narrative docs, and adds a pytest smoke suite. Co-Authored-By: daniele-bottazzi <daniele-bottazzi@users.noreply.github.com> Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…h optional dep
evaluate_predictions gains an axis parameter ('readout' | 'condition') so
users can inspect which TFs or which experimental conditions are predicted
poorly. Adds three new tests covering both axes and bad-axis validation.
Tutorial notebook C_lembas.ipynb walks through the full macrophage pipeline:
data loading, network formatting, train/test split, mean/ridge/LEMBAS-RNN
models, and per-readout and per-condition evaluation. Registers torch as an
optional dependency installable via pip install networkcommons[torch].
Co-Authored-By: daniele-bottazzi <daniele-bottazzi@users.noreply.github.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- use uvx migrate-to-uv (translate automatically from poetry to uv)
- use uvx pyproject-fmt (reformat the pyproject)
- Relax over-constrained dependency bounds; add lower bounds to unconstrained deps
- Restructure dependency-groups into test/docs/lint/dev sub-groups
- Replace black/isort/flake8/yapf/pyupgrade with ruff; switch to Google docstring convention
- Rewrite tox.ini for uv (tox-uv, dependency_groups); rewrite CI workflows to use astral-sh/setup-uv
- Drop legacy artifacts: setup.py, environment.yml, docs/src/requirements.txt
- Consolidate docs deps into pyproject.toml; update .readthedocs.yaml to use uv sync
- Fix _metadata.py to read from [project] instead of [tool.poetry]
- Remove stale poetry references from notebook and installation docs

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Switch build backend from uv_build to hatchling
- Fix _metadata.py: replace deprecated toml with tomllib/tomli (stdlib 3.11+, backport for 3.10)
- Upgrade corneto 1.0.0a0 → >=1.0.0b7 (drops numpy<2 cap)
- Upgrade omnipath >=1.0.8 → >=1.0.12 (fixes np.NAN removed in numpy 2)
- Remove numpy<2 upper bound (no longer needed)
- Pin pypath-omnipath to saezlab/pypath git master: fixes module-level RaMP API call crashing json.loads when the server is unreachable (issue #318)
- Add pypath-omnipath[curl] extra to bring pycurl back (now optional in pypath)
- Restructure pixi environments: add feature-level pypi-dependencies with extras so each environment activates the right optional deps; dev env now includes igraph, torch, corneto-backends and pygraphviz

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
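The tomllib/tomli switch and the move from [tool.poetry] to [project] can be sketched as follows. This is a minimal illustration, not the contents of _metadata.py: the helper name read_project_table and the inline TOML snippet (including its placeholder version) are assumptions.

```python
try:
    import tomllib  # standard library on Python >= 3.11
except ModuleNotFoundError:
    import tomli as tomllib  # backport with the same API, needed on Python 3.10

# Stand-in for the contents of pyproject.toml; the version is a placeholder.
PYPROJECT_SNIPPET = """
[project]
name = "networkcommons"
version = "0.0.0"
"""


def read_project_table(text):
    """Hypothetical helper: return the PEP 621 [project] table as a dict."""
    return tomllib.loads(text)["project"]


meta = read_project_table(PYPROJECT_SNIPPET)
```

When reading from disk, tomllib.load expects a file opened in binary mode ("rb").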
Corneto 1.0.0b7:
- Use cn.Graph (public API) instead of cn._graph.Graph in utils.py, _network.py and test_utils.py; beta moved the class to a new module
- Workaround corneto internal isinstance mismatch: runVanillaCarnival imports from corneto._graph while our graphs are corneto.graph._graph; pass SIF tuples instead so it builds its own graph internally
- Accept corneto._graph.BaseGraph in to_networkx() so graphs returned by runVanillaCarnival (still old-style) are correctly converted to networkx
- Add type: ignore[attr-defined] on cn.methods calls (Pylance false positive; cn.methods is present at runtime in corneto beta)
- Add TYPE_CHECKING imports in networkcommons/__init__.py so Pylance resolves networkcommons.eval (previously only set via dynamic globals())

Decoupler 2.x:
- dc.run_wmean -> dc.mt.waggr, dc.run_ulm -> dc.mt.ulm
- dc.get_ora_df -> dc.mt.query_set; update run_ora default metric from ora_Combined score to ora_stat and update test expectations
- Fix recursive loop in run_moon_core to also use dc.mt.* calls
- Add _moon_score_layer() fallback: decoupler 2.x raises ValueError on 1-sample matrices because FDR correction fails on NaN t-statistics; fall back to a simple weighted mean (MOON only uses estimates, not pvals)
- norm_wmean now equals wmean since waggr has no permutation normalization; remove test assertion that they differ

_perturbation.py refactor:
- Remove _import_torch() and the torch-as-parameter anti-pattern
- Module-level try/except import: torch = None on ImportError
- Remove torch param from _torch_dtype, _torch_device, _mml_activation, _make_lembas_model; use module-level torch directly
- Guard run_lembas_rnn entry point with explicit ImportError when torch=None

Notebook:
- docs: dc.get_resource -> dc.op.resource in evaluation vignette

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
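The optional-import pattern from the _perturbation.py refactor can be sketched like this. The function name run_model_sketch and the error message wording are hypothetical; only the module-level try/except and the entry-point guard follow the description above.

```python
try:
    import torch  # optional dependency
except ImportError:
    torch = None  # helpers can refer to the module-level name directly


def run_model_sketch(n_nodes):
    """Hypothetical entry point that needs torch at call time."""
    if torch is None:
        # Fail early with an actionable message rather than an AttributeError later.
        raise ImportError(
            "This method requires PyTorch; install it with "
            "`pip install networkcommons[torch]`."
        )
    return torch.zeros(n_nodes)
```

Importing the module then succeeds without torch installed, and only the entry point that needs it raises.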
- Tighten Python version range to >=3.10,<3.13
- Document all optional extras (corneto-backends, igraph, torch)
- Add GPU/CUDA section: requirements-local.txt pattern for per-machine CUDA wheel selection, installed via pixi's bundled uv or standalone uv
- Add Pixi section with dev environment setup instructions

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Switch from build.jobs.install shell override to the proper python.install with method: uv, which RTD understands natively. Also bump Python to 3.12. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add rdata>=0.10 as a core dependency (needed by get_hmdb_mapper)
- Add pertpy and torch-cu128 optional extras; register pytorch-cu128 index in uv so the CUDA wheel resolves automatically
- Add flop pixi feature/environment (R + Bioconductor + Nextflow stack for the FLOP pipeline)
- Refactor pixi environments to inherit a base feature; add dev-cu128 env
- Update installation.rst to document the new torch-cu128 extra and the dedicated dev-cu128 pixi environment for GPU users
- Set nbsphinx_execute = 'never' in conf.py so notebooks are never re-executed during the ReadTheDocs build
- Add flop_repo/ to .gitignore

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…isation

Notebooks were re-executed end-to-end to validate the modernised environment. Several regressions were caught and fixed:
- eval/_metrics.py: decoupler 2.x renamed the column 'Term' → 'source' in ORA results; add rename so downstream code stays compatible
- methods/_causal.py: catch and log the exception type + message when CORNETO finds no solution, making silent failures diagnosable
- visual/_network_stats.py: widen filepath type to str | None in plot_scatter and create_heatmap (was causing type errors)
- data/omics/_lembas.py: lembas_ligands / lembas_tfs now set the first column as the DataFrame index (named 'condition') so callers don't need an extra reset_index step

Updated notebook outputs reflect the fixed behaviour.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Downloads HMDB_mapper_vec.RData from the cosmosR GitHub repository, parses it with the rdata library, and returns a dict mapping HMDB IDs (e.g. 'HMDB0000122') to human-readable metabolite names. Result is cached as a pickle in the configured pickle_dir; pass update=True to force a fresh download. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
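The caching behaviour described here (reuse a pickle unless update=True) can be sketched generically. The helper cached_mapping, the fake_fetch stand-in for the download and parsing step, and the placeholder metabolite name are illustrative assumptions, not code from get_hmdb_mapper.

```python
import pickle
import tempfile
from pathlib import Path


def cached_mapping(cache_file, fetch, update=False):
    """Hypothetical helper: load a pickled dict, or rebuild it via fetch()."""
    cache_file = Path(cache_file)
    if cache_file.exists() and not update:
        with cache_file.open("rb") as fh:
            return pickle.load(fh)
    mapping = fetch()  # e.g. download and parse the upstream file
    cache_file.parent.mkdir(parents=True, exist_ok=True)
    with cache_file.open("wb") as fh:
        pickle.dump(mapping, fh)
    return mapping


calls = []


def fake_fetch():
    """Stand-in for the real download; records how often it is invoked."""
    calls.append(1)
    return {"HMDB0000122": "example metabolite name"}


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "hmdb_mapper.pickle"
    first = cached_mapping(path, fake_fetch)                # fetches and writes the cache
    second = cached_mapping(path, fake_fetch)               # served from the pickle
    third = cached_mapping(path, fake_fetch, update=True)   # forces a fresh fetch
```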
The previous implementation approximated the LEMBAS architecture from Nilsson et al. (Nat Commun 2022). This commit aligns it with the original R/MATLAB bionetwork codebase:

Weight initialisation
- Edges: 0.1 + 0.1×rand, negated for inhibitory signs (bionet.initializeWeights)
- Bias: 1e-3 everywhere; nodes that receive only inhibitory edges get bias=1
- Input scale: fixed buffer (inputAmplitude), not a learnable parameter
- Output projection: per-output scalar init to projection_amplitude (no bias)

Training loop
- Cosine one-cycle LR schedule peaking at lr_peak (bionetwork.oneCycle)
- Mini-batch training (default batch_size=5) with per-batch weight noise (1e-8)
- Per-batch input noise: drive += noiseLevel × curLr × randn
- Adam with lr=1.0; actual LR injected each epoch; momentum reset every 200 epochs
- Weight pre-scaling to spectral radius 0.8 before training (bionet.preScaleWeights)

Regularisation
- Spectral radius loss: soft exponential penalty with differentiable power iteration (bionetwork.spectralLoss)
- Uniform state distribution: mean/var/min/max loss matching bionetwork.uniformLossBatch (replaces old sorted-distribution loss)
- Sign regularisation unchanged; ligand bias penalty added (1e-3)
- L2 + inverse barrier on edge weights to prevent collapse to zero

Defaults updated: epochs=5000, tolerance=1e-6, dtype=float64, uniform_penalty=1e-5, batch_size=5, projection_amplitude=1.2

C_lembas.ipynb: add section 7 (LOOCV) following the evaluation protocol from the original paper; re-executed with fresh outputs.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
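Two of the items above, the signed edge initialisation and the cosine one-cycle schedule, can be sketched in plain Python. The edge rule (0.1 + 0.1×rand, negated for inhibitory edges) follows the commit message; the schedule shape and its parameters lr_start, lr_end and peak_at are assumptions and may differ from bionetwork.oneCycle.

```python
import math
import random


def init_edge_weights(signs, seed=0):
    """Signed init: magnitude 0.1 + 0.1*rand, negative for inhibitory edges.

    signs: +1 for activating edges, -1 for inhibitory edges.
    """
    rng = random.Random(seed)
    return [sign * (0.1 + 0.1 * rng.random()) for sign in signs]


def one_cycle_lr(epoch, epochs, lr_peak, lr_start=1e-5, lr_end=1e-6, peak_at=0.3):
    """Assumed schedule: cosine ramp up to lr_peak, then cosine decay to lr_end."""
    peak_epoch = max(1, int(peak_at * epochs))
    if epoch < peak_epoch:
        frac = epoch / peak_epoch
        return lr_start + (lr_peak - lr_start) * 0.5 * (1 - math.cos(math.pi * frac))
    frac = (epoch - peak_epoch) / max(1, epochs - peak_epoch)
    return lr_end + (lr_peak - lr_end) * 0.5 * (1 + math.cos(math.pi * frac))


weights = init_edge_weights([+1, -1, +1, -1])
schedule = [one_cycle_lr(e, epochs=100, lr_peak=2e-3) for e in range(100)]
```

With Adam constructed at lr=1.0, a per-epoch value like schedule[e] is what would be injected as the effective learning rate.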
- test_eval_graph: update test_run_ora to expect 'ora_Term' column (our backward-compat rename from decoupler 2.x 'source' → 'Term')
- test_utils: move pygraphviz import inside the two tests that need it using pytest.importorskip so the rest of the module runs without it
- test_utils: replace fragile try/except + exact dtype string match in test_handle_missing_values_more_than_one_non_numeric_column with pytest.raises + partial match (pandas dtype repr changed across versions)
- uv.lock: regenerated to encode the torch vs torch-cu128 extras conflict

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
--locked re-runs the full resolution and fails if the result differs from the committed lock file, which happens when the lock was generated on a different platform (e.g. Linux) and CI runs on another (macOS). --frozen installs exactly the versions in the lock file without re-resolving, which is the correct behaviour for reproducible CI. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
On the long term this should use the utils API by omnipath-client and rely on higher level objects
Investigated this — omnipath-client v0.2.3 is now added as a dependency. However, oc.utils.translate('hmdb', 'traditional_iupac') currently returns HTTP 500 on utils.omnipathdb.org: the utils service only supports cross-database ID mapping (e.g. hmdb → chebi works), not name resolution. Names are served by the separate metabo.omnipathdb.org service via entities/resolve, which has no public wrapper yet in the current version.
For now get_hmdb_mapper keeps the rdata approach with a note in the docstring pointing to the intended migration. A working workaround using OmniPath()._fetch('entities/resolve') is preserved on branch hmdb-mapper-omnipath-workaround for reference, and should be promoted once oc.utils.translate supports metabolite name mapping server-side.
Beyond get_hmdb_mapper, we identified three other places in the codebase that could benefit from switching to omnipath-client long-term:
- noi/_node.py: currently calls pypath.utils.mapping.map_name() and pypath.utils.orthology.translate() directly; oc.utils.map_name() / oc.utils.orthology_translate() are the intended replacements and would reduce the hard dependency on pypath-omnipath[curl]
- data/omics/_common.py: uses biomart for Ensembl → HGNC symbol mapping; oc.utils.translate() covers this across 97 ID types
- data/network/_omnipath.py: uses the older omnipath client; omnipath-client is the intended successor
…hmdb_mapper
Adds omnipath-client>=0.2.3 as a dependency for future use (node ID
translation, Ensembl mappings, COSMOS PKN via oc.cosmos once available).
get_hmdb_mapper keeps the existing rdata approach for now: the intended
migration to oc.utils.translate('hmdb', 'traditional_iupac') is blocked
by a server-side 500 on utils.omnipathdb.org. A working workaround using
OmniPath()._fetch('entities/resolve') is preserved on branch
hmdb-mapper-omnipath-workaround for when the API matures.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Summary
This PR is the outcome of the Algorithms & Benchmarks session started at the Saez Lab
retreat in Paris (June 17th 2026), which explored how network-based and biologically
informed ML methods can be applied, evaluated, and benchmarked on perturbational
datasets within NetworkCommons. The contribution focuses on the LEMBAS ligand-perturbation dataset and method as a concrete end-to-end integration point.
Core contributions
- Dataset integration (data/omics/_lembas.py): lembas_ligands() and lembas_tfs() now return DataFrames with a proper condition index, making them directly usable in perturbation workflows without manual reshaping.
- Faithful LEMBAS-RNN reimplementation (methods/_perturbation.py): the existing prototype is upgraded to closely match the architecture of Nilsson et al. 2022 (Nat Commun). This implementation is LLM-assisted; a careful human review of the code against the original repositories is recommended before relying on it in production. The reference codebases are:
- LEMBAS vignette extended (docs/src/vignettes/C_lembas.ipynb): adds a Leave-One-Out Cross-Validation section (section 7) following the evaluation protocol of the original paper, alongside a mean-response and ridge baseline for comparison.
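For orientation, leave-one-out cross-validation with a mean-response baseline can be sketched in a few lines of plain Python. The function name, the toy data, and the use of mean squared error as the score are assumptions; the notebook's actual metric and data layout may differ.

```python
def loocv_mean_baseline(y):
    """Hypothetical LOOCV: predict each held-out condition by the training mean.

    y: list of rows with shape conditions x readouts.
    Returns one mean squared error per held-out condition.
    """
    errors = []
    for held_out in range(len(y)):
        # Train on every condition except the held-out one.
        train = [row for i, row in enumerate(y) if i != held_out]
        # Mean-response baseline: per-readout mean over the training conditions.
        mean_response = [sum(col) / len(train) for col in zip(*train)]
        test = y[held_out]
        mse = sum((t - m) ** 2 for t, m in zip(test, mean_response)) / len(test)
        errors.append(mse)
    return errors


# Toy TF activities: 3 conditions x 2 readouts (made-up values).
tf_activity = [[0.0, 1.0], [1.0, 1.0], [2.0, 1.0]]
errors = loocv_mean_baseline(tf_activity)  # [1.125, 0.0, 1.125]
```

A trained model would replace mean_response with its own prediction for the held-out condition, which makes the baseline a simple reference point for comparison.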
Environment modernisation and re-validation
To run the updated notebook, the environment was modernised. The project was migrated
from Poetry to uv using uvx migrate-to-uv, and additional pixi features were added
to cover dependencies that still required conda-forge or bioconda packages (e.g. R,
Bioconductor, Nextflow for the flop environment). Other additions: rdata>=0.10 core dep;
pertpy and torch-cu128 (PyTorch CUDA 12.8 index wired up) optional extras; dedicated
dev-cu128 GPU environment; nbsphinx_execute = 'never' to prevent notebook re-execution
on ReadTheDocs.
As a consequence, all vignette notebooks were re-run end-to-end to verify nothing
broke. Several issues were caught and fixed in the process:
- eval/_metrics.py: decoupler 2.x renamed the ORA result column Term → source
- methods/_causal.py: CORNETO now surfaces the exception type and message when no solution is found, replacing a silent failure
- visual/_network_stats.py: filepath type widened to str | None

Also added: get_hmdb_mapper in data/network/_moon.py to download and cache the HMDB ID → metabolite name mapping from cosmosR (needed to interpret MOON/COSMOS results on the LEMBAS network).
Test plan
- C_lembas.ipynb runs with the updated LEMBAS-RNN and LOOCV section
- uv run pytest passes except for the pygraphviz-dependent tests in test_utils and test_vis_networkx, which are skipped or fail because pygraphviz is not in the base uv test env (pre-existing on main, not introduced here)
- pixi run -e dev pytest tests/test_utils.py passes fully (25/25); pygraphviz is available in the dev pixi env via the graphviz conda-forge package
- methods/_perturbation.py against the reference LEMBAS codebases

🤖 Generated with Claude Code