[DRAFT] Evo 2 SAE feature explorer — visualization mockup by polinabinder1 · Pull Request #1582 · NVIDIA-BioNeMo/bionemo-framework

polinabinder1 · 2026-05-26T23:51:04Z

How to use

Prerequisites — install Node.js + npm

Check if you already have it:

node --version && npm --version

If either prints "command not found", install via one of:

Ubuntu / Debian / dev containers (most Lepton-style pods):

curl -fsSL https://deb.nodesource.com/setup_20.x | sudo bash -
sudo apt-get install -y nodejs        # ships npm alongside

macOS (Homebrew):

brew install node

Any OS, version-manager route (nvm — recommended if you juggle Node versions):

curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.39.7/install.sh | bash
source ~/.bashrc       # or restart your shell
nvm install --lts

You need Node 18+; this project is tested on Node 20.

Run the dashboard

cd bionemo-recipes/interpretability/sparse_autoencoders/recipes/evo2/evo2_dashboard_mockup
npm install            # one-time, ~30 s
npm run dev            # starts Vite on http://localhost:5176

Then open http://localhost:5176/#preview — the #preview hash route surfaces the tabbed mockup; the bare / URL still renders the unchanged main dashboard.

⚠️ Mockup — synthetic data, not a real SAE result

This PR ships a demo-only visualization shell at bionemo-recipes/interpretability/sparse_autoencoders/recipes/evo2/evo2_dashboard_mockup/. There is no real SAE inference involved. Everything you see in the dashboard is generated by scripts/make_mockup_features.py from a fixed seed.

A yellow MOCKUP — synthetic data, not from a real SAE run banner is rendered at the top of every page so nobody mistakes it for real model output.

Summary

This branch contains a multi-tab visualization mockup for the evo2 SAE feature explorer. The latest commit adds:

Steering mode toggle (Position-restricted / Global all positions). In Global mode the per-position bar chart smears toward a low-confidence distribution scaled by |clamp|, and the "FLIPPED" badge becomes "no clean flip — degraded".
SequenceStrip that renders only in Global mode: baseline vs steered argmax across the entire seed sequence, with flipped positions highlighted on a red background. Makes the "global clamp degrades everywhere" point visible at a glance, without needing per-position simulation data.
Scrubbed external author / paper / lab names from UI copy, the MOCKUP banner, the tab label, and stale code comments.
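The PR text does not give the formula behind the Global-mode "smear". One plausible reading, sketched here in Python rather than the dashboard's JSX, is a linear blend toward the uniform A/C/G/T distribution with a weight proportional to |clamp|. The helper names and the `max_clamp` bound are hypothetical and are not taken from the mockup source.

```python
from typing import List, Sequence

BASES = "ACGT"


def smear_toward_uniform(probs: Sequence[float], clamp: float,
                         max_clamp: float = 10.0) -> List[float]:
    """Blend a per-position A/C/G/T distribution toward uniform.

    The blend weight grows linearly with |clamp| and saturates at 1.
    max_clamp is a hypothetical slider bound, not taken from the PR.
    """
    if len(probs) != len(BASES):
        raise ValueError("expected one probability per base")
    weight = min(abs(clamp) / max_clamp, 1.0)
    uniform = 1.0 / len(BASES)
    return [(1.0 - weight) * p + weight * uniform for p in probs]


def argmax_base(probs: Sequence[float]) -> str:
    """Return the base with the highest probability."""
    return BASES[max(range(len(BASES)), key=lambda i: probs[i])]


baseline = [0.85, 0.05, 0.05, 0.05]
steered = smear_toward_uniform(baseline, clamp=-5.0)  # half-way to uniform
```

Under this particular blend the distribution flattens but the ranking of the bases is preserved, which matches the "no clean flip" wording. The flipped positions shown in SequenceStrip would then have to come from additional mock logic that this sketch does not cover.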

Earlier commits ship the SAE summary table, feature catalog, UMAP atlas, gene-feature G.bin matrix, and the first version of the steering tab.

Test plan

npm run dev starts cleanly, no console errors at #preview
Main tab still renders the feature catalog + atlas + WebLogos
Toggle Position-restricted ↔ Global; verify the Neighbors control disables in Global mode
Drag clamp slider: bar chart smears progressively in Global mode; SequenceStrip flip-fraction increases with |clamp|
Switch seeds and features; SequenceStrip is deterministic per (seed, feature, clamp)
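The last two checks ask for a strip that is deterministic per (seed, feature, clamp) and whose flip fraction grows with |clamp|. A minimal Python sketch of one way to get both properties from mock data follows. It assumes a hash-seeded random generator and a hypothetical `max_clamp` slider bound; the real SequenceStrip is a JSX component and may work differently.

```python
import hashlib
import random
from typing import List


def flipped_positions(seed_name: str, feature: str, clamp: float,
                      length: int, max_clamp: float = 10.0) -> List[int]:
    """Return the positions a mock strip would mark as flipped."""
    # Derive the RNG seed from (seed_name, feature) only, so the same pair
    # always produces the same per-position thresholds.
    key = f"{seed_name}|{feature}".encode("utf-8")
    rng = random.Random(int.from_bytes(hashlib.sha256(key).digest()[:8], "big"))
    # One fixed threshold per position: raising |clamp| can only add flips.
    thresholds = [rng.random() for _ in range(length)]
    flip_fraction = min(abs(clamp) / max_clamp, 1.0)
    return [i for i, t in enumerate(thresholds) if t < flip_fraction]


low = flipped_positions("16S", "kanamycin_resistance", clamp=2.0, length=200)
high = flipped_positions("16S", "kanamycin_resistance", clamp=8.0, length=200)
```

Because the thresholds depend only on the (seed, feature) pair, every position flipped at a small |clamp| stays flipped at a larger one, so dragging the slider never makes the strip flicker.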

🤖 Generated with Claude Code

torch 2.6 changed the default of `weights_only` in `torch.load` to True. The Savanna checkpoint pickle includes numpy globals (`numpy.core.multiarray._reconstruct`), which the safer loader rejects. The converter then exits 0 with no output written and the error is buried in stderr, so the failure is silent. The Savanna repos under arcinstitute/* are trusted sources, so load with `weights_only=False`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Mirrors the existing esm2 / codonfm SAE recipes.

Pipeline: chunk -> convert (Savanna->MBridge) -> predict_evo2 -> pt_to_parquet -> train

Differences from esm2/codonfm are forced by Evo2 specifics:
- Hyena/Megatron-Core model, no HF AutoModel path => reuses the existing `predict_evo2` CLI for inference instead of writing a custom extract.py
- `pt_to_parquet.py` shim bridges predict_evo2's .pt output to the universal `sae.activation_store` parquet contract
- `chunk_fasta.py` preprocessor keeps inputs within the model's trained context length (8192 bp for 1B); Hyena fftconv OOMs on long sequences even at micro-batch=1
- `train.py` is the same as codonfm's, copied verbatim per bionemo-recipes' KISS-over-DRY convention

Validated end-to-end on 100 organelle sequences (Evo2 1B layer 12): loss 0.67 -> 0.045, FVU 0.90 -> 0.10, var_exp 0.10 -> 0.90, 2m14s wall.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
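The chunking step is described but `chunk_fasta.py` itself is not shown on this page. As a rough illustration only, the sketch below splits one sequence into non-overlapping windows of at most 8192 bp. The function name, the chunk naming scheme, and the choice of non-overlapping windows are all assumptions, not the script's actual behavior.

```python
from typing import Iterator, Tuple

MAX_CONTEXT_BP = 8192  # trained context length quoted above for Evo2 1B


def chunk_sequence(name: str, seq: str,
                   max_len: int = MAX_CONTEXT_BP) -> Iterator[Tuple[str, str]]:
    """Yield (chunk_name, chunk_seq) pairs, each at most max_len bases long."""
    if max_len <= 0:
        raise ValueError("max_len must be positive")
    for start in range(0, len(seq), max_len):
        end = min(start + max_len, len(seq))
        # Hypothetical naming scheme: 0-based, half-open coordinates.
        yield f"{name}:{start}-{end}", seq[start:end]


toy_sequence = "ACGT" * 5000  # 20,000 bp toy input
chunks = list(chunk_sequence("chrM", toy_sequence))
```

Keeping the coordinates in the chunk name makes it possible to map per-chunk activations back to positions in the original record later in the pipeline.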

The recipe currently has no model-specific Python module: the extractor is upstream (`predict_evo2`) and the two scripts are simple CLIs in scripts/. Drop the empty package and adjust pyproject.toml so setuptools doesn't try to discover anything. Will reintroduce when there's actual library code to put there (eval, dashboard, dataloaders).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Fork of recipes/codonfm/codon_dashboard adapted for DNA + Evo 2, populated with synthetic data. Demo-able artifact, not a real result.

What's here:
- scripts/make_mockup_features.py: deterministic synthetic data generator (seed 42). Writes features_atlas.parquet, feature_metadata.parquet, feature_examples.parquet to evo2_dashboard_mockup/public/. Fixtures are committed for one-step npm-only setup.
- evo2_dashboard_mockup/: Vite/React SPA forked from codon_dashboard with these swaps:
  * Removed molstar dep + MolstarThumbnail.jsx
  * Renamed ProteinSequence.jsx -> SequenceView.jsx; per-base rendering (no codon framing, no AA translation)
  * Renamed ProteinDetailModal.jsx -> RegionDetailModal.jsx; UniProt content swapped for genomic-region content
  * utils.js: getRegionLabel + parseBases (replacing getAccession/uniprotUrl/parseCodons/codonToAA)
  * MOCKUP banner at top of App
  * "Evo 2 SAE Feature Explorer (Mockup)" title
- v2 roadmap placeholders (greyed em-dashes with hover tooltips):
  * FeatureCard: Annotation, Sensitivity, Recon Δ stats
  * FeatureDetailPage: Annotations, Conservation sections

Quick start: cd evo2_dashboard_mockup && npm install && npm run dev

The synthetic data schema is the contract the future real eval pipeline will need to target.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…ed features

Three changes on top of the initial mockup commit:

1. Drop codonfm-specific scaffolding from forked components.
   - .gitignore the auto-generated package-lock.json (regenerates on `npm install`)
   - FeatureCard.jsx: 793 -> 508 lines. Removed dead stat tiles (Hi-Score, Variant/Site/Local deltas, ClinVar, PhyloP, GC, Trinuc/Gene entropy), codonfm vocab-logits chart, codonfm GSEA tags, codonfm CSV export sections, all conditional on fields our synthetic data doesn't provide.
   - FeatureDetailPage.jsx: 522 -> 187 lines. Replaced codonfm-specific VocabLogitChart / CodonAnnotations / FeatureMetrics components with a simpler DNA-friendly detail view.
2. Refine the synthetic feature set.
   - 11 labeled DNA-native features in 3 thematic UMAP clusters:
     * eukaryotic regulatory (TATA box, polyA signal, CpG island, splice donor, splice acceptor)
     * bacterial regulatory (-10 box, -35 box, Shine-Dalgarno)
     * codon context (start ATG, stop TAA, stop TAG)
   - 9 unlabeled features in a 4th diffuse cluster (label=NULL, db_source=NULL), mimicking the realistic case where most SAE features are uninterpreted.
   - New `db_source` column on each feature (RefSeq / JASPAR-ENCODE / bacterial annotation / RefSeq UTR / ENCODE-RefSeq / NULL).
3. Bug fixes for cross-pod port-forward demo:
   - App.jsx defaults: `selectedCategory` and `histMetric3` were hardcoded to codonfm's `mean_variant_1bcdwt` column, which doesn't exist in our atlas and threw Binder errors. Switched to `cluster_id`.
   - Atlas column rename: `cluster` -> `cluster_id` to match what App.jsx queries.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

copy-pr-bot · 2026-05-26T23:51:07Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

coderabbitai · 2026-05-26T23:51:11Z

Important

Review skipped

Auto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.


Each of the 11 labeled features now ships with a PWM-driven sequence logo rendered into public/logos/feature_{id}.png by logomaker. Central signatures are spec'd per label (Kozak ATG, TATA, polyA, CpG, Shine-Dalgarno, bacterial -10/-35, splice donor/acceptor, stop TAA/TAG); flanks are uniform 0-bit so the logos read as clean motif summaries rather than noisy speckle. Unlabeled features get no logo; their cards skip the section entirely.

make_mockup_features.py grows _build_pwm() and _render_logo(); the metadata/atlas parquets carry a logo_path column; App.jsx detects it optionally and excludes it from category detection; FeatureCard's expanded view and FeatureDetailPage display the logo above the top-activating-sequences list.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
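The "uniform 0-bit" flanks follow from the usual information-content formula for a DNA logo column, log2(4) minus the column's Shannon entropy: a uniform column has entropy 2 bits and therefore height 0. The sketch below is a standalone illustration of that arithmetic. It is not the script's `_build_pwm()`, and the core probabilities are made up.

```python
import math
from typing import Dict, List

BASES = "ACGT"
Column = Dict[str, float]


def column_bits(column: Column) -> float:
    """Information content of one PWM column in bits (at most 2 for DNA)."""
    entropy = -sum(p * math.log2(p) for p in column.values() if p > 0)
    return math.log2(len(BASES)) - entropy


def pad_with_uniform_flanks(core: List[Column], flank: int) -> List[Column]:
    """Surround a core motif with uniform columns on both sides."""
    uniform = {base: 0.25 for base in BASES}
    left = [dict(uniform) for _ in range(flank)]
    right = [dict(uniform) for _ in range(flank)]
    return left + core + right


# Toy core loosely shaped like a start codon; the probabilities are invented.
atg_core = [
    {"A": 0.97, "C": 0.01, "G": 0.01, "T": 0.01},
    {"A": 0.01, "C": 0.01, "G": 0.01, "T": 0.97},
    {"A": 0.01, "C": 0.01, "G": 0.97, "T": 0.01},
]
pwm = pad_with_uniform_flanks(atg_core, flank=3)
bits = [column_bits(col) for col in pwm]
```

In a logo drawn from `bits`, the three flank columns on each side have zero height and only the core stands out, which is the "clean motif summary" effect the commit describes.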

Two new visualizations for the SAE interpretability dashboard, plus the offline pipeline that produces the gene-UMAP precompute bundle.

scripts/generate_fake_genes.py
  500-row genes.tsv stand-in (gene_symbol, species, sequence) until a real curated catalog lands. Realistic-ish distributions across 7 species.

scripts/gene_umap_precompute.py
  End-to-end offline pipeline: genes.tsv -> Evo2 1B layer-20 -> TopK SAE encode -> mean per gene -> UMAP (cosine) -> HDBSCAN clusters -> per-feature firing stats. Writes G.npz, genes_umap.parquet, feature_stats.parquet, manifest.json. Reuses predict_evo2 via torchrun subprocess; aggregates .pt files by seq_idx + pad_mask. Idempotent (skips predict if .pt files exist).

src/ColoredSequence.jsx
  React component: paste a DNA sequence -> each base background-colored by its top-firing SAE feature, opacity scaled by activation strength. Two modes: top-feature (default), single-feature lookup. Builds mock activations internally when no `analysis` prop is supplied so the component works standalone before the /analyze backend is wired. Tableau-10 colorblind palette, hover tooltip with top-5 features, legend sorted by per-color position count.

src/GeneUMAPView.jsx
  Renders the 500-gene UMAP via canvas. Loads G.bin (raw float32), genes_meta.json, feature_stats.json from public/gene_umap/. Click a feature in the sidebar -> instant recolor by activation strength (no recompute). Click Reorganize -> re-runs UMAP client-side with feature-weighted vectors (umap-js, ~2-5s at N=500), animates the transition with ease-in-out cubic. Hover shows gene metadata + top 5 firing features.

src/Preview.jsx + src/index.jsx
  Tabbed entry at /#preview: "Main" (the existing dashboard, untouched), "ColoredSequence", "Gene UMAP". Hash-gated so / still goes to the unchanged production layout. The ColoredSequence tab includes a paste textarea so users can drop their own sequences in.

public/gene_umap/
  Precomputed bundle for the GeneUMAPView (G.bin 30 MB, plus small JSON metadata + per-feature stats filtered to n_firing >= 10).

Dep change: umap-js for client-side reorganize.

Generated genes are synthetic; replace fake_genes.tsv with a real curated 500-gene list and re-run the precompute when one is available.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Force-added past the *.bin gitignore so coworkers can pull and run the dashboard end-to-end without re-running the GPU precompute. Without this file GeneUMAPView fails to load.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Three-column comparison view for steering a chosen SAE feature at a masked position. All synthetic: 14 hand-rolled (seed, feature) pairs in public/steering_examples.json, including 6 deliberately marked as null results so the demo shows honestly that not every steering attempt works.

- Instant-apply controls (no cosmetic Run button)
- A/C/G/T probability bars (DNA tokenization, matches Evo2)
- Sticky diff summary above columns with effect-size badge
- 16S × kanamycin_resistance pair illustrates the A1408G mutation
- Disabled feature options for pairs without data; graceful fallback message when an unsupported combination is selected
- 4th tab in Preview.jsx, reuses existing tab pattern (no router added)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Pares the preview tabs down to the two that matter for now:

  /#preview tab 1: Main (existing feature catalog + atlas + WebLogos)
  /#preview tab 2: Steering explorer (slider + per-position P(ACGT) heatmap)

Removes the ColoredSequence, Gene UMAP, and SAE Summary tabs along with their data, scripts, and components. The full 5-tab version is preserved on the evo2-sae-dashboard-full-mockup branch if we want to revive any of those views later.

Removed:
- src/ColoredSequence.jsx, src/GeneUMAPView.jsx, src/SteeringComparison.jsx, src/SAESummary.jsx
- public/gene_umap/ (G.bin, genes_meta.json, feature_stats.json)
- public/steering_examples.json (replaced by steering_data.json)
- public/sae_qc_summary.json
- scripts/gene_umap_precompute.py, scripts/generate_fake_genes.py

Kept / added:
- src/SteeringExplorer.jsx (slider + per-position heatmap)
- public/steering_data.json (14 pairs × 200 positions × 4 clamps mock)
- scripts/generate_steering_data.py (regenerates the JSON)
- src/Preview.jsx trimmed to 2 tabs, no more state-heavy local logic

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Final UX pass on the SteeringDemo:
- Feature catalog trimmed to the two AMR features (kanamycin_resistance, streptomycin_resistance); non-AMR features removed since this demo specifically reproduces the Hutchinson 2025 A1408G headline
- "Feature to steer" is a dropdown picking the primary feature
- "Also clamp" checkboxes let users co-clamp the other AMR feature alongside the primary; clamp slider applies to all selected
- Neighbors-clamped buttons extended from 0/1/2 to 0/1/2/3/4
- Selectivity table + narrative callout removed earlier in the same iteration; just the dropdown + co-clamp + bar comparison stays

JSON updated: comparisons now cover all 4 seeds × 2 AMR features (8 pairs total). Non-AMR seeds (promoter / brca1_exon / random) show null-result distributions, demonstrating that AMR features don't shift predictions where they have no biological purchase.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

- add Steering mode toggle (Position-restricted / Global all positions)
- global mode smears the per-position bar chart toward a low-confidence distribution scaled by |clamp|, swaps the FLIPPED badge for "no clean flip - degraded"
- add SequenceStrip showing baseline vs steered argmax across the whole seed sequence with flipped positions highlighted; only renders in global mode
- remove author names / paper titles / external model labels from UI copy, banner, tab label, and code comments

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

polinabinder1 and others added 6 commits May 21, 2026 00:42

Merge branch 'main' into evo2-sae-recipe

2760eed

polinabinder1 and others added 7 commits May 27, 2026 18:02

polinabinder1 wants to merge 13 commits into NVIDIA-BioNeMo:main from polinabinder1:evo2-sae-dashboard
