Michael revisions#292
Merged
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
cmatKhan added a commit that referenced this pull request Jun 12, 2026
* basically done with requests. need to check content
* this works, but is slow on shinyapps
* deployed site working with ~30 second loads at worst
* this is working with new promoter sets. haven't added CC new promoters yet
* adjusting where config logger lives
* all promoter sets in
* removing approx
* adding changelog and publish action

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Pull request overview
This PR is a broad set of revisions focused on improving runtime performance (materializing key DuckDB views, bounding per-query memory, adding perf logging), modernizing the UI (matrix-based correlation/comparison workflows), and streamlining packaging/deployment (new launch CLI, PyPI publish workflow, updated docs).
Changes:
- Materialize frequently-scanned VirtualDB parquet-backed views into projected in-memory DuckDB tables to reduce slow-disk latency.
- Refactor Binding/Perturbation UIs to use correlation matrices + pair distributions + gene scatter, with stable namespaced click IDs.
- Update CLI/deployment tooling (new `launch` command, JSON logging, publish workflow) and revise docs accordingly.
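The first bullet (projected materialization) can be illustrated with a minimal sketch. It uses the standard-library `sqlite3` module as a stand-in for DuckDB, because the PR's `vdb_materialize.py` is not shown on this page. The function name `materialize_projection` and all table, view, and column names are hypothetical. The idea is to copy only the columns that later queries need from a slow view into an in-memory table, once, at startup.

```python
import sqlite3


def materialize_projection(con, view, columns, table):
    """Copy the selected columns of `view` into a new table named `table`.

    Identifiers are interpolated into the SQL text, so they must come from
    trusted configuration, never from user input.
    """
    column_list = ", ".join(columns)
    con.execute(f"CREATE TABLE {table} AS SELECT {column_list} FROM {view}")
    return con.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]


# Hypothetical data: a wide source relation exposed through a view.
con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE raw_binding (sample_id INTEGER, target TEXT, score REAL, notes TEXT)"
)
con.executemany(
    "INSERT INTO raw_binding VALUES (?, ?, ?, ?)",
    [(1, "YAL001C", 0.5, "x"), (1, "YAL002W", 1.5, "y"), (2, "YAL001C", 2.5, "z")],
)
con.execute("CREATE VIEW binding_view AS SELECT * FROM raw_binding")

# Keep only the three columns the hot queries would scan; `notes` is dropped.
n_rows = materialize_projection(
    con, "binding_view", ["sample_id", "target", "score"], "binding_mat"
)
mat_columns = [row[1] for row in con.execute("PRAGMA table_info(binding_mat)")]
```

Later queries would then read `binding_mat` instead of `binding_view`. The trade-off is more RAM in exchange for not re-reading the backing files on every scan.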
Reviewed changes
Copilot reviewed 39 out of 40 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| tfbpshiny/utils/vdb_materialize.py | New startup-time projected materialization of key VDB views into RAM. |
| tfbpshiny/utils/vdb_init.py | Adds responsiveness presets/labeling and wires startup materialization; tweaks defaults/filters. |
| tfbpshiny/utils/topn_matrix.py | New UI builder for binding×perturbation top‑N matrix. |
| tfbpshiny/utils/perf.py | Adds RSS reporting and kind classification to perf records. |
| tfbpshiny/utils/correlation_matrix.py | New UI builder for upper-triangle correlation matrix. |
| tfbpshiny/modules/select_datasets/ui.py | Fixes selectize selected-value typing; modal UX tweaks (“Queue Filters”, easy_close). |
| tfbpshiny/modules/select_datasets/server/workspace.py | Adds DOI links in dataset matrix labels; fixes namespacing of matrix click buttons. |
| tfbpshiny/modules/select_datasets/server/sidebar.py | Removes export flow; improves pending-change detection and filter coercions. |
| tfbpshiny/modules/select_datasets/queries.py | Adjusts diagonal sample counting; removes full_data_query. |
| tfbpshiny/modules/select_datasets/export.py | Removes dataset-export tarball helpers. |
| tfbpshiny/modules/perturbation/ui.py | Updates UI text/tabs and measurement choices (adds -log10(p) option). |
| tfbpshiny/modules/perturbation/queries.py | Expands measurement-column mapping and adds log10(p) helpers; shifts filters to meta-based sample_id subquery. |
| `tfbpshiny/modules/perturbation/__init__.py` | Lazy imports via `__getattr__` to reduce import-time overhead. |
| tfbpshiny/modules/home/ui.py | Makes home cards clickable navigation links; adds target map. |
| tfbpshiny/modules/comparison/ui.py | Restructures comparison UI, adds responsiveness presets selector, renames tabs. |
| tfbpshiny/modules/comparison/queries.py | Adds chunking by regulators, meta-based filtering, target intersection-before-ranking, more promoter/method variants. |
| tfbpshiny/modules/binding/ui.py | Updates measurement options and tab layout; defaults to Spearman + -log10(p). |
| tfbpshiny/modules/binding/server/workspace.py | Major workspace refactor: matrix selection + committed selection gating, progressive rendering, perf instrumentation, log10(p) transforms. |
| tfbpshiny/modules/binding/queries.py | Expands measurement-column mapping and adds log10(p) helpers; shifts filters to meta-based sample_id subquery; bounds memory by per-pair execution. |
| `tfbpshiny/modules/binding/__init__.py` | Lazy imports via `__getattr__`. |
| tfbpshiny/configure_logger.py | New JSON logger formatter + logger configuration helper. |
| tfbpshiny/components.py | Reworks matrix cell buttons to plain <button> + Shiny.setInputValue; removes export button component; row label accepts Tags. |
| tfbpshiny/brentlab_yeast_collection.yaml | Adds new promoter region sets and new binding dataset variants. |
| tfbpshiny/app.py | Logger import fix, nav title change, home-card navigation effects, loading banner styling. |
| tfbpshiny/app.css | Adds pending-banner styling + loading placeholder animations; removes export button styles. |
| `tfbpshiny/__main__.py` | New launch command flow with cache dir defaults, optional init, materialization toggle, updated defaults. |
| tests/unit/test_select_datasets.py | Removes tests for removed full_data_query. |
| tests/unit/test_matrix_namespacing.py | New tests ensuring namespaced IDs for matrix click buttons. |
| tests/unit/test_export.py | Removes tests for removed export feature. |
| shinyapps_entry.py | Sets default debug log level env var for shinyapps. |
| README.md | Rewrites quickstart around launch + cache behavior; moves deployment details to docs. |
| pyproject.toml | Bumps version to 1.0.0. |
| docs/sql_operations.md | New SQL operations reference (needs alignment with new execution strategies). |
| docs/development.md | Expands/moves production + shinyapps deployment docs. |
| CLAUDE.md | Updates run instructions to use launch. |
| CHANGELOG.md | Adds 1.0.0 release notes. |
| .gitignore | Ignores notebooks/ and default HF cache dir. |
| .github/workflows/publish.yml | Adds PyPI publish workflow on GitHub releases. |
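The "lazy imports via `__getattr__`" entries in the table refer to the module-level `__getattr__` hook from PEP 562. The following is a minimal sketch of that technique, not the PR's code: `make_lazy_getattr`, `demo_pkg`, and the mapping to `json.dumps` are hypothetical stand-ins chosen so the example runs on its own.

```python
import importlib
import types


def make_lazy_getattr(module_name, namespace, lazy_map):
    """Build a module-level __getattr__ that imports attributes on first access.

    lazy_map maps an exported name to (module_to_import, attribute_name).
    """

    def __getattr__(name):
        try:
            target_module, attr = lazy_map[name]
        except KeyError:
            raise AttributeError(
                f"module {module_name!r} has no attribute {name!r}"
            ) from None
        value = getattr(importlib.import_module(target_module), attr)
        namespace[name] = value  # cache so later lookups skip __getattr__
        return value

    return __getattr__


# Stand-in for a package __init__.py; a real package would define
# __getattr__ at module top level and cache into globals().
pkg = types.ModuleType("demo_pkg")
pkg.__getattr__ = make_lazy_getattr("demo_pkg", vars(pkg), {"dumps": ("json", "dumps")})

loaded_before = "dumps" in vars(pkg)
encoded = pkg.dumps({"a": 1})  # first access triggers the import
loaded_after = "dumps" in vars(pkg)
```

The import cost is paid only when an attribute is first used, which is why this pattern can reduce import-time overhead for packages whose submodules pull in heavy dependencies.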
Comment on lines +104 to +105
if level not in [logging.DEBUG, logging.INFO, logging.WARNING, logging.ERROR]:
    raise ValueError("Invalid logging level")
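Note that the check quoted above rejects `logging.CRITICAL`; the page does not say whether that is intentional. As a hedged illustration of how level validation and a JSON formatter might fit together, here is a minimal sketch that assumes `CRITICAL` should be accepted. The names `JsonFormatter`, `configure_json_logger`, and `ALLOWED_LEVELS` are hypothetical and are not taken from `tfbpshiny/configure_logger.py`.

```python
import io
import json
import logging

# Assumption: all five standard levels are valid, including CRITICAL.
ALLOWED_LEVELS = {
    logging.DEBUG,
    logging.INFO,
    logging.WARNING,
    logging.ERROR,
    logging.CRITICAL,
}


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object."""

    def format(self, record):
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        return json.dumps(payload)


def configure_json_logger(name, level, stream):
    """Validate the level, then attach one JSON-formatting stream handler."""
    if level not in ALLOWED_LEVELS:
        raise ValueError(f"Invalid logging level: {level!r}")
    logger = logging.getLogger(name)
    logger.setLevel(level)
    logger.propagate = False  # avoid duplicate output via the root logger
    logger.handlers.clear()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    return logger


buffer = io.StringIO()
log = configure_json_logger("sketch", logging.INFO, buffer)
log.info("loaded %d views", 3)
record = json.loads(buffer.getvalue())
```

Including the offending value in the error message makes a misconfigured environment variable easier to diagnose than a bare "Invalid logging level".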
Comment on lines +105 to +110
# Named presets for per-dataset responsiveness definitions in the Comparison module.
# Add or modify entries here to tune what counts as a "responsive" target.
# Columns used per dataset are defined in perturbation/queries.py::DATASET_COLUMNS.
# NOTE: degron uses the "pvalue" column (raw DESeq2 p-value), NOT padj. If padj
# is preferred, add "padj" to DATASET_COLUMNS["degron"] and update the comment.
Comment on lines +22 to +26
**File:** `tfbpshiny/modules/select_datasets/queries.py`
**Called from:** `server/workspace.py`, `server/sidebar.py`, `server/dataset_row.py`

All nine functions in this file are **Builders**: they return `(sql, params)` and never
call `vdb.query()` themselves.
Comment on lines +151 to +162
```sql
SELECT '{db_name}' AS db_name,
       COUNT(DISTINCT regulator_locus_tag) AS n_regulators,
       COUNT(DISTINCT sample_id) AS n_samples
FROM {db_name}_meta
[WHERE {filter_clauses}]

UNION ALL

SELECT '{db_name_2}' AS db_name, ...
...
```
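The template above, combined with the Builder convention quoted earlier (return `(sql, params)`, never execute), can be sketched as follows. This is an illustration under stated assumptions, not the repository's function: `dataset_summary_query`, the `ds_a`/`ds_b` dataset names, and the `condition` column are hypothetical, equality filters stand in for `{filter_clauses}`, and `sqlite3` with `?` placeholders stands in for the VirtualDB/DuckDB layer.

```python
import sqlite3


def dataset_summary_query(filters_by_db):
    """Build one UNION ALL summary query; return (sql, params) without executing.

    filters_by_db maps a dataset name to a dict of {column: value} equality
    filters. Dataset and column names are interpolated, so they must be trusted.
    """
    parts, params = [], []
    for db_name, filters in filters_by_db.items():
        where = ""
        if filters:
            where = " WHERE " + " AND ".join(f"{col} = ?" for col in filters)
            # Parameters are appended in the same order as the placeholders.
            params.extend(filters.values())
        parts.append(
            f"SELECT '{db_name}' AS db_name, "
            "COUNT(DISTINCT regulator_locus_tag) AS n_regulators, "
            "COUNT(DISTINCT sample_id) AS n_samples "
            f"FROM {db_name}_meta{where}"
        )
    return " UNION ALL ".join(parts), params


# The caller, not the builder, owns the connection and runs the query.
con = sqlite3.connect(":memory:")
for name in ("ds_a", "ds_b"):
    con.execute(
        f"CREATE TABLE {name}_meta "
        "(regulator_locus_tag TEXT, sample_id INTEGER, condition TEXT)"
    )
con.executemany(
    "INSERT INTO ds_a_meta VALUES (?, ?, ?)",
    [("YAL001C", 1, "YPD"), ("YAL001C", 2, "YPD"), ("YBR002W", 3, "heat")],
)
con.executemany(
    "INSERT INTO ds_b_meta VALUES (?, ?, ?)",
    [("YAL001C", 10, "YPD"), ("YCR003C", 11, "YPD")],
)

sql, params = dataset_summary_query({"ds_a": {"condition": "YPD"}, "ds_b": {}})
rows = sorted(con.execute(sql, params).fetchall())
```

Keeping the builder free of side effects makes it easy to unit-test the generated SQL and parameter order without a database.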
Comment on lines +223 to +239
### `full_data_query`

**Invocation:** Builder
**sql_only path:** N/A
**Purpose:** Fetch all columns from the full data view (measurement + metadata joined) for
one dataset, optionally filtered.

```sql
SELECT * FROM {db_name}
[WHERE {filter_clauses}]
```

**Parameters:** Filter params as for `metadata_query`.
**Result columns:** All columns in `{db_name}` (schema varies by dataset).
**Approximate result size:** TBD; potentially large (millions of rows for full datasets).

---
Comment on lines +316 to +326
**Invocation:** Executes
**sql_only path:** No
**Purpose:** Compute correlations for all active binding dataset pairs in a single query.
Wraps one `corr_pair_sql` subquery per pair in a `UNION ALL`.

```sql
SELECT *, '{db_a}__{db_b}' AS pair_key FROM ( {corr_pair_sql for pair 0} )
UNION ALL
SELECT *, '{db_a}__{db_b}' AS pair_key FROM ( {corr_pair_sql for pair 1} )
...
```
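The file summary notes that `binding/queries.py` now bounds memory by per-pair execution, while the documentation quoted above still describes a single `UNION ALL`. A minimal sketch of the per-pair alternative follows. It is not the PR's implementation: `correlate_pairs`, `pearson`, and the `ds_a`/`ds_b`/`ds_c` tables are hypothetical, `sqlite3` stands in for DuckDB, and the correlation is computed in Python because SQLite has no built-in correlation aggregate.

```python
import itertools
import math
import sqlite3


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def correlate_pairs(con, db_names):
    """Run one small query per dataset pair instead of one large UNION ALL.

    Only one pair's joined rows are held in memory at a time.
    """
    results = {}
    for db_a, db_b in itertools.combinations(db_names, 2):
        sql = (
            f"SELECT a.value, b.value FROM {db_a} AS a "
            f"JOIN {db_b} AS b ON a.target = b.target"
        )
        rows = con.execute(sql).fetchall()
        pair_key = f"{db_a}__{db_b}"  # same key format as the documented query
        results[pair_key] = pearson([r[0] for r in rows], [r[1] for r in rows])
    return results


con = sqlite3.connect(":memory:")
data = {
    "ds_a": [1.0, 2.0, 3.0, 4.0],
    "ds_b": [2.0, 4.0, 6.0, 8.0],
    "ds_c": [4.0, 3.0, 2.0, 1.0],
}
for name, values in data.items():
    con.execute(f"CREATE TABLE {name} (target TEXT, value REAL)")
    con.executemany(
        f"INSERT INTO {name} VALUES (?, ?)",
        [(f"t{i}", v) for i, v in enumerate(values)],
    )

correlations = correlate_pairs(con, ["ds_a", "ds_b", "ds_c"])
```

Per-pair execution trades a few extra round trips for a peak memory footprint that depends on the largest single pair rather than on the number of pairs.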
Comment on lines +594 to +604
**Invocation:** Executes
**sql_only path:** No
**Purpose:** Compute `topn_responsive_ratio` for all (binding, perturbation) pairs in one
UNION ALL query. The main query run by the Comparison tab's Execute Analysis.

```sql
SELECT *, '{b_db}__{p_db}' AS pair_key FROM ( {topn_responsive_ratio for pair 0} )
UNION ALL
SELECT *, '{b_db}__{p_db}' AS pair_key FROM ( {topn_responsive_ratio for pair 1} )
...
```
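The file summary also lists "chunking by regulators" for `comparison/queries.py`. The sketch below shows only the chunking mechanics under stated assumptions; it does not reproduce `topn_responsive_ratio`, whose definition is not given on this page. The helper names, the `binding` table, and the per-regulator count used as a placeholder aggregate are all hypothetical, and `sqlite3` stands in for DuckDB.

```python
import sqlite3


def chunked(items, size):
    """Yield consecutive slices of `items` with at most `size` elements each."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def count_targets_by_regulator(con, regulators, chunk_size):
    """Run one bounded query per chunk of regulators and concatenate results."""
    rows, n_queries = [], 0
    for chunk in chunked(regulators, chunk_size):
        placeholders = ", ".join("?" for _ in chunk)
        sql = (
            "SELECT regulator, COUNT(*) FROM binding "
            f"WHERE regulator IN ({placeholders}) GROUP BY regulator"
        )
        rows.extend(con.execute(sql, chunk).fetchall())
        n_queries += 1
    return rows, n_queries


con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE binding (regulator TEXT, target TEXT)")
con.executemany(
    "INSERT INTO binding VALUES (?, ?)",
    [
        ("R1", "t1"), ("R1", "t2"),
        ("R2", "t1"),
        ("R3", "t1"), ("R3", "t2"), ("R3", "t3"),
        ("R4", "t1"),
        ("R5", "t1"), ("R5", "t2"),
    ],
)

rows, n_queries = count_targets_by_regulator(
    con, ["R1", "R2", "R3", "R4", "R5"], chunk_size=2
)
rows = sorted(rows)
```

Because each regulator appears in exactly one chunk, per-regulator aggregates can be concatenated without a merge step; aggregates that span regulators would need a second pass.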