feat(versioning): capture and expose version history for charts, dashboards, and datasets#39603
Draft
mikebridge wants to merge 81 commits into
Draft
feat(versioning): capture and expose version history for charts, dashboards, and datasets#39603mikebridge wants to merge 81 commits into
mikebridge wants to merge 81 commits into
Conversation
mikebridge
pushed a commit
to mikebridge/superset
that referenced
this pull request
Apr 23, 2026
Phase 1 of versioning added ``sqlalchemy-continuum==1.6.0`` to ``requirements/base.txt`` directly, but the pin was missing from ``pyproject.toml``'s ``[project.dependencies]``. CI's ``check-python-deps`` job regenerates the pinned files from the ``.in`` sources via ``scripts/uv-pip-compile.sh``; without the pyproject declaration, regeneration strips the pin out, causing: ModuleNotFoundError: No module named 'sqlalchemy_continuum' …on every Python-based job (test-sqlite, test-postgres, test-mysql, unit-tests, test-postgres-hive, test-postgres-presto, test-load-examples, docker-build) because ``superset/extensions/ __init__.py`` unconditionally imports from it at module load time. Adds ``"sqlalchemy-continuum>=1.6.0, <2.0.0"`` to pyproject and re-runs ``uv-pip-compile.sh`` to sync ``base.txt`` and ``development.txt``. One package regenerates in place; the only other diffs are uv-resolver comment-graph updates (numpy's ``# via`` list) which CI's filter ignores. Fixes CI failures on PR apache#39603. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
mikebridge
pushed a commit
to mikebridge/superset
that referenced
this pull request
Apr 24, 2026
Phase 1 of versioning added ``sqlalchemy-continuum==1.6.0`` to ``requirements/base.txt`` directly, but the pin was missing from ``pyproject.toml``'s ``[project.dependencies]``. CI's ``check-python-deps`` job regenerates the pinned files from the ``.in`` sources via ``scripts/uv-pip-compile.sh``; without the pyproject declaration, regeneration strips the pin out, causing: ModuleNotFoundError: No module named 'sqlalchemy_continuum' …on every Python-based job (test-sqlite, test-postgres, test-mysql, unit-tests, test-postgres-hive, test-postgres-presto, test-load-examples, docker-build) because ``superset/extensions/ __init__.py`` unconditionally imports from it at module load time. Adds ``"sqlalchemy-continuum>=1.6.0, <2.0.0"`` to pyproject and re-runs ``uv-pip-compile.sh`` to sync ``base.txt`` and ``development.txt``. One package regenerates in place; the only other diffs are uv-resolver comment-graph updates (numpy's ``# via`` list) which CI's filter ignores. Fixes CI failures on PR apache#39603. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
8774778 to
a1f0ddb
Compare
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #39603 +/- ##
==========================================
- Coverage 64.30% 56.04% -8.27%
==========================================
Files 2657 2686 +29
Lines 144060 146084 +2024
Branches 33216 33667 +451
==========================================
- Hits 92641 81866 -10775
- Misses 49797 63437 +13640
+ Partials 1622 781 -841
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Harness. 🚀 New features to boost your workflow:
|
6 tasks
mikebridge
pushed a commit
to mikebridge/superset
that referenced
this pull request
Apr 27, 2026
Phase 1 of versioning added ``sqlalchemy-continuum==1.6.0`` to ``requirements/base.txt`` directly, but the pin was missing from ``pyproject.toml``'s ``[project.dependencies]``. CI's ``check-python-deps`` job regenerates the pinned files from the ``.in`` sources via ``scripts/uv-pip-compile.sh``; without the pyproject declaration, regeneration strips the pin out, causing: ModuleNotFoundError: No module named 'sqlalchemy_continuum' …on every Python-based job (test-sqlite, test-postgres, test-mysql, unit-tests, test-postgres-hive, test-postgres-presto, test-load-examples, docker-build) because ``superset/extensions/ __init__.py`` unconditionally imports from it at module load time. Adds ``"sqlalchemy-continuum>=1.6.0, <2.0.0"`` to pyproject and re-runs ``uv-pip-compile.sh`` to sync ``base.txt`` and ``development.txt``. One package regenerates in place; the only other diffs are uv-resolver comment-graph updates (numpy's ``# via`` list) which CI's filter ignores. Fixes CI failures on PR apache#39603. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
7979999 to
70e21bc
Compare
✅ Deploy Preview for superset-docs-preview ready!
To edit notification comments on pull requests, go to your Netlify project configuration. |
mikebridge
pushed a commit
to mikebridge/superset
that referenced
this pull request
Apr 28, 2026
Phase 1 of versioning added ``sqlalchemy-continuum==1.6.0`` to ``requirements/base.txt`` directly, but the pin was missing from ``pyproject.toml``'s ``[project.dependencies]``. CI's ``check-python-deps`` job regenerates the pinned files from the ``.in`` sources via ``scripts/uv-pip-compile.sh``; without the pyproject declaration, regeneration strips the pin out, causing: ModuleNotFoundError: No module named 'sqlalchemy_continuum' …on every Python-based job (test-sqlite, test-postgres, test-mysql, unit-tests, test-postgres-hive, test-postgres-presto, test-load-examples, docker-build) because ``superset/extensions/ __init__.py`` unconditionally imports from it at module load time. Adds ``"sqlalchemy-continuum>=1.6.0, <2.0.0"`` to pyproject and re-runs ``uv-pip-compile.sh`` to sync ``base.txt`` and ``development.txt``. One package regenerates in place; the only other diffs are uv-resolver comment-graph updates (numpy's ``# via`` list) which CI's filter ignores. Fixes CI failures on PR apache#39603. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
db1b08c to
c338d49
Compare
mikebridge
pushed a commit
to mikebridge/superset
that referenced
this pull request
Apr 30, 2026
Phase 1 of versioning added ``sqlalchemy-continuum==1.6.0`` to ``requirements/base.txt`` directly, but the pin was missing from ``pyproject.toml``'s ``[project.dependencies]``. CI's ``check-python-deps`` job regenerates the pinned files from the ``.in`` sources via ``scripts/uv-pip-compile.sh``; without the pyproject declaration, regeneration strips the pin out, causing: ModuleNotFoundError: No module named 'sqlalchemy_continuum' …on every Python-based job (test-sqlite, test-postgres, test-mysql, unit-tests, test-postgres-hive, test-postgres-presto, test-load-examples, docker-build) because ``superset/extensions/ __init__.py`` unconditionally imports from it at module load time. Adds ``"sqlalchemy-continuum>=1.6.0, <2.0.0"`` to pyproject and re-runs ``uv-pip-compile.sh`` to sync ``base.txt`` and ``development.txt``. One package regenerates in place; the only other diffs are uv-resolver comment-graph updates (numpy's ``# via`` list) which CI's filter ignores. Fixes CI failures on PR apache#39603. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
mikebridge
pushed a commit
to mikebridge/superset
that referenced
this pull request
May 4, 2026
Phase 1 of versioning added ``sqlalchemy-continuum==1.6.0`` to ``requirements/base.txt`` directly, but the pin was missing from ``pyproject.toml``'s ``[project.dependencies]``. CI's ``check-python-deps`` job regenerates the pinned files from the ``.in`` sources via ``scripts/uv-pip-compile.sh``; without the pyproject declaration, regeneration strips the pin out, causing: ModuleNotFoundError: No module named 'sqlalchemy_continuum' …on every Python-based job (test-sqlite, test-postgres, test-mysql, unit-tests, test-postgres-hive, test-postgres-presto, test-load-examples, docker-build) because ``superset/extensions/ __init__.py`` unconditionally imports from it at module load time. Adds ``"sqlalchemy-continuum>=1.6.0, <2.0.0"`` to pyproject and re-runs ``uv-pip-compile.sh`` to sync ``base.txt`` and ``development.txt``. One package regenerates in place; the only other diffs are uv-resolver comment-graph updates (numpy's ``# via`` list) which CI's filter ignores. Fixes CI failures on PR apache#39603. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
c338d49 to
04c6b8b
Compare
mikebridge
pushed a commit
to mikebridge/superset
that referenced
this pull request
May 5, 2026
Phase 1 of versioning added ``sqlalchemy-continuum==1.6.0`` to ``requirements/base.txt`` directly, but the pin was missing from ``pyproject.toml``'s ``[project.dependencies]``. CI's ``check-python-deps`` job regenerates the pinned files from the ``.in`` sources via ``scripts/uv-pip-compile.sh``; without the pyproject declaration, regeneration strips the pin out, causing: ModuleNotFoundError: No module named 'sqlalchemy_continuum' …on every Python-based job (test-sqlite, test-postgres, test-mysql, unit-tests, test-postgres-hive, test-postgres-presto, test-load-examples, docker-build) because ``superset/extensions/ __init__.py`` unconditionally imports from it at module load time. Adds ``"sqlalchemy-continuum>=1.6.0, <2.0.0"`` to pyproject and re-runs ``uv-pip-compile.sh`` to sync ``base.txt`` and ``development.txt``. One package regenerates in place; the only other diffs are uv-resolver comment-graph updates (numpy's ``# via`` list) which CI's filter ignores. Fixes CI failures on PR apache#39603. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
d632e5e to
e4f548e
Compare
5549d67 to
f031a2c
Compare
f031a2c to
c2b6db7
Compare
Drop "Published Language" from the ACTION_KINDS docstring header.
The substance ("single source of truth shared by the four command
sites and the listener") reads the same either way; the canonical
vocabulary added a lookup step for readers without a DDD background.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The retention task accumulates a list of ``tx_ids`` to prune and then issues two passes of DELETE statements (shadow tables + final ``version_transaction``) with ``WHERE id IN (:tx_ids)``. The preserved-ids computation in ``_candidate_transaction_ids`` also binds the same list via ``transaction_id IN (...)``. Postgres and MySQL accept tens of thousands of bind parameters; SQLite caps at ``SQLITE_MAX_VARIABLE_NUMBER = 999`` (raised to 32766 in 3.32+, but the older limit still ships in many builds). On a deployment with months of disabled retention and then a re-enable, the candidate count easily exceeds 1000. Chunk all three IN-clauses at 500 ids per statement via a ``_chunked`` helper. 500 leaves headroom for the ``(transaction_id, end_transaction_id)`` OR-pair (each id is bound twice in the shadow DELETE) plus margin for any other bound params. The retention task is the only path that accumulates open-ended id batches; other versioning DELETEs stay well within bounds. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
W5: short-circuit ``_force_parent_dirty_on_child_change`` when the resolved parent is already in ``session.new``. The parent will INSERT in this flush regardless, so the ``flag_modified`` call was redundant (the ``InvalidRequestError`` swallow below was also working for this case but silently). Common on imports that add a SqlaTable plus 50 fresh TableColumn children — ~50 redundant ``flag_modified`` calls per import. W6: ``_pin_audit_columns`` now logs at INFO when both ``flag_modified`` calls fail on a parent that has the audit columns. Previously the silent skip left the synthetic parent UPDATE to invoke ``onupdate=get_user_id`` and write whoever ``g.user`` is at flush time — which under autoflush-during-teardown points at a deleted test user and fails the FK. Now the failure mode is debuggable from the log without inspection. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
``version_changes.(entity_kind, entity_id)`` references the live row's integer PK on ``slices`` / ``dashboards`` / ``tables`` depending on ``entity_kind``. SQL has no native polymorphic FK, so the constraint is intentionally omitted — cleanup relies on the CASCADE from ``version_transaction.id`` plus command-layer ordering for entity deletes. A bare DELETE outside that transactional boundary leaves orphan rows whose entity_id references a vanished row; the read-side tombstone-state lookup handles this gracefully. Adds the rationale to the migration header so a future reader doesn't mistake the missing constraint for an oversight. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…chemy W1)
The retention task's SERIALIZABLE block has a documented happy-path
recovery (Celery wrapper logs + returns {"error": 1} so the next
firing retries) — but the next firing is 24 hours out under the
default daily beat schedule. Under sustained write pressure on a
heavily-edited deployment, the prune can fail to make progress for
many days in a row with only a single log line per attempt.
Add a bounded inline retry inside ``_prune_old_versions_impl``:
* The SERIALIZABLE block is extracted to ``_run_prune_pass`` so each
attempt opens a fresh connection + transaction from a clean
snapshot (the previous one is implicitly rolled back when the
context manager exits via exception).
* Up to ``_MAX_RETRY_ATTEMPTS`` = 3 attempts, with exponential
backoff at ``_RETRY_BACKOFF_BASE_SECONDS`` = 0.1s (0.1s → 0.4s for
a total ~0.5s extra latency in the worst case).
* On exhaustion the underlying ``OperationalError`` re-raises so the
Celery wrapper's existing handler logs and returns ``{"error": 1}``
— the inline retry shrinks the recovery window for transient
conflicts; it does not replace the outer safety net.
* When at least one retry was needed the stats dict carries
``"retried": <count>`` for observability.
Two integration tests cover the new behavior end-to-end against the
existing dashboard fixture: one verifies the retry-then-succeed path
and that ``stats["retried"] == 1``; the other verifies the give-up
path raises after exactly ``_MAX_RETRY_ATTEMPTS``. ``time.sleep`` is
patched in both so the tests run in the same wall-clock budget as the
existing retention test.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… exports PEP 8's ``_name`` convention says "module-private". Three names in the changes/ package were imported across module boundaries — the underscore was lying about their actual visibility: * ``_ENTITY_KIND_BY_CLASS_NAME`` → ``ENTITY_KIND_BY_CLASS_NAME`` (imported by changes/listener.py, versioning/queries.py, activity/kinds.py — the canonical Python-class-name → table-stored-kind mapping shared across both write and read paths) * ``_shadow_rows_valid_at`` → ``shadow_rows_valid_at`` (imported by versioning/queries.py for the validity-strategy child-collection read used by GET /versions/<uuid>/) * ``_jsonable`` → ``jsonable`` (imported by changes/shadow_queries.py from changes/state.py — cross-submodule within the package) Helpers that stay internal to one submodule keep their underscore. The package's ``__init__.py`` re-exports and ``__all__`` are updated to match; the docstring naming the package layout reflects the new spellings. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…/ exports PEP 8's ``_name`` convention says "module-private"; the baseline/ package was carrying nine names through its ``__init__.py`` / sibling submodules that are not actually private. Renamed: * ``_child_to_parent_registry`` → ``child_to_parent_registry`` * ``_collect_parents_to_baseline`` → ``collect_parents_to_baseline`` * ``_shadow_row_count`` → ``shadow_row_count`` * ``_version_table_for`` → ``version_table_for`` * ``_pin_audit_columns`` → ``pin_audit_columns`` * ``_force_parent_dirty_on_child_change`` → ``force_parent_dirty_on_child_change`` * ``_insert_baseline_and_children`` → ``insert_baseline_and_children`` * ``_CHILD_BASELINE_HANDLERS`` → ``CHILD_BASELINE_HANDLERS`` * ``_insert_baseline_shadow_row`` → ``insert_baseline_shadow_row`` Imports updated across the package (collection, dirty, children, insertion, listener, shadow, __init__) plus the two external consumers (``superset/versioning/factory.py``, ``tests/unit_tests/versioning/test_pin_audit_columns.py``). Doc-only references in ``skip_unmodified_tests.py`` follow. Helpers that stay private to one submodule keep their underscore. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
``force_parent_dirty_on_child_change`` was an 88-line function with
three nested conditionals, doing five distinct things (filter
phantom-dirty children, resolve parent, short-circuit already-new
parents, pick a safe flag column, pin audit columns). Split into
named helpers so the top-level reads as a sentence:
* ``_real_dirty_versioned_children(session, child_map)`` — generator
yielding child instances that are real edits (filters phantom-dirty
entries from lazy-load side effects / audit auto-bumps).
* ``_resolve_parent(child, child_map)`` — registry lookup with the
"is the parent attribute loaded and the expected class?" guard.
* ``_flag_parent(parent)`` — picks the deterministic non-excluded
column (description > uuid > col_keys[0]), ``flag_modified``s it,
returns ``False`` on the fresh-session.new-instance case.
The top-level loop now reads as:
for child in _real_dirty_versioned_children(session, child_map):
parent = _resolve_parent(child, child_map)
if parent is None or parent in new_set:
continue
if _flag_parent(parent):
pin_audit_columns(parent)
Behaviour preserved exactly. The mid-function comment blocks that
previously explained the next three lines migrate to the docstrings
of the helpers — code-as-documentation instead of comment-as-
documentation.
Also fixes a stale patch target in ``test_pin_audit_columns_tolerates_
invalid_request_error``: the test was patching
``superset.versioning.baseline.attributes.flag_modified``, but
``attributes`` is imported inside ``dirty.py``; the correct target is
``superset.versioning.baseline.dirty.attributes.flag_modified``. The
test was passing previously because the broken patch silently no-op'd
on the side-effect path; under the extraction the lookup-resolution
ordering surfaced the error.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Every versioning helper that accepts a SQLAlchemy class actually expects ``Slice`` / ``Dashboard`` / ``SqlaTable`` — all subclasses of ``flask_appbuilder.Model``. The bare ``model_cls: type`` signature accepted ``type[int]`` and any other class; type-checking caught nothing at the call sites. Tighten to ``model_cls: type[Model]`` across the versioning helpers: ``queries.py`` (10 signatures), ``restore.py`` (1), ``api_helpers.py`` (4), ``changes/state.py`` (2). ``etag.py`` already used the tightened form; the rest now match. No behavior change. ``mypy`` would catch a non-Model class at the call site. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ership Two follow-ups to the v3 committer review's operator-facing concerns: * "Querying the shadow tables — audit columns are frozen at capture time" — the three parent shadow tables exclude ``changed_on``, ``created_on``, ``changed_by_fk``, ``created_by_fk`` from version capture; external tooling that joins ``*_version`` to ``ab_user`` via ``changed_by_fk`` will read stale baseline values. The correct join is through ``version_transaction.user_id``. Includes the SQL shape so a reader can copy it. * "Behavior change — ``ImportExportMixin.reset_ownership``" — the ownership-reset helper used by every import/clone/duplicate path was rewritten so a present ``g.user`` is actively assigned (was ``None``-then-default). The change affects every command that uses the helper, not just versioning-adjacent ones; operators noticing consistent author attribution on imports are seeing this change. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…bles, backoff factor
Four small follow-ups from the v4 review cycle:
* `commands/version_restore.py:60` — `model_cls: type` → `type[Model]`
on ``BaseRestoreVersionCommand``. The ``c676d66cd9`` type tightening
swept the helpers but missed the command's class attribute. Catches
a non-Model subclass at mypy time.
* `changes/state.py` — drop the underscore prefix on
``_compute_records_for_entity`` and ``_bulk_insert_records``. Both
are imported by ``changes/listener.py``; the earlier naming-honesty
pass (`35c66ff496`) missed them. Same criterion: a name imported
by a sibling submodule is not module-private and shouldn't claim to
be.
* `tasks/version_history_retention.py:_run_prune_pass` — collapse the
five positional arguments (cutoff + parent_tables + child_tables +
m2m_table + tx_table) into a frozen ``ShadowTables`` dataclass. The
four table fields are co-resolved by ``_resolve_shadow_tables`` and
flow through the retry loop as a single bundle. Same Parameter
Object shape as ``RestoreEndpointSpec``.
* Same file — name the exponential-backoff multiplier
(``_RETRY_BACKOFF_FACTOR = 4``) instead of writing the magic ``4``
inline. The constant's name documents what the integer means; the
previous prose comment ("BASE + BASE*4 = ~0.5s") moves into the
docstring on the constant.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds a default-on config flag gating the two ``register_*_listener()`` calls in ``init_versioning``. When the flag is ``False``, neither the baseline nor the change-record listener attaches to ``db.session``; save-path capture stops, but every other versioning surface (already- captured shadow rows, ``/versions/`` reads, ``/activity/`` reads, the retention task) continues to work. This is an *operational* switch — a 30-second recovery path for a versioning-induced save-path regression — not a feature flag. New deployments leave it on. The flag's docstring + UPDATING.md note are explicit about that distinction so it doesn't get repurposed into a "are we ready to enable versioning?" gate. Reads from ``ENABLE_VERSIONING_CAPTURE`` env var with default ``"true"``; ``superset_config.py`` can also assign the Python value directly. The startup log emits a WARNING when the listener registration is skipped so the configuration choice is visible in the deploy log. From the v4 continuous-delivery review (Farley + Humble): the prior shape's MTTR was bounded only by the team's ability to revert and redeploy the stack. With this flag, MTTR for capture-induced regressions drops to one config edit + a worker restart. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three operational signals on the nightly prune, matching the activity-view orchestrator's emission shape so a single Grafana filter ``superset.versioning.*`` catches both sides of the feature: * ``superset.versioning.retention.pruned_transactions`` — gauge of rows actually pruned this run. ``0`` is a meaningful signal (means retention is enabled but nothing aged out); a sustained ``0`` over multiple runs is the signature of a misconfigured beat schedule that previously was log-only. * ``superset.versioning.retention.skipped`` — counter; fires for the ``retention_days <= 0`` (operator disabled) and the ``no versioned classes resolved`` (init-order regression) branches. Lets an alert distinguish "disabled" from "running and producing zero" without log scraping. * ``superset.versioning.retention.retried`` — counter; fires on every serialization conflict that triggered an inline retry. A sudden rise correlates with concurrent-write pressure and is the leading indicator for the ``OperationalError`` give-up path that re-raises after ``_MAX_RETRY_ATTEMPTS``. The activity-view side already uses ``stats_logger_manager`` for the per-phase ``superset.activity_view.*`` timings; the retention side was log-only. Bringing it up to the same standard closes the v4 CD review's "observability deficit on the nightly job" concern. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Operators who redefine ``CeleryConfig`` in ``superset_config.py`` — instead of subclassing or merging the default — silently lose the ``version_history.prune_old_versions`` entry that registers the nightly prune. The capture path keeps writing shadow rows; the prune never runs; disk grows until paged. Add a startup check inside ``init_versioning`` that inspects the resolved ``CELERY_CONFIG.beat_schedule`` and emits a WARNING when the entry is absent. The misconfiguration is now visible in the deploy log instead of waiting for disk pressure to surface it at 03:00 some weeks later. Same shape as the existing ``CORS_OPTIONS["expose_headers"]`` operator note in UPDATING.md — a known-misconfiguration mode the codebase catches at startup so the team doesn't relearn it in production. From the v4 continuous-delivery review (Farley + Humble): "hidden coordination" anti-pattern — the change-set assumes the operator will do something correct (merge their override with the new default) that the code does not verify at runtime. This commit verifies it. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The migration's ``downgrade()`` is correct against an empty schema — verified throughout development — but the realistic operational scenario is: we deployed, accumulated versioning rows from real saves for hours or days, then need to roll back. That code path was unexercised, and the CD review v4 flagged that absence as load- bearing: an operator hitting the rollback in anger is a worse place to find a downgrade bug than CI. New test file ``versioning_round_trip__tests.py`` matches the existing ``composite_pk_round_trip__tests.py`` pattern (in-memory SQLite + Alembic ``MigrationContext``) and walks three scenarios: * **Populated round-trip** — upgrade both migrations, insert rows into all 8 versioning tables (live + closed shadow rows, both parent and child, with field-level ``version_changes`` records and M2M shadow rows), downgrade both migrations, assert every table is gone, then re-upgrade and assert the schema shape matches the first upgrade byte-for-byte (idempotency under round-trip). * **Empty downgrade** — sanity belt-and-braces that downgrade run immediately after upgrade (no rows) is also clean. Catches the case where the population step somehow influenced the drop path. * **Indexes-downgrade idempotency** — runs ``8f3a1b2c4d5e.downgrade`` twice in a row. The second call must be a no-op (the migration uses ``if_exists=True`` on every drop) so an operator who interrupts and re-runs doesn't hit a missing-index error. The MEDIUMTEXT cross-backend dimension is delegated to the CI matrix (SQLite collapses every text column to TEXT regardless of declared type); the shape pinned here is reversibility under load. From the v4 continuous-delivery review's "untested operational rollback" finding. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…d code) Four small follow-ups from the v5 review cycle, all on the operational infrastructure that landed in v4: * **`_warn_if_retention_beat_missing` runs before the kill-switch early-return** (clean-code C1 / tidy-first W1 / python W3). The retention task lives in its own beat entry and runs against existing shadow data regardless of capture; the warn-log was previously bypassed on exactly the deployment path (kill-switch flipped in anger) where the operator is most likely to also have a hand-rolled ``CeleryConfig`` with a missing prune entry. Move the call above the early-return so both misconfigurations surface at the same restart. * **`_warn_if_retention_beat_missing` handles dict-form and None `CELERY_CONFIG`** (python W2 / sqlalchemy W1 / CD-2). The default shape is ``type[CeleryConfig] | None``, but Celery itself accepts a dict via ``config_from_object``, and ``None`` is the documented "disable Celery entirely" path. The prior ``getattr(_, "beat_schedule", None) if _ else None`` fell through to the WARNING in both cases, emitting a false positive for operators who chose either shape on purpose. Discriminate by ``isinstance(dict)`` and short-circuit on ``None``. Also extract the retention task name to a class-level ``_RETENTION_TASK_NAME`` constant so the previous 90-char line shortens and the literal is no longer duplicated against the default in ``config.py``. * **`ENABLE_VERSIONING_CAPTURE` env-var-to-bool uses ``utils.parse_boolean_string``** (python W1). The hand-rolled ``.lower() == "true"`` only matched the literal ``"true"``; operators setting ``1``, ``yes``, ``on``, ``True`` (no .lower call) silently got ``False`` (capture on — the safe direction for a default-on kill-switch, but surprising and inconsistent with the rest of ``config.py`` which uses the helper). * **`from __future__ import annotations` in `versioning/factory.py`** (python W4). Every other versioning module has it; this was the lone outlier. PEP 604 union syntax in a local-variable annotation worked without it on Python 3.10+, so this is consistency, not correctness. * **Drop the dead `last_exc` / unreachable `RuntimeError`** in ``_prune_old_versions_impl`` (tidy-first dead-code finding). The retry loop always returns or re-raises; the post-loop fallback was defensive code for a control-flow path the loop's exit condition cannot reach. Replaced with a short ``AssertionError`` that mypy needs for type-narrowing and that documents the invariant (post-loop = "the loop's exit condition was changed incorrectly"). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…etrics
Three v4 operational features shipped without direct test coverage; the
v5 review flagged this as load-bearing for the next refactor pass. Add
ten focused unit tests:
**`tests/unit_tests/initialization_test.py::TestInitVersioning`** — six
tests covering the kill-switch flow and the four ``CELERY_CONFIG``
shapes the warn-log helper now discriminates:
* ``test_kill_switch_off_skips_listener_registration`` — pins the
contract that ``ENABLE_VERSIONING_CAPTURE=False`` short-circuits
``init_versioning`` before either listener registers.
* ``test_warn_when_celery_beat_schedule_missing_retention_entry`` —
class-shaped config with the entry absent fires WARNING.
* ``test_no_warn_when_celery_beat_schedule_includes_retention_entry``
— class with the entry present is silent.
* ``test_no_warn_when_celery_config_is_none`` — Celery-disabled
deployment doesn't see the false-positive that motivated the v5
``isinstance / None`` guard.
* ``test_dict_form_celery_config_with_entry_does_not_warn`` — Celery's
documented dict-shape works.
* ``test_dict_form_celery_config_without_entry_warns`` — and the
dict-shape symmetry holds when the entry is missing.
**`tests/unit_tests/tasks/test_version_history_retention.py`** — four
tests pinning the statsd-emission contract on every branch:
* ``test_retention_disabled_emits_skipped_metric`` — ``retention_days=0``
fires ``.skipped``.
* ``test_no_versioned_classes_resolved_emits_skipped_metric`` — the
init-order-regression branch also fires ``.skipped`` (same metric on
purpose; dashboard alert is "no work happening", WARNING log carries
the why).
* ``test_serialization_failure_then_success_increments_retried_once``
— one ``OperationalError`` on attempt 1 fires ``.retried`` exactly
once, succeeds on attempt 2, records ``retried=1`` in the stats
dict, emits the ``.pruned_transactions`` gauge.
* ``test_all_attempts_fail_reraises_after_max_retries`` — exhausted
retries fire ``.retried`` exactly ``_MAX_RETRY_ATTEMPTS`` times and
re-raise so the outer Celery wrapper catches.
Total: 10 new tests, all passing in <1s wall-clock. Closes the v5 CD
+ clean-code finding ("the operational instrumentation that just
shipped isn't itself pipeline-gated by tests"). A future refactor that
restructures ``init_versioning`` or renames a metric now has to
deliberately update these tests rather than silently breaking the
contract.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ed models The versioning bootstrap probed each model with version_class() and swallowed any failure with a bare `except Exception: pass`. A model that failed to wire was silently dropped from VERSIONED_MODELS while the listeners still registered — change capture would degrade with no log, metric, or error, making the failure invisible to operators. Keep degraded-mode boot (don't fail startup), but surface the failure at WARNING with the model name and traceback so a broken wiring is visible in the deploy log. Surfaced by a Codex amin-review pass (M1). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
VERSIONED_MODELS is module-level state; test fixtures that initialize multiple Superset apps per process appended duplicate entries on each re-init. Guard the append. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
op.drop_index(if_exists=True) emits DROP INDEX IF EXISTS, which stock MySQL 5.7/8.x rejects (MariaDB-only grammar) — the downgrade raised a syntax error on the first table. Probe the inspector for existence instead, which keeps the partial-application robustness on every dialect. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The pre-migration graceful-degradation sites caught OperationalError only, but a missing relation raises ProgrammingError (UndefinedTable) on PostgreSQL — and the failed statement aborts the enclosing transaction there, so the user's save failed anyway despite the catch. Catch both exception classes and run each probe under a connection-level SAVEPOINT so a failure can't poison the outer transaction. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
AuditMixinNullable stamps created_on/changed_on with naive datetime.now(); the restore stamp used datetime.utcnow(), skewing changed_on ordering on non-UTC servers (and utcnow is deprecated as of Python 3.12). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The natural-key upsert setattr'd every incoming property, so a payload 'id' on a name-matched column rewrote a live primary key, and a renamed column still carrying its old id INSERTed with a live PK while the old row's DELETE was pending in the same flush — INSERTs flush before DELETEs, colliding on the PK/UNIQUE constraints. Strip id/table_id in both branches; regression test included. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…tion With override_columns, RefreshDatasetCommand commits its own transaction between the update and the new_version reads, so the response attributed the refresh's version to the user's update. Capture the identifiers before the refresh; re-read only the ETag afterwards (it must reflect the current live version). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ion_id) The dataset child-diff path (shadow_rows_valid_at + the prior-tx probe in shadow_queries.py) filters table_columns_version / sql_metrics_version by parent table_id plus a transaction-range bound. The base migration only indexed transaction_id / end_transaction_id / operation_type, and the live-row index leads with id — nothing served the table_id access pattern, so those queries fell to a seq scan as version history grew. Add a plain composite (table_id, transaction_id) on both child shadow tables (serves both queries on every dialect, no partial-index split). Folded into the existing shadow-index migration; round-trip test asserts the index is present on the child shadows and absent on the parents. Surfaced by a Codex sqlalchemy-review pass. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
make_versioned() runs at import of superset.extensions, so ENABLE_VERSIONING_CAPTURE=False skipped only the custom baseline/change-record listeners while Continuum's own mapper/session/engine listeners kept writing shadow and version_transaction rows on every save — contradicting the documented operator contract ('they just stop accumulating new rows'). Verified empirically: a chart save with the flag off grew slices_version and version_transaction.
Detach Continuum's write listeners in the flag-off branch. Deliberately a targeted subset of sqlalchemy_continuum.remove_versioning(): that helper also calls manager.reset(), which clears version_class_map — version_class() would then silently return the live model class and break the read-only /versions/ endpoints the flag promises to keep working. Verified both directions: flag off → no shadow/tx growth, reads intact; flag on → capture unchanged.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The marshmallow schemas in versioning/schemas.py were dead code drifting from the actual responses: the real rows carry version_uuid (absent from the schema), operation_type documented a 'restore' value that is never emitted, and the raw dicts went through Flask's jsonify — rendering issued_at as an RFC-1123 http-date instead of ISO-8601 and leaving version_uuid a UUID instance in the list but a string in the snapshot. Dump list rows and the snapshot's _version block through VersionListItemSchema (ISO datetimes, string UUIDs, single source of shape), add the missing version_uuid field, fix the operation_type description, register the schema in all three APIs' openapi_spec_component_schemas, and $ref it from the endpoint docstrings instead of bare 'type: object'. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
factory.py, api_helpers.py and UPDATING.md referenced superset.versioning.activity and /activity/ endpoints, which live on the follow-up branch (sc-107283) and don't exist in this change. For an apache/superset reviewer these point at nothing. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Three cleanups to the unmerged migration, mirrored in changes/table.py: - Drop the parent shadow tables' audit columns (created_on/changed_on/created_by_fk/changed_by_fk, plus slices' last_saved_at/last_saved_by_fk): every one is in the models' __versioned__ exclude lists, so Continuum's runtime version Table omits them and nothing ever writes them — permanently-NULL drift that autogenerate would flag. Verified against a live app: no runtime shadow table carries any of these. - Widen version_changes.sequence SmallInteger → Integer: per-entity sequence is assigned by unbounded enumerate(); a pathological diff could overflow 32767. - Drop ix_version_changes_transaction_id: the UNIQUE(transaction_id, ...) constraint's backing index already serves transaction_id-prefix lookups. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
298a818 to
df8097a
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
SUMMARY
Adds backend plumbing to capture a version history for every save of a chart, dashboard, or dataset, and to expose that history via three new REST endpoints per entity (list, get, restore). No frontend in this PR. Second PR in the Versioning epic (sc-103156), depending on #39859 (composite-PK reshape on M2M association tables — sc-105349) and orthogonal to #39286 (sc-103157 soft-delete). See SIP-210 / issue #39492 for full design rationale.
🚧🚧🚧 This is still a draft/spike, not ready for final review. Branch contains two
temp(*)commits (demo UI dropdowns + French i18n; URL-param stripping on restore navigation) that will revert before merge. 🚧🚧🚧What changed:
Continuum wiring. Adds
sqlalchemy-continuumas a base dependency. Wired insuperset/extensions/__init__.pywith thevaliditystrategy; a customVersionTransactionFactoryrenames the transaction table toversion_transaction(the wordtransactionis reserved in several dialects) and aVersioningFlaskPluginsupplies the acting user viaget_user_id()(not Flask-Login'scurrent_user) so CLI / Celery / JWT-auth API saves all attribute correctly.Six shadow tables, all Continuum-native:
dashboards_version,slices_version,tables_versiontable_columns_version,sql_metrics_versiondashboard_slices_versionPlus one
version_transactiontable (the per-flush "who/when/where" envelope) and oneversion_changestable (structured diff records, FK toversion_transactionwithON DELETE CASCADE).Three endpoints per entity type (
/chart,/dashboard,/dataset):GET /api/v1/<resource>/<uuid>/versions/— list history (version_number,version_uuid,issued_at,changed_by)GET /api/v1/<resource>/<uuid>/versions/<version_uuid>/— one snapshot (scalar fields; pluscolumns/metricsfor datasets,slicesfor dashboards)POST /api/v1/<resource>/<uuid>/versions/<version_uuid>/restore— restore entity to that version<version_uuid>is a deterministic UUIDv5 (fixed namespace, derived from the entity UUID + Continuum transaction id). Stable across replicas and retention pruning — the same transaction always produces the same version uuid, so API consumers can cache references safely.ETag headers (
ETag: W/"<version_uuid>") on all three GET endpoints + the live entity GET. Foundation for optimistic-locking enforcement on writes (Phase 2); not enforced in this PR.Restore uses Continuum's native
Reverterwrapped in asingle_flush_scopecontext manager (suppresses autoflush inside the block, emits one trailing flush). The single-revert / single-flush shape was the spike outcome — earlier attempts at split-revert and JSON-snapshot tables were abandoned (seespike-continuum-restore.mdand the revised ADR-004 in the spec folder).Baseline capture. First save under versioning of an entity that pre-existed the migration inserts a synthetic
operation_type=0row capturing the pre-edit state, attributed to the entity's existingchanged_on/changed_by_fk. Listener runs before Continuum's ownbefore_flushso the baselinetransaction_idis lower than the edit's (correct ordering).No-op suppression. A
SkipUnmodifiedPluginmarks Continuum Operationsprocessed=Truewhen post-flush column values are content-equal to the previous shadow row — including JSON-aware comparison forDashboard.json_metadatathat strips frontend-stamped audit sub-keys (map_label_colors,chart_configuration, …) so saves that only re-stamp those don't pollute history.Force-parent-dirty on child changes. A
before_flushlistener flags the versioned parent (SqlaTable) as dirty when only its versioned children (TableColumn/SqlMetric) changed, so child-only edits surface in the parent's version dropdown.Structured change records. Every save writes per-field diff records to
version_changeskeyed to the sametransaction_id. Records carrykind/path/from_value/to_value— backbone for the Phase-2 UI's "Added column X" rendering, captured in V1 so the data is available from day one without a backfill.Retention is time-based, run by a Celery beat task.
SUPERSET_VERSION_HISTORY_RETENTION_DAYS(default90;0orNonedisables versioning entirely). Deletes shadow rows older than the cutoff while preserving the live row regardless of age. ON DELETE CASCADE onversion_changes.transaction_idkeeps diffs in sync. No write-path overhead; the prune is asynchronous.Composite PK reshape on M2M associations (sc-105349, PR refactor(db): composite PK on M2M association tables (sc-105349) #39859 — required for Continuum's M2M tracker to populate
dashboard_slices_versioncorrectly). The PRs are intended to merge in order refactor(db): composite PK on M2M association tables (sc-105349) #39859 → this one; the migration is included on this branch because the rebased history depends on it.Authorisation. Version endpoints reuse the resource's existing
can_writepermission. No new FAB permissions. Row-level access enforced viasecurity_manager.raise_for_ownership(entity)in the restore command.What is NOT versioned in v1 (see
specs/sc-103156-entity-versioning/future-work.md):position_jsonis versioned as an opaque blob (restored wholesale on dashboard restore); finer-grained layout versioning is Phase 2.Coordination with #39286 (sc-103157 soft-delete) — orthogonal in design; merge order can go either way. When sc-103157 merges, one small change hooks
deleted_atintofind_active_by_uuid()and the versioned models' Continuumexcludelists. Tracked as T043 in the spec.BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF
N/A — backend-only. No UI is wired to the new endpoints yet. The
temp(*)commits add non-final demo dropdowns and i18n strings for manual testing only; they will revert before merge.TESTING INSTRUCTIONS
Expect an ordered array with
version_number,version_uuid,issued_at,changed_by. Response will include anETag: W/"<version_uuid>"header for the most recent version.Expect 200 + ETag header. A second request with
If-None-Match: "<that-etag>"returns 304.Expect 200.
GET /api/v1/chart/$CHART_UUIDshould now reflect the restored state, and a new version row (the restore itself) appears in the version list.Each change carries
{kind, path, from_value, to_value}.Repeat for dashboards and datasets using
/api/v1/dashboard/and/api/v1/dataset/. Datasets exercise child shadows (columns/metrics); dashboards exercise the M2M shadow (slices).Run the test suite:
pytest tests/integration_tests/charts/version_history_tests.py \ tests/integration_tests/dashboards/version_history_tests.py \ tests/integration_tests/datasets/version_history_tests.py \ tests/integration_tests/versioning/ -vAsserts the three Success Criteria: list < 1 s, restore < 3 s, save p95 overhead < 50 ms.
ADDITIONAL INFORMATION
Migration list (in dependency order):
2bee73611e32dashboard_slices+ 7 other association tables (sc-105349 / #39859)56cd24c07170version_transaction+ parent shadow tables (dashboards_version,slices_version,tables_version)e1f3c5a7b9d0version_changes(structured diff records, ON DELETE CASCADE FK toversion_transaction)f7a2b3c4d5e6table_columns_version,sql_metrics_version) + M2M shadow (dashboard_slices_version)All migrations are additive on the pre-existing
slices/dashboards/tables/ child tables — no existing columns altered. The composite-PK migration (2bee73611e32) reshapes the M2M association tables; round-trip tested on PostgreSQL, MySQL, and SQLite, including the MySQL FK /AUTO_INCREMENTquirks that required raw SQL workarounds (commits56c36fde54,65a3491861).Write cost per save:
version_transaction*_versionparent shadowSkipUnmodifiedPluginfiltered the save; 1 if scalars changedversion_changesNo write-path retention overhead — pruning is asynchronous via Celery beat.
Performance:
The numbers below were captured before the ADR-004 reversal (JSON-snapshot → full Continuum). Architecture has changed since — child writes now go through Continuum shadows instead of
dataset_snapshots/dashboard_snapshotsJSON tables. Re-validation against the final architecture pending before review. Targets unchanged:Harness:
SUPERSET_PERF_VALIDATION=1 pytest tests/integration_tests/versioning/perf_validation_tests.py -v -s.