Skip to content

ref: drop semantic mutate op phase3#269

Draft
hussainsultan wants to merge 11 commits into
mainfrom
hussain/feat/drop-semantic-mutate-op-phase3
Draft

ref: drop semantic mutate op phase3#269
hussainsultan wants to merge 11 commits into
mainfrom
hussain/feat/drop-semantic-mutate-op-phase3

Conversation

@hussainsultan

Copy link
Copy Markdown
Collaborator

No description provided.

hussainsultan and others added 11 commits June 9, 2026 21:35
… removal (ADR 0001 Phase 3)

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…sure path (ADR 0001 Phase 3)

.mutate() survives as a thin desugaring alias with no operator node:
- SemanticAggregate.mutate folds derivations into the aggregate's aggs;
  the calc analyzer classifies them (measure refs/windows/totals -> calc,
  group-key expressions -> dimension-grain post-projection).
- Pre-aggregation mutate registers derivations as dimensions: lazily on
  join-backed models (preserving the pre-agg fan-out machinery), eagerly
  materialized on flat models.
- Post-aggregation chains (after filter/order_by/limit) resolve in chain
  order against the result table and wrap in a SemanticModel.

Planner cleanup: collect_mutates_to_join, the mutated_gb_keys heuristic,
and has_prior_aggregate's mutate arm are gone; unprefixed derived
dimensions now materialize per-table in the pre-agg grain computation.

_compile_aggregation gains: group-by keys on the virtual aggregated
table (windows can order by keys), sibling-aware classification scope
(calc-of-calc within one aggregate), and dimension-grain spec routing.

Serialization: SemanticAggregateOp reconstruction rebuilds query-local
lambdas from resolver structs; old SemanticMutateOp tags still
deserialize through the alias (backward compat).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
- reconstruct: bare agg names replay only on exact or UNIQUE suffix
  match (mirrors _resolve_short_name); ambiguous suffixes fall through
  to struct deserialization
- ops: hoist calc_analyzer helper imports to module level
- expr: drop functools.reduce import orphaned by SemanticMutate removal
- chart/utils: docstring reflects that mutated_columns is a
  compat-only, always-empty slot

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
22 dead imports swept (ruff F401): 11 in ops.py, 3 in calc_compiler.py,
2 in serialization/__init__.py (Callable, Success — neither re-exported
nor in __all__), plus singles in reconstruct.py, server/api.py,
agents/backends/mcp.py, chart/echarts_adapter.py, and a chart test.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Replace manual try/except + Success/Failure wrapping in to_node_safe,
extract_column_names, and extract_columns_from_callable with the
returns @safe decorator. Drop the now-unused logging in projection_utils.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
xorq 0.3.25 added git-annex as a transitive dependency. It only
publishes a macosx_14_0_arm64 wheel for arm64, but nixpkgs
aarch64-darwin targets darwinMinVersion 11.3, so uv2nix's selectWheels
rejects it ("No compatible wheel, nor sdist found") and the editable
devShell fails to evaluate.

Referencing prev.git-annex throws eagerly (the format assert fires on
the empty wheel set), so define a fresh derivation built from the
prebuilt wheel, bypassing the platform check. No darwinMinVersionHook
is needed — the install phase only unzips the wheel.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
A .mutate() chained after order_by/limit/filter on an aggregate
desugars through the base SemanticTable.mutate, which materializes the
result and previously rewrapped it via _build_semantic_model_from_roots.
That reattached the base model's dimensions (e.g. `origin`) whose exprs
reference source columns absent from the aggregated table, so any forced
field resolution — notably chart introspection going through ibis
__getattr__ → schema — raised AttributeError.

Build a flat model from the materialized columns instead: group-by keys
become identity dimensions (preserving time metadata), other columns
identity measures. Teach extract_aggregate_metadata to read dims/measures
off such a terminal flat model (no aggregate node survives), and hoist
.chart() onto the base SemanticTable so the result supports it (the
dedicated SemanticMutate operator that used to carry it was dropped in
Phase 3). Dedupes three identical chart() overrides.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The priority-measures refactor taught apply_calc_measures to accept
CalcMeasure specs and pull the lambda out of .expr, but
classify_calc_lambdas still handed the raw spec to
evaluate_calc_lambda. A spec is neither a Deferred nor callable, so it
fell through unevaluated, analyze_calc_expr failed, and the calc was
misclassified as not referencing t.all(...).

In the totals path that misclassification kept an AllOf calc (e.g.
percent_of_total) in the non_allof set, so the recursive
apply_calc_measures — called without agg_specs — re-detected the AllOf
reference and raised TotalsNotAvailableError. The bug was order-
sensitive (it surfaced in isolation but the full local suite masked it),
which is why CI caught it.

Extract .expr in classify_calc_lambdas, mirroring apply_calc_measures.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Two related failures in the mutate-desugar path, surfaced by the
sessionized.md and bucketing.md doc queries:

1. A calc measure whose window references both a prior derived measure
   and a group-key/base column (e.g. session_id =
   is_session_start.sum().over(order_by=minute_offset)) evaluates to a
   bare Deferred: the virtual-aggregate scope resolves the derived
   measure onto the virtual table and the key onto the base table, so
   ibis can't bind the mixed-relation window. lift_inline_reductions
   then crashed walking the Deferred. The apply loop now first tries to
   evaluate such a calc against the materialized result table (where all
   referenced names already exist as columns), matching the historical
   post-aggregate .mutate() semantics, and only falls through to the
   re-lift path when that fails.

2. On a post-aggregate aggregation, a folded .mutate() derivation that
   references sibling aggregate outputs (e.g.
   session_end_min - session_start_min, or revenue / customers) probed
   against base_tbl — where those columns don't yet exist — and was
   wrongly routed to Table.aggregate. Route such specs to a
   post-aggregate mutate over the built result instead.

Both fixes restore behavior the dropped SemanticMutate operator used to
provide by materializing post-aggregate derivations. Docs build is green
again; full suite unaffected.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant