Skip to content

IxVM: cut Aiur-kernel ingress cost (per-pass width + CheckEnv ingest-once)#432

Merged
arthurpaulino merged 2 commits into
mainfrom
ap/ingress
Jun 3, 2026
Merged

IxVM: cut Aiur-kernel ingress cost (per-pass width + CheckEnv ingest-once)#432
arthurpaulino merged 2 commits into
mainfrom
ap/ingress

Conversation

@arthurpaulino
Copy link
Copy Markdown
Member

@arthurpaulino arthurpaulino commented Jun 2, 2026

Two complementary optimizations to the IxVM Aiur kernel's data-ingress path —
the layer that turns prover-supplied bytes into kernel types before typechecking.
Pure performance: no change to what is accepted/rejected.

All numbers measured with ix check --ixe arena.ixe --claim <CheckEnv-digest>
over the TutorialDefs arena env (694 constants), constrained blake3:

stage total Aiur FFT cost Δ
baseline (main) 7,159,028,984
after commit 1 (width) 6,196,095,030 −13.5%
after commit 2 (ingest-once) 3,319,451,741 −53.6% overall

lake test -- --ignored ixvm: 297 pass, 0 fail (incl. adversarial arena BAD
fixtures still rejected). lake test -- --ignored rbtree-map: pass.


Commit 1 — shrink the per-pass ingress/conversion circuit width

Aiur circuit cost is width × height × log2(height), width = inputSize + selectors + auxiliaries + 4·(1+lookups). This commit cuts width on the hottest
ingress/conversion functions:

  • convert_expr/ctx_convert_expr take e: &Expr (pointer, load inside)
    instead of the Expr flattened by value — an enum arg is laid out at its widest
    constructor, so this cuts inputSize: width 112 → 68 (−39%). Recursive arms
    forward child pointers; the Share arm forwards the interned pointer (drops a
    load). Memoization identity preserved (pointer ↔ interned value).
  • RBTreeMap lookup/lookup_or_default/ins: replace the second
    u32_less_than(k, key) with a field equality key - k == 0 (field trichotomy
    ⇒ equivalent), saving +12 aux / +6 lookups per node. lookup_or_default 85→62w,
    ins 98→75w.
  • rbtree_map_balance: factor the 4 wide Okasaki rebalance cases into a cold
    rbtree_map_balance_fix; the red-parent hot path no longer carries the rebalance
    machinery (Aiur lays width as the max over arms). balance 122 → 32w.
  • lookup_addr_pos: ref-address→kernel-position was an O(N) linear
    address_eq scan; now a ptr_val-keyed map fast path with a content-based
    fallback (lookup_addr_pos_linear) on miss — sound because a ptr_val hit
    implies content equality, and a de-interned/blob miss falls back to the content
    scan.
  • Consolidate two ptr_val maps into one: delete build_addr_set
    addr_pos_map already keys every constant address, so is_blob is a presence
    test against it. Also removes dead helpers (find_block_addr_from_refs,
    find_matching_block_addr, count_block_ctors, recr_rule_count,
    count_recr_rules). Documents the ptr_val-key soundness invariant.

Commit 2 — CheckEnv ingests the env once instead of per leaf

check_env_iter re-ran the full ingress_with_primitives(leaf) pipeline for
every env leaf, re-deriving the layout/positioning/convert/addr-table passes
over each leaf's heavily-overlapping closure. Those passes are keyed by per-leaf
list arguments, so Aiur's content-memoization did not collapse them across leaves
(only the content-keyed load_verified_* / convert_* were deduped). The
addressing machinery effectively ran ~N times over overlapping data.

This factors the shared post-load pipeline into finish_ingress and adds
ingress_env, which loads the union closure of all env leaves in one
load_with_deps (leaves[0] as target, the rest as the initial worklist).
run_check_env now mirrors run_check: ingest once, then check_all_skipping.

Soundness: the checking is unchanged — every constant is still checked exactly
once (check_const runs 711× = env consts + primitives), and the single
check_canonical_block_sort(top) over the union is a stronger global order than
the per-leaf closures it replaces. Refs resolve by content, so the union check
cannot accept anything a per-leaf check would reject. Structurally identical to a
run_check over a closure that happens to be the whole env, so soundness rides on
that existing, tested pattern.


How the two interact

Complementary, no overlap in code (commit 2 reuses the functions commit 1
optimized via finish_ingress). Different axes: commit 1 lowers the cost of one
ingress pass; commit 2 lowers the number of passes (N→1) for CheckEnv. Their
effects overlap on the CheckEnv path (both attack ingress cost), so the deltas are
not strictly additive — ingest-once subsumes most of commit 1's CheckEnv value.
Commit 1 remains the relevant ingress win for the single-claim path
(ix check <name>run_check), where there is no per-leaf multiplication for
ingest-once to remove.

Scope / notes

  • Pure performance + dead-code removal; no change to acceptance/rejection.
  • After these changes the ingress/convert/layout layer is ~1.7% of CheckEnv cost;
    the remaining cost is dominated by the irreducible floor (blake3 content
    verification ~46%) and genuine kernel typecheck work (expr_inst1, k_infer,
    whnf telescopes). Further ingress optimization has diminishing returns.
  • Follow-up worth adding: a dedicated CheckEnv soundness test embedding a
    known-bad constant in the env and asserting rejection (current coverage checks
    good/bad via verify_const and the arena claim end-to-end).

FFT-cost optimizations on the kernel's ingress + Ixon-to-kernel-type
conversion path (the dominant cost when verifying a whole environment of
many constants).

Measured with `ix check --ixe arena.ixe --claim <digest>` over the
TutorialDefs arena env (780 constants): total Aiur FFT cost
7,159,028,984 -> 6,196,095,030 (-13.5%). No behavior change: the
check-env claim typechecks identically and the rbtree-map test passes.

convert_expr: pass the expression by pointer
- convert_expr / ctx_convert_expr take `e: &Expr` and `load` inside,
  instead of receiving the Expr flattened by value on every call. An
  enum-valued argument is laid out at its widest constructor, so this cuts
  inputSize: width 112 -> 68 (~-39% on that circuit). Recursive arms
  forward child pointers directly and the Share arm forwards the interned
  pointer (dropping a load). Memoization identity is preserved, since a
  pointer and its interned value coincide.

RBTreeMap: one comparison per node instead of two
- lookup / lookup_or_default / ins replace the second `u32_less_than(k, key)`
  with a field equality `key - k == 0`. By field trichotomy this is
  equivalent, and it saves +12 aux / +6 lookups per node (u32_less_than is
  the expensive gadget). lookup_or_default 85 -> 62w, ins 98 -> 75w.

RBTreeMap: factor the cold rebalance path out of balance
- rbtree_map_balance becomes a narrow color dispatch; the four wide
  Okasaki rebalance cases (red-grandchild deref + 3-node rebuild) move to
  rbtree_map_balance_fix. Aiur lays a function's width as the max over its
  match arms, so the red-parent hot path no longer carries the rebalance
  machinery: balance 122 -> 32w.

lookup_addr_pos: O(log) fast path with a sound content fallback
- The ref-address -> kernel-position lookup was an O(N) linear address_eq
  scan (the largest single ingress hotspot). It now first probes an
  interned ptr_val-keyed map (build_addr_pos_map, value = pos+1 so the
  default 0 means absent) and only falls back to the content-based
  address_eq scan (lookup_addr_pos_linear) on a miss. This is sound: a
  ptr_val hit implies content equality (the positive direction), and a
  de-interned pointer or a non-constant blob ref simply misses and takes
  the content scan, which returns 0 if truly absent. Honest constant refs
  hit the tree, so the linear scan is reduced to blob/fallback rows.

Consolidate the two ptr_val membership/position maps into one
- Delete build_addr_set: addr_pos_map already keys every constant address,
  so is_blob becomes a presence test against it (present iff constant; the
  stored position is irrelevant there). This removes a whole map build from
  ingress. The de-intern direction stays conservative -- a de-interned
  constant reads as a blob, which is content-bound and harmless. Also
  removes now-dead helpers find_block_addr_from_refs,
  find_matching_block_addr, count_block_ctors, recr_rule_count and
  count_recr_rules (only self-recursive; the recursor block is resolved via
  rec_typ_to_inductive_addr).

Document the ptr_val-key soundness invariant on build_addr_pos_map: a
pointer maps to exactly one stored value, so distinct contents can never be
conflated; a de-intern only ever makes a present key read as absent. Hence
ptr_val keys are safe only where a false "absent" is conservative -- which
is why lookup_addr_pos keeps its content-based fallback.
`check_env_iter` re-ran the full `ingress_with_primitives(leaf)` pipeline
for every env leaf, re-deriving the layout/positioning/convert/addr-table
passes over each leaf's (heavily overlapping) closure. Those passes are
keyed by per-leaf list arguments, so Aiur's content-memoization did not
collapse them across leaves — only the content-keyed `load_verified_*` /
`convert_*` were deduped. So the addressing machinery ran ~N times over
overlapping data.

Ingress the union closure of all env leaves ONCE (one `load_with_deps`
with leaves[0] as target and the rest as the initial worklist), then check
every constant via the same `check_all_skipping` path `run_check` already
uses. Factor the shared post-load pipeline into `finish_ingress` and add
`ingress_env`; `run_check_env` now mirrors `run_check` over a closure that
happens to be the whole env.

Soundness: unchanged checking. Every constant is still checked once
(`check_const` runs 711×, = env consts + primitives), and the single
`check_canonical_block_sort(top)` over the union is a stronger global order
than the per-leaf closures it replaces — refs resolve by content, so the
union check cannot accept what a per-leaf check would reject. The adversarial
arena BAD fixtures still reject.

Measured with `ix check --ixe arena.ixe --claim <CheckEnv-digest>` over the
TutorialDefs arena (694 consts): total Aiur FFT cost
6,196,095,030 -> 3,319,451,741 (-46.4%). `lake test -- --ignored ixvm`:
297 pass, 0 fail.
@arthurpaulino arthurpaulino changed the title IxVM: cut ingress/conversion circuit width in the Aiur kernel IxVM: cut Aiur-kernel ingress cost (per-pass width + CheckEnv ingest-once) Jun 2, 2026
@arthurpaulino arthurpaulino merged commit 3fe83fe into main Jun 3, 2026
14 checks passed
@arthurpaulino arthurpaulino deleted the ap/ingress branch June 3, 2026 16:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants