Commit 280e76f

fix(xsd): make Phase 3d content-model ingest idempotent
Compositors / child_edges / group_edges have no natural unique key (a complexType can hold sibling sequences/choices), so the prior pass unconditionally inserted on every run, doubling rows on the second ingest. CT_Tbl content lookups against a re-ingested DB returned 0 rows because the order_index ranges no longer matched what queries expected.

Switching to delete-and-rewrite per profile at the start of pass 3:

    DELETE FROM xsd_compositors WHERE profile_id = ?
    DELETE FROM xsd_group_edges WHERE profile_id = ?

xsd_child_edges cleans up automatically via FK CASCADE on compositor_id. Inheritance / symbols / memberships stay upsert-only since they have natural keys.

Idempotency test now also asserts compositor / child-edge / group-ref counts in the DB match the first-run insert counts after a second run.

Verified: two consecutive `bun run xsd:ingest` against the WML closure both produce 585 compositors / 2098 child edges / 161 group refs and the DB ends at exactly those counts.
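
The row-doubling described above, and why delete-and-rewrite avoids it, can be reproduced without a database. The following is a minimal in-memory sketch, not the project's code: `ContentModelStore`, `ingest`, and the `CT_Example` type name are hypothetical, and the cascade is simulated by hand rather than by a real foreign key.

```typescript
// Hypothetical in-memory stand-ins for xsd_compositors and xsd_child_edges.
type Compositor = { id: number; profileId: number; ownerType: string; kind: "sequence" | "choice" };
type ChildEdge = { compositorId: number; orderIndex: number; child: string };

class ContentModelStore {
  private nextId = 1;
  compositors: Compositor[] = [];
  childEdges: ChildEdge[] = [];

  // Simulates DELETE ... WHERE profile_id = ? together with the
  // ON DELETE CASCADE from child edges to their compositor.
  deleteProfile(profileId: number): void {
    const doomed = new Set(
      this.compositors.filter((c) => c.profileId === profileId).map((c) => c.id),
    );
    this.compositors = this.compositors.filter((c) => !doomed.has(c.id));
    this.childEdges = this.childEdges.filter((e) => !doomed.has(e.compositorId));
  }

  insertCompositor(profileId: number, ownerType: string, kind: Compositor["kind"], children: string[]): void {
    const id = this.nextId++;
    this.compositors.push({ id, profileId, ownerType, kind });
    children.forEach((child, orderIndex) => {
      this.childEdges.push({ compositorId: id, orderIndex, child });
    });
  }
}

function ingest(store: ContentModelStore, profileId: number, deleteFirst: boolean): void {
  if (deleteFirst) store.deleteProfile(profileId);
  // Two sibling sequences under one complexType: (profile, owner, kind) is
  // identical for both rows, so there is nothing to upsert against.
  store.insertCompositor(profileId, "CT_Example", "sequence", ["a", "b"]);
  store.insertCompositor(profileId, "CT_Example", "sequence", ["c"]);
}

// Prior behaviour: insert unconditionally on every run.
const appendOnly = new ContentModelStore();
ingest(appendOnly, 1, false);
ingest(appendOnly, 1, false);

// New behaviour: clear the profile's rows, then rewrite them.
const rewrite = new ContentModelStore();
ingest(rewrite, 1, true);
ingest(rewrite, 1, true);

console.log(appendOnly.compositors.length, appendOnly.childEdges.length); // 4 6
console.log(rewrite.compositors.length, rewrite.childEdges.length); // 2 3
```

In the append-only store the second run duplicates every compositor and child edge; in the rewrite store the counts after two runs equal the counts after one.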
1 parent 6cb04ac commit 280e76f

2 files changed

Lines changed: 21 additions & 1 deletion

scripts/ingest-xsd/ingest.ts

Lines changed: 10 additions & 0 deletions
@@ -195,6 +195,16 @@ export async function ingestSchemaSet(opts: IngestSchemaSetOptions): Promise<Ing
   // emit xsd_compositors / xsd_child_edges / xsd_group_edges. Local element
   // declarations are deduped under (owner-vocab, name, element); cross-CT
   // reuse of a local name collapses to one symbol.
+  //
+  // Idempotency strategy: content-model rows have no natural unique key
+  // (a single complexType can hold multiple sibling compositors of the same
+  // kind), so we delete-and-rewrite per profile. xsd_child_edges FK on
+  // xsd_compositors with ON DELETE CASCADE handles child_edges cleanup.
+  // Assumes one source per profile, which holds today; revisit when
+  // multiple sources contribute to the same profile.
+  await sql`DELETE FROM xsd_compositors WHERE profile_id = ${profileId}`;
+  await sql`DELETE FROM xsd_group_edges WHERE profile_id = ${profileId}`;
+
   for (const decls of parseResult.declarationsByQName.values()) {
     for (const decl of decls) {
       if (decl.kind !== "complexType" && decl.kind !== "group") continue;
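
The "one source per profile" caveat in the new comment is worth making concrete. The sketch below is an illustration under assumptions, not the ingest code: the `Row` shape, the `source` field, and the file names are all hypothetical. It shows that a delete scoped only to `profile_id` would discard rows written by an earlier source once a second source feeds the same profile.

```typescript
// Hypothetical row shape; `source` is not a column named in this commit.
type Row = { profileId: number; source: string; name: string };

function ingestSource(table: Row[], profileId: number, source: string, names: string[]): Row[] {
  // Delete-and-rewrite scoped to the profile only, mirroring the
  // DELETE ... WHERE profile_id = ? statements in the diff.
  const kept = table.filter((r) => r.profileId !== profileId);
  return [...kept, ...names.map((name) => ({ profileId, source, name }))];
}

let table: Row[] = [];
table = ingestSource(table, 1, "sourceA.xsd", ["seq1", "seq2"]);
table = ingestSource(table, 1, "sourceB.xsd", ["seq3"]);

// Only the last source's rows survive for profile 1.
const survivingSources = Array.from(new Set(table.map((r) => r.source)));
console.log(survivingSources); // [ "sourceB.xsd" ]
```

If multiple sources per profile ever become real, one option would be to narrow the delete to a (profile, source) pair; that is a design suggestion, not something this commit does.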

tests/ingest-xsd/ingest.test.ts

Lines changed: 11 additions & 1 deletion
@@ -175,15 +175,25 @@ test("ingest is idempotent: re-running adds no new symbols/edges", async () => {
   expect(second.symbolsExisting).toBeGreaterThan(0);
   expect(second.profileMembershipsInserted).toBe(0);
   expect(second.inheritanceEdgesInserted).toBe(0);
+  // Content-model passes use delete-and-rewrite, so insert counts equal
+  // the first run on every re-run; DB row counts stay stable.
+  expect(second.compositorsInserted).toBe(first.compositorsInserted);
+  expect(second.childEdgesInserted).toBe(first.childEdgesInserted);
+  expect(second.groupRefsInserted).toBe(first.groupRefsInserted);
 
   // Row counts unchanged between first and second runs.
   const [c1] = await db.sql`SELECT COUNT(*)::int AS c FROM xsd_symbols`;
   const [c2] = await db.sql`SELECT COUNT(*)::int AS c FROM xsd_symbol_profiles`;
   const [c3] = await db.sql`SELECT COUNT(*)::int AS c FROM xsd_inheritance_edges`;
+  const [c4] = await db.sql`SELECT COUNT(*)::int AS c FROM xsd_compositors`;
+  const [c5] = await db.sql`SELECT COUNT(*)::int AS c FROM xsd_child_edges`;
+  const [c6] = await db.sql`SELECT COUNT(*)::int AS c FROM xsd_group_edges`;
   expect(c1.c).toBe(first.symbolsInserted);
-  // One membership per symbol per profile.
   expect(c2.c).toBe(first.profileMembershipsInserted);
   expect(c3.c).toBe(first.inheritanceEdgesInserted);
+  expect(c4.c).toBe(first.compositorsInserted);
+  expect(c5.c).toBe(first.childEdgesInserted);
+  expect(c6.c).toBe(first.groupRefsInserted);
 });
 
 test("ingest writes compositors and child edges for nested content models", async () => {
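
The test asserts two different things about a re-run: upsert-backed tables report zero new inserts, while delete-and-rewrite tables report the same insert count as the first run, and in both cases the stored row count stays put. A minimal sketch of that distinction follows, assuming a hypothetical `FakeDb` class in place of the real database; only the `symbolsInserted` and `compositorsInserted` field names are taken from the test.

```typescript
type RunStats = { symbolsInserted: number; compositorsInserted: number };

class FakeDb {
  symbols = new Set<string>(); // natural key: the symbol name
  compositors: string[] = []; // no natural key

  ingest(symbols: string[], compositors: string[]): RunStats {
    // Upsert path: count only rows that did not already exist.
    let symbolsInserted = 0;
    for (const s of symbols) {
      if (!this.symbols.has(s)) {
        this.symbols.add(s);
        symbolsInserted++;
      }
    }
    // Delete-and-rewrite path: every run reinserts the full set.
    this.compositors = [...compositors];
    return { symbolsInserted, compositorsInserted: compositors.length };
  }
}

const db = new FakeDb();
const first = db.ingest(["CT_A", "CT_B"], ["seq", "seq", "choice"]);
const second = db.ingest(["CT_A", "CT_B"], ["seq", "seq", "choice"]);

console.log(second.symbolsInserted); // 0
console.log(second.compositorsInserted === first.compositorsInserted); // true
console.log(db.symbols.size, db.compositors.length); // 2 3
```

Checking the stored counts as well as the reported counters matters here: with delete-and-rewrite, equal insert counters alone would not distinguish a clean rewrite from a second append.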
