feat(liturgy): Metta Sutta depth pass — prosodic splits + per-morpheme depth + mn10 register #77
Merged
…ic split
User feedback (Metta Sutta verse 1): when hovering a Pāli word the arrows fanned out to every aligned English token regardless of which morpheme the cursor was on. The per-morpheme tooltip felt decoupled from the arrow you were trying to read.
Two changes:
1. Renderer (TripleScriptWitness): when the hovered element carries data-morpheme-idx (the inner HoverSpan of a morpheme-split word), narrow the arrow filter to lines whose Line.morphemeIdx matches. The alignment-line computer already anchors arrow endpoints to morpheme positions when authored; this just exposes that scoping to the hover handler.
2. Metta Sutta verse 1: split the existing single bundled segment (Karaṇīyamatthakusalena ... anatimānī) into four prosodic phrases (v1a-v1d) matching the natural rhythm. Each phrase carries its own three-witness set (Amaravati / Sujato / Thanissaro) with word-by-word alignTo and full word data — pronunciation, etymology, gloss, morphemes (root verbs √kṛ, √śam, √vac, √man, √i, √as surfaced), and DPD citations. Compound karaṇīyam-attha-kusalena is hyphen-tokenised so each component hovers separately.
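The hover scoping in change 1 can be sketched as a filter over the alignment lines. This is a minimal illustration, not the renderer's actual code: only `Line.morphemeIdx` and `data-morpheme-idx` are named in the commit, so the other fields (`wordIdx`, `englishIdx`) and the function name `visibleLines` are assumptions.

```typescript
// Hypothetical shape of one alignment arrow. Only morphemeIdx is named in the
// commit message; wordIdx and englishIdx are assumed for illustration.
interface Line {
  wordIdx: number;      // Pāli word the arrow starts from
  englishIdx: number;   // English token the arrow points to
  morphemeIdx?: number; // set when the arrow is anchored to one morpheme
}

// Return the arrows to draw for the current hover target.
// hoveredMorphemeIdx mirrors the data-morpheme-idx attribute; undefined means
// the cursor is on a whole word rather than on an inner morpheme span.
function visibleLines(
  lines: Line[],
  hoveredWordIdx: number,
  hoveredMorphemeIdx?: number,
): Line[] {
  const forWord = lines.filter((l) => l.wordIdx === hoveredWordIdx);
  if (hoveredMorphemeIdx === undefined) return forWord;
  return forWord.filter((l) => l.morphemeIdx === hoveredMorphemeIdx);
}
```

Hovering the whole word still shows every arrow for that word; only a morpheme-level hover narrows the set.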
…kenization
Two Playwright-verified bugs in verse 1:
1. kusalena's morphemes (kusala + ena = 9 chars) didn't reconstruct the 8-char surface, so the splitter returned null and the word fell back to whole-word hover. Switch to kusal + ena (the sandhi-shortened stem) so the split round-trips and each piece becomes hoverable.
2. The default Latin tokenizer splits on the apostrophe in c'assa, yielding 'c' and 'assa' as two pali tokens. My alignTo treated c'assa as one token at idx 1, so 'mudu' and 'anatimānī' references pointed at the wrong pali surface. Add a tokens hint on the v1d pi-Latn script variant to keep c'assa as one hover unit, matching the alignTo indexing.
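The round-trip rule behind bug 1 can be illustrated with a small splitter. The commits name `splitByMorphemes` and say it returns null when the pieces do not reconstruct the surface; the signature and the `MorphemeSpan` return shape below are assumptions made for this sketch.

```typescript
// Assumed return shape: each morpheme with its character offsets in the surface.
interface MorphemeSpan {
  text: string;
  start: number;
  end: number;
}

// Returns one span per morpheme, or null when the morphemes do not concatenate
// back to the surface form (the caller then falls back to whole-word hover).
function splitByMorphemes(surface: string, morphemes: string[]): MorphemeSpan[] | null {
  if (morphemes.join("") !== surface) return null;
  const spans: MorphemeSpan[] = [];
  let offset = 0;
  for (const text of morphemes) {
    spans.push({ text, start: offset, end: offset + text.length });
    offset += text.length;
  }
  return spans;
}

// kusala + ena is 9 characters against an 8-character surface, so it fails.
console.log(splitByMorphemes("kusalena", ["kusala", "ena"]));
// kusal + ena round-trips, so each piece gets its own span.
console.log(splitByMorphemes("kusalena", ["kusal", "ena"]));
```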
Two pieces, both addressing user feedback to learn from mn10 + the CURATION_PROTOCOL.md Plain-Register Check:
1. Renderer: EnglishLine now accepts alignTo and dims any English token whose alignTo entry is -1 to opacity 0.55. These are 'glue words' — English scaffolding that has no Pāli counterpart (mn10 calls them 'ghost words' and renders them at 0.3; we settle higher because liturgy glue is more often unavoidable syntax than fully supplied content). The eye now lands on content words that actually map back.
2. Metta verse 1: rewrite morpheme + word glosses in plain prose register. CURATION_PROTOCOL.md §3.4 calls out 'gerundive ending', 'instrumental case ending', 'past participle', 'accusative singular neuter' as diagnostic failure-tone words. Replaced with the mn10 model — concrete teaching using everyday analogies (e.g. -yam tail = 'like English -able in doable, but with a must flavour'; -aṁ tail = 'pronounced like um in hum'; -ena tail = 'English wedges in by to do the same job; Pāli changes the word's tail').
Also: attha gloss expanded to the full sense range the user supplied (benefit, welfare, good, purpose, aim, meaning) and the v1a Amaravati alignTo fixed so 'in' → -1 (glue) instead of 1 (attha) — removes the duplicate arrow you flagged from attha to 'in' AND 'goodness'.
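The glue-word dimming in piece 1 amounts to a per-token opacity lookup. A minimal sketch, assuming `alignTo` is a flat array with one Pāli word index per English token; the constant and function names are illustrative, and the sample array is invented rather than taken from the Metta data.

```typescript
// Opacity for English tokens with no Pāli counterpart (value from the commit).
const GLUE_OPACITY = 0.55;

// Map each English token's alignTo entry to the opacity it should render at.
// Assumption: -1 is the only sentinel; every other value is a real word index.
function tokenOpacities(alignTo: number[]): number[] {
  return alignTo.map((target) => (target === -1 ? GLUE_OPACITY : 1));
}

// Invented example: the second token is glue, the others map to Pāli words.
console.log(tokenOpacities([2, -1, 0]));
```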
…attern Adjacent morpheme HoverSpans inside a Pāli word rendered their dotted underlines butting against each other, so the eye saw one continuous underline beneath the whole word. mn10 puts a 2px horizontal padding on each segment so the underlines visibly break — kar · aṇī · yam reads as three connected pieces. Borrowed verbatim.
Same model as verse 1: split into v2a-d at the natural caesurae, alignTo per witness, full word data (pronunciation, etymology, gloss) with morphemes broken out for every compound. New roots surfaced: √tuṣ (content), √bhṛ (bear/support), √vṛt (proceed), √gṛdh (greedy). Glosses written in the mn10 plain-prose register — no 'instrumental case ending' or 'past participle'. Where a grammar concept needs explanation (the -esu plural locative ending in kulesu), the technical machinery gets glossed in the same breath: 'the -esu tail marks in/among with a plural — like English in those families.'
v3a-d: 'Let them not do the slightest thing the wise would reprove ... may all beings be happy at heart.' New roots surfaced: √car (act, conduct), √jñā (know — viññū), √vad (speak — upavadeyyuṁ), √bhū (be — bhavantu/hontu), √as (be — sattā). The wishing voice *hontu* / *bhavantu* introduced — the heart of the metta sutta starts in this verse.
v4a-d sweeps every creature: 'Whatever living beings there are — none excepted, weak or strong; long or large, medium, short, fine, thick.' New compound *pāṇabhūtatthī* broken out (breath + existing + being-one). *rassakāṇukathūlā* split into rassakā + ṇukā + thūlā so the three sizes are individually hoverable.
… + refrain
v5a-d: 'seen or unseen, near or far, born or seeking-to-come-into-being; may all beings be happy at heart'. New roots: √dṛś (see — diṭṭhā/addiṭṭhā), √vas (dwell — vasanti), √iṣ (seek — sambhavesī). v5d repeats the *sabbe sattā bhavantu sukhitattā* refrain (same shape as v3d) so readers learn the rhythm.
…uffering v6a-d: the moral-conduct stanza. New roots: √kub (cheat — nikubbetha), √man (think — nātimaññetha, echoing anatimānī from v1), √ruṣ (anger — byārosanā), √han (strike — paṭighasaññā via paṭi-gha strike-back = aversion), √iṣ (wish — dukkhamiccheyya). The Pāli technical pair *byārosanā* (hot anger) and *paṭighasaññā* (cold aversion) glossed with their Buddhist-psychology context.
…f boundless heart v7a-d: the famous *yathā… evampi…* simile. 'As a mother with her own life protects her one-and-only child, even so let one cultivate a boundless heart toward all beings.' New roots: √rakṣ (protect — anurakkhe), √mā (measure — aparimāṇaṁ), √bhū causative (bhāveti, the technical term for meditative *cultivation*). The four-piece compound *eka-putta-(m)anu-rakkhe* broken out so each piece is hoverable.
…ections v8a-d: 'And loving-kindness for the whole world; cultivate the boundless heart above, below, and across; uncrowded, without grudge, without enemy.' The directional sweep *uddhaṁ adho ca tiriyañca* and the three negations *asambādhaṁ averaṁ asapattaṁ* each broken out. v8b repeats the *mānasaṁ bhāvaye aparimāṇaṁ* refrain from v7d.
… abiding v9a-d: 'Standing, walking, sitting, or lying down — as long as one is alert — let one resolve on this mindfulness; this is what they call the divine abiding, here.' New roots: √sthā (stand — tiṭṭhaṁ, also adhiṭṭheyya the 'standing-upon, resolving'), √sad (sit — nisinno), √śī (lie — sayāno), √smṛ (remember — satiṁ, mindfulness), √hṛ (carry, dwell — vihāraṁ). The four bodily postures of monastic life catalogued so the heart-cultivation has no resting place.
…, liberation v10a-d: 'Not falling into views, virtuous and perfected in vision, having dispelled greed for sense-pleasures, one never again lies in a womb — so it is said.' Pairs *diṭṭhi* (held view, from √dṛś) with *dassana* (direct seeing, same root) — same word turned from object to instrument. Closing *iti* glossed as 'the Pāli closing quotation-marks'. The whole metta sutta now at heart-sutra depth.
…pairing
The alignment-arrow renderer assigned English tokens to morphemes purely by position (i-th English word mapped to a word → i-th morpheme of that word). When English reorders a word's morphemes the arrows cross: kusalena = kusal (skilled) + ena (by-an-agent), but Amaravati renders it 'one ... skilled' — so the heuristic sent kusal's arrow to 'one' and ena's to 'skilled', both wrong.
Add an optional Witness.morphemeAlignTo array, parallel to alignTo. morphemeAlignTo[i] names the morpheme index that English token i should anchor to; null/absent falls back to the positional heuristic. computeAlignmentLines honours it when present.
Authored for Metta verse 1a across all three witnesses:
- karaṇīyam: 'done' → √kṛ root (morpheme 0); 'is/what/should/be' → -yam
- kusalena: 'skilled' → kusal stem (0); 'one' → -ena ending (1)
Playwright-verified: hovering kusal now arrows to 'skilled', ena to 'one'. Other verses still use the heuristic — extend morphemeAlignTo where a crossing is spotted.
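The fallback rule can be sketched as follows. `Witness.alignTo` and `Witness.morphemeAlignTo` are named in the commit; the `tokens` field, the `morphemeCounts` parameter, the function name, and the clamping of surplus tokens to the last morpheme are assumptions of this sketch, not the behaviour of computeAlignmentLines.

```typescript
interface Witness {
  tokens: string[];                       // English tokens (assumed field name)
  alignTo: number[];                      // Pāli word index per token, -1 = glue
  morphemeAlignTo?: Array<number | null>; // optional morpheme index per token
}

// For each English token, pick the morpheme its arrow anchors to.
// morphemeCounts[w] is how many morphemes Pāli word w has (0 = unsplit word).
function morphemeAnchors(w: Witness, morphemeCounts: number[]): Array<number | null> {
  const seen: number[] = morphemeCounts.map(() => 0);
  return w.alignTo.map((wordIdx, i) => {
    if (wordIdx < 0) return null; // glue word: no arrow at all
    const count = morphemeCounts[wordIdx] || 0;
    if (count === 0) return null; // word has no morpheme split
    // Positional heuristic: k-th token mapped to this word gets morpheme k.
    const k = seen[wordIdx] || 0;
    seen[wordIdx] = k + 1;
    const positional = Math.min(k, count - 1); // clamping is an assumption
    const authored = w.morphemeAlignTo ? w.morphemeAlignTo[i] : null;
    return authored !== null && authored !== undefined ? authored : positional;
  });
}
```

With a simplified two-token witness for kusalena (`alignTo: [0, 0]` for 'one', 'skilled'), the heuristic alone yields `[0, 1]`, which is the crossing described above; adding `morphemeAlignTo: [1, 0]` yields `[1, 0]`.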
Followed up the v1a fix with a full line-by-line pass: every witness where two-plus English tokens mapped to a morpheme-bearing Pāli word now carries an explicit morphemeAlignTo, so the arrows anchor on the semantically correct morpheme instead of the positional guess. 41 witnesses annotated.
The genuine reversals fixed include:
- gabbhaseyya: 'lie' → seyya (lying), 'womb' → gabbha
- brahmametaṁ: 'sublime/divine' → brahma, 'this' → metaṁ
- dukkhamiccheyya: 'harm/suffer' → dukkham, 'wish' → iccheyya
- sambhavesī: 'seeking' → esī, 'birth' → bhav
- sallahukavutti: 'living' → vutti, 'lightly' → sallahuka
- idhamāhu: 'they call' → māhu, 'here' → idha
- evampi: 'even' → pi, 'so' → evam
- ekaputtamanurakkhe: 'protect' → rakkhe, 'child' → putta, 'only' → eka
Plus many minor cases where a grammatical English word (of, for, in, with, and) was landing on a meaning-stem morpheme — now redirected to the ending morpheme or the stem as the sense requires.
Validated: all 41 morphemeAlignTo arrays length-match their alignTo and every index is in range. Playwright-confirmed on v6d (dukkham now arrows to 'harm', not 'wish'). 1095 liturgy tests pass.
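The validation described above (lengths match, every index in range) could look like the check below. It is a sketch under assumptions: the function name and the `morphemeCounts` input are hypothetical, and the commit does not say how the real validation was run.

```typescript
// Returns a list of human-readable problems; an empty list means the
// morphemeAlignTo array is consistent with alignTo.
// morphemeCounts[w] is the number of morphemes of Pāli word w (assumed input).
function morphemeAlignToErrors(
  alignTo: number[],
  morphemeAlignTo: Array<number | null>,
  morphemeCounts: number[],
): string[] {
  const errors: string[] = [];
  if (morphemeAlignTo.length !== alignTo.length) {
    errors.push(`length ${morphemeAlignTo.length} != alignTo length ${alignTo.length}`);
    return errors; // index checks are meaningless once lengths disagree
  }
  morphemeAlignTo.forEach((m, i) => {
    if (m === null) return; // null defers to the positional heuristic
    const wordIdx = alignTo[i] ?? -1;
    const count = wordIdx >= 0 ? morphemeCounts[wordIdx] || 0 : 0;
    if (Math.floor(m) !== m || m < 0 || m >= count) {
      errors.push(`token ${i}: morpheme index ${m} out of range (word has ${count})`);
    }
  });
  return errors;
}
```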
The relational-arrow SVG was painting on top of the transliteration line, so the romanization was crossed by a green curve and hard to read. Establish an explicit three-layer z-order on the segment:
  SVG alignment edges    z-0
  transliteration line   z-[5] (+ bg-slate-950 to mask the edge)
  chant words            z-10
The transliteration now sits above the edge (edge tucks behind it, re-emerging below) but still below the chant words — so word tooltips, which live in the word's z-10 stacking context, continue to paint above the transliteration. Same bg-mask trick the word spans already use to hide edge strokes behind themselves.
…them
Four words had morpheme texts that didn't concatenate back to the surface form, so splitByMorphemes returned null and the word fell back to a single whole-word hover instead of per-morpheme spans:
- nātimaññetha: na+ati → nā+ti (the two short a's merge to long ā)
- nāññamaññassa: na+aññamañña → nā+ññamañña (same a+a → ā merge)
- rassakāṇukathūlā: ṇukā → ṇuka (final vowel shortens before thūlā)
- anupaggamma: gamma → ggamma (g doubles at the prefix-join)
Each morpheme gloss now explains the sandhi in plain register so the reader understands why the piece is spelled the way it is. All four words now render as separately-hoverable morphemes — verified nātimaññetha splits into nā · ti · maññetha.
The gear-icon settings popover had no dismiss path except toggling the gear again — clicking the chant body left it stuck open. Add a mousedown listener (closes when the click lands outside the popover's wrapper) plus Escape-to-close, both wired only while the popover is open and torn down when it closes.
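The dismiss wiring can be sketched framework-free. Everything named here (`attachDismiss`, `ListenerTarget`, `DismissEvent`) is hypothetical; the real component presumably attaches to `document` from a UI-framework effect, which the commit does not show. The returned function is the teardown to call when the popover closes.

```typescript
// Minimal event shapes so the sketch runs without a browser (assumption).
interface DismissEvent {
  type: string;
  target?: unknown;
  key?: string;
}
interface ListenerTarget {
  addEventListener(type: string, fn: (e: DismissEvent) => void): void;
  removeEventListener(type: string, fn: (e: DismissEvent) => void): void;
}

// Call when the popover opens; call the returned function when it closes.
function attachDismiss(
  doc: ListenerTarget,
  isInsidePopover: (node: unknown) => boolean,
  close: () => void,
): () => void {
  const onMouseDown = (e: DismissEvent) => {
    if (!isInsidePopover(e.target)) close(); // click landed outside the wrapper
  };
  const onKeyDown = (e: DismissEvent) => {
    if (e.key === "Escape") close();
  };
  doc.addEventListener("mousedown", onMouseDown);
  doc.addEventListener("keydown", onKeyDown);
  return () => {
    doc.removeEventListener("mousedown", onMouseDown);
    doc.removeEventListener("keydown", onKeyDown);
  };
}
```

Listening on mousedown rather than click means the popover closes before any click handler in the chant body runs.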
…umbers
Cross-reference glosses leaked internal segment IDs ('same word as
v7d', 'same root as *bhavantu* in v3d') into reader-facing tooltips —
meaningless to anyone reading the chant. Replaced all 26 with 'verse N'
phrasing. The IDs only ever made sense to the curator.
…he joins
QC sweep after the Metta Sutta pass found the same silent bug class
(morphemes don't reconstruct the surface → splitByMorphemes returns
null → word degrades to a single whole-word hover) in four more
chants. Nine words fixed:
  morning-chants     kāmesu                 kāma → kām
  sho-sai            śāsanānām              ana → an
  heart-sutra        Āryāvalokiteśvaro      Ārya → Āry
  heart-sutra        cittāvaraṇa            citta → citt
  heart-sutra        viparyāsātikrānto      atikrānto → ātikrānto
  heart-sutra        Tryadhvavyavasthitāḥ   tri → try
  bodhi-heart-sutra: same three as heart-sutra (it mirrors that file)
Every fix is a Pāli/Sanskrit sandhi at the morpheme join — two vowels
merging (a+a→ā, a+ā→ā), a vowel shortening, or i→y before a vowel. The
morpheme gloss now names the sound-change so the spelling makes sense.
Audited exhaustively: all morphemes in every triple-script-witness
segment across both sanghas now reconstruct their surface form.
Also confirmed: no internal segment-ID leaks ('v7d'-style) in any
chant other than Metta (now fixed).
The Metta QC pass surfaced two failure modes that neither crash nor
fail a render — they just quietly degrade the reader, so nothing
caught them until a human noticed. New test file makes both loud:
1. Morpheme reconstruction — every word's morphemes[] (and per-script
scriptMorphemes[]) must concatenate back to the surface form. If
they don't, the renderer's splitByMorphemes returns null and the
word silently loses its per-morpheme hovers/arrows. scriptMorphemes
comparison strips token separators (space, Tibetan tsek) since the
renderer splits per-token.
2. Segment-ID leaks — reader-facing gloss/etymology text must never
contain the curator's internal segment-ID shorthand (v1a, v7d) —
it's meaningless to a reader. Caught with /\bv\d+[a-z]\b/.
Runs across every word in every triple-script-witness segment in both
sanghas — a new chant is covered automatically. All green (the data
was fully fixed in the preceding commits).
Not covered: jargon glosses ('gerundive', 'accusative singular') —
that's a visible-not-silent issue with 84 known hits deferred to a
dedicated plain-register pass; a jargon-guard test should land
alongside that fix, not before it (would just be 84 red tests).
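The two checks can be sketched as plain predicates that a test file would run over every word. The regex is the one quoted in the commit; the function names and the exact separator set handling are assumptions, and the real test file is not shown here.

```typescript
// Curator shorthand such as v1a or v7d (regex from the commit, with a global flag).
const SEGMENT_ID = /\bv\d+[a-z]\b/g;

// Every internal segment ID found in reader-facing text; empty means clean.
function findSegmentIdLeaks(text: string): string[] {
  return text.match(SEGMENT_ID) || [];
}

// Token separators ignored when comparing per-script morphemes:
// ASCII space and the Tibetan tsek (U+0F0B).
const TOKEN_SEPARATORS = /[ \u0F0B]/g;

// True when the morphemes concatenate back to the surface form.
function reconstructs(surface: string, morphemes: string[]): boolean {
  const strip = (s: string) => s.replace(TOKEN_SEPARATORS, "");
  return strip(morphemes.join("")) === strip(surface);
}

console.log(findSegmentIdLeaks("same root as bhavantu in v3d"));
console.log(reconstructs("nātimaññetha", ["nā", "ti", "maññetha"]));
```

A test runner would assert that the first function returns an empty list and the second returns true for every word in every segment.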
…hant glosses + add regression guard
Earlier I deferred this as 'an 84-hit multi-hour pass'. That was procrastination — the issue was identified, so it gets fixed.
Rewrote every grammar-jargon gloss/etymology across the whole liturgy into the plain-prose register CURATION_PROTOCOL §3.4 prescribes — *show* the grammatical idea, don't *name* it:
  'accusative (object of "I go to")' → 'the object of "I go to"'
  'past participle of √vac' → 'the "X-ed" form of √vac'
  'ablative ending' → 'the "-from" ending'
  'genitive plural' → 'the "of-those" ending'
  '(nominative)' appended as noise → dropped
Files touched: heart-sutra, bodhi-heart-sutra, morning-chants, sho-sai, ti-sarana, bodhicitta-dedication, metta-sutta, om-mani, way-of-compassion.
Audit: 0 jargon hits remain anywhere.
Plus a third check in liturgy-data-quality.test.ts — a jargon tripwire. It is not an absolute ban (the pay-rent rule still permits a glossed term that earns its place); a legitimately-earned term goes in JARGON_ALLOWLIST with a rationale. The guard already proved itself: its case-insensitive scan caught 'Optative' in way-of-compassion's bhāvaye gloss that a case-sensitive grep had missed.
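A jargon tripwire of this kind might look like the sketch below. `JARGON_ALLOWLIST` is named in the commit, but its entry shape, the term list, and the function name are assumptions made for illustration.

```typescript
// Grammar terms the plain-register rule treats as warning signs (assumed list).
const JARGON_TERMS = [
  "gerundive", "accusative", "ablative", "genitive", "nominative",
  "optative", "past participle", "instrumental case",
];

// Assumed entry shape: which term is allowed, where, and why it earns its place.
interface AllowlistEntry {
  chant: string;
  word: string;
  term: string;
  rationale: string;
}
const JARGON_ALLOWLIST: AllowlistEntry[] = [];

// Case-insensitive scan of one gloss; returns the terms that are present and
// not allowlisted for this chant and word.
function jargonHits(chant: string, word: string, gloss: string): string[] {
  const lower = gloss.toLowerCase();
  return JARGON_TERMS.filter((term) => {
    if (lower.indexOf(term) === -1) return false;
    const allowed = JARGON_ALLOWLIST.some(
      (a) => a.chant === chant && a.word === word && a.term === term,
    );
    return !allowed;
  });
}

// The capitalised form is still caught because the gloss is lowercased first.
console.log(jargonHits("way-of-compassion", "bhāvaye", "Optative of bhāveti"));
```

Lowercasing the gloss before matching is what lets the scan catch capitalised terms that a case-sensitive grep would miss.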
Distils the Metta depth pass + cross-chant QC sweep into a failure-mode taxonomy, organised by how each error announces itself — silent / loud / judgment. For every class: what it is, why it happens, where it was found, the guard now in place, and the rule a future auto-generation pipeline must follow to avoid producing it. The through-line: the two dangerous classes (morpheme reconstruction, internal-ID leak) were silent: they survived from authoring until a human hovered the wrong word. The fix was to make them loud. A generator must run, on its own output, every invariant the renderer silently assumes, and degrade cleanly when it cannot satisfy one.
anantham added a commit that referenced this pull request on May 20, 2026
Summary
The Metta Sutta now reads at MAPLE Heart Sutra depth. All 10 verses split into 4 prosodic segments each (40 total), with full word-by-word alignment to all three witnesses (Amaravati, Sujato, Thanissaro) and per-morpheme tooltips in mn10's plain-prose register.
What changed
- … -1 (glue word).
- … √kṛ (do), √śam (calm), √vac (speak), √i (go), √tuṣ (content), √bhṛ (bear), √vṛt (proceed), √gṛdh (greedy), √car (act), √jñā (know), √vad (speak — against), √bhū (be), √as (be), √dṛś (see), √vas (dwell), √iṣ (seek), √kub (cheat), √man (think), √ruṣ (anger), √han (strike), √rakṣ (protect), √mā (measure), √sthā (stand), √sad (sit), √śī (lie), √smṛ (remember — sati), √hṛ (carry), √pad (attain), √nī (lead), √gam (go).
- -yam tail: "like English -able in doable, but with a must flavour"
- -aṁ tail: "pronounced as a soft nasal close, like um in hum"
- -ena tail: "the doer of the action — English wedges in 'by'; Pāli changes the word's tail"
- -eyya tail: "would/should (do this) — the wishing voice"
- -esu tail: "in/among, with a plural"
- … alignTo === -1 render at 0.55 opacity. mn10's "ghost word" pattern, eased a notch.
- … px-[2px] so adjacent underlines don't merge visually. Same trick mn10 uses on its segment spans.
Test plan
- … /liturgy/maple/metta-sutta — verify all 10 verses display 4 segments each
- … kar in karaṇī of v1, or bhāv in bhāvaye of v7/v8) — confirm only that morpheme's arrows show
🤖 Generated with Claude Code