23 changes: 23 additions & 0 deletions CLAUDE.md
@@ -283,6 +283,29 @@ class SomeManager(...) {
val manager = remember { SomeManager() }
```

### Uncaught coroutine Throwables kill the process on Android — guard long-lived scopes

An uncaught `Throwable` (notably `OutOfMemoryError`) escaping any coroutine reaches the platform default uncaught-exception handler. **On Android that handler kills the process ("app keeps stopping"); on desktop JVM it only prints** — so this class of crash never reproduces on desktop. Under heap pressure the OOM is thrown in whichever coroutine allocates next, not necessarily the one doing the heavy work, so per-call-site `catch(Throwable)` is not sufficient.

Rules:
- Every long-lived `CoroutineScope` that hosts user-path collectors or fire-and-forget launches must attach a `CoroutineExceptionHandler` (see `StelekitViewModel.scope`, `GraphLoader.parallelScope`). Surface errors as `fatalError` UI state where possible.
- Standing `collect { }` bodies and `stateIn` upstream chains on such scopes are the unguarded vectors — a repository flow's `catchDbError()` does not protect them.
- Regression tests: `StelekitViewModelCrashReproductionTest`, `PageNameIndexResilienceTest`, `LargeGraphWarmStartCrashTest` (8 030-page warm start with a recording default uncaught-exception handler).
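
A minimal sketch of the first rule, assuming `kotlinx.coroutines` is on the classpath. `GuardedScopeOwner`, `fireAndForget`, and the `String?` shape of `fatalError` are hypothetical names chosen for illustration; the real `StelekitViewModel.scope` is not shown here and may be wired differently.

```kotlin
import kotlinx.coroutines.CoroutineExceptionHandler
import kotlinx.coroutines.CoroutineScope
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.Job
import kotlinx.coroutines.SupervisorJob
import kotlinx.coroutines.flow.MutableStateFlow
import kotlinx.coroutines.flow.StateFlow
import kotlinx.coroutines.launch

// Hypothetical owner of a long-lived scope (illustration only).
class GuardedScopeOwner {
    private val _fatalError = MutableStateFlow<String?>(null)
    val fatalError: StateFlow<String?> = _fatalError

    // Receives any Throwable (including OutOfMemoryError) that escapes a
    // coroutine launched on `scope`, so it never reaches the platform
    // default uncaught-exception handler.
    private val handler = CoroutineExceptionHandler { _, t ->
        _fatalError.value = t.message ?: t::class.simpleName
    }

    // SupervisorJob keeps one failed child from cancelling its siblings.
    val scope = CoroutineScope(SupervisorJob() + Dispatchers.Default + handler)

    fun fireAndForget(block: suspend () -> Unit): Job = scope.launch { block() }
}
```

The handler only sees failures from `launch`-style coroutines; a failure inside `async` is held in the `Deferred` until it is awaited, so it still needs handling at the await site.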

### Graph-scale reads must be paginated, projected, or chunked — never O(graph)

Every DB write invalidates SQLDelight queries on the written table, so a standing collector of an unbounded query re-materializes its **entire result set per write burst**. During graph import/reconcile on an 8 000+ page graph this causes GC thrash (UI hang) and `OutOfMemoryError` (crash) on Android. **`PageRepository` therefore has no `getAllPages()` / unbounded `getUnloadedPages()` at all — the absence is compile-time enforced.** Do not add unbounded reads back to any repository interface.

Patterns, by consumer type:
- **Standing UI observers** (sidebar, etc.): bounded queries only — `getFavoritePages()` (`WHERE is_favorite = 1`), `getPages(limit, offset)`, `getPageByUuid` point lookups.
- **Standing whole-graph observers** (e.g. `PageNameIndex`): use a **projection** (`getPageNameEntries()` — name + is_journal only), plus `conflate()` + `distinctUntilChanged()` + debounce as backpressure, plus `Throwable` guards.
- **Bulk reconcile** (`GraphLoader.loadDirectory`): per-chunk `IN`-clause lookups, `getPagesByNames(chunk)` / `getJournalPagesByDates(chunk)`, never a full-table preload. `IN` lists are chunked to ≤500 (the default `SQLITE_MAX_VARIABLE_NUMBER` is 999 in SQLite builds before 3.32, which older Android releases ship).
- **Background indexing** (`GraphLoader.indexRemainingPages`): drain loop over `getUnloadedPages(limit, offset)` (`INDEX_BATCH_SIZE` = 100); the offset advances past permanently-failing rows via an attempted-UUID set, so the loop is guaranteed to terminate; `countUnloadedPages()` supplies the progress denominator as a single scalar instead of a materialized list.
- **Whole-graph one-shots** (export, migration tooling, benchmarks, tests): `getAllPagesSnapshot()` — a suspend interface method that pages through `getPages(limit, offset)` in bounded batches (never a single unbounded query, never a reactive flow).
- Do not pin full-table snapshots in fields (the former `cachedAllPages` pattern is forbidden).
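
The bulk-reconcile and one-shot patterns above can be sketched in plain Kotlin. This is an illustration under stated assumptions, not the repository's implementation: `lookupInChunks`, `snapshotInBatches`, and `MAX_IN_LIST` are hypothetical names, and the `lookup` / `fetchPage` lambdas stand in for calls such as `getPagesByNames(chunk)` and `getPages(limit, offset)`.

```kotlin
// Illustrative cap; the rules above state that IN lists are chunked to at most 500.
const val MAX_IN_LIST = 500

// Runs one bounded lookup per chunk of names and concatenates the results.
// `lookup` is a hypothetical stand-in for a call such as getPagesByNames(chunk).
fun <T> lookupInChunks(
    names: List<String>,
    chunkSize: Int = MAX_IN_LIST,
    lookup: (List<String>) -> List<T>,
): List<T> = names.distinct().chunked(chunkSize).flatMap { lookup(it) }

// Pages through a bounded query until a short batch signals the end.
// `fetchPage` is a hypothetical stand-in for getPages(limit, offset).
fun <T> snapshotInBatches(
    batchSize: Int,
    fetchPage: (limit: Int, offset: Int) -> List<T>,
): List<T> {
    require(batchSize > 0) { "batchSize must be positive" }
    val all = ArrayList<T>()
    var offset = 0
    while (true) {
        val batch = fetchPage(batchSize, offset)
        all += batch
        if (batch.size < batchSize) break // last (possibly empty) batch
        offset += batch.size
    }
    return all
}
```

Neither helper bounds the size of its result. They only bound how many rows a single query materializes, which is the property the rules above ask for.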

Regression tests: `LargeGraphWarmStartCrashTest` (asserts ≤100-row batches across a full 8 030-page warm start), `GraphLoaderIndexBatchingTest` (bounded drain + termination with permanently-failing pages), `QueryPlanAuditTest` (audits query plans for the bounded query set).
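
The bounded drain with an attempted set can be modelled without a database. The sketch below mirrors the loop shape in `GraphLoader.indexRemainingPages`, with assumptions: `FakeUnloadedStore`, `drainUnloaded`, and `tryIndex` are hypothetical names, the store is an in-memory map, and rows are assumed to come back in a stable order between fetches.

```kotlin
// Hypothetical in-memory stand-in for the pages table: id -> isLoaded.
class FakeUnloadedStore(ids: List<String>) {
    private val loaded = LinkedHashMap<String, Boolean>().apply {
        ids.forEach { put(it, false) }
    }
    var maxBatchSeen = 0
        private set

    // Bounded read of still-unloaded ids, in stable insertion order.
    fun getUnloaded(limit: Int, offset: Int): List<String> =
        loaded.filterValues { !it }.keys.drop(offset).take(limit)
            .also { maxBatchSeen = maxOf(maxBatchSeen, it.size) }

    fun markLoaded(id: String) { loaded[id] = true }
    fun countUnloaded(): Int = loaded.count { !it.value }
}

// Drains the store in batches; `tryIndex` returns false for a page that
// stays unloaded (missing file, parse error, and so on).
fun drainUnloaded(
    store: FakeUnloadedStore,
    batchSize: Int,
    tryIndex: (String) -> Boolean,
): Int {
    val attempted = HashSet<String>()
    var offset = 0
    var indexed = 0
    while (true) {
        val batch = store.getUnloaded(batchSize, offset)
        if (batch.isEmpty()) break
        // add() returns false for ids already attempted in this run.
        val fresh = batch.filter { attempted.add(it) }
        // Skip past re-fetched stuck rows so they cannot fill every batch.
        offset += batch.size - fresh.size
        if (fresh.isEmpty()) continue
        for (id in fresh) {
            if (tryIndex(id)) {
                store.markLoaded(id)
                indexed++
            }
        }
    }
    return indexed
}
```

Each pass either attempts at least one new id or moves the offset forward by a whole batch of stuck rows, so the loop ends even when some pages never index.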

### Android Application.onCreate — catch Throwable, not Exception

`Application.onCreate()` must use `catch (e: Throwable)`, not `catch (e: Exception)`. Native-library and class-loading failures (`UnsatisfiedLinkError`, `NoClassDefFoundError`) are `Error` subclasses, not `Exception` subclasses, so catching only `Exception` lets them propagate uncaught and crash the app at startup before the UI is shown. See `SteleKitApplication.kt`.
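
A small JVM-only sketch of the difference, using hypothetical helper names (`initNative`, `startWithExceptionCatch`, `startWithThrowableCatch`) rather than the real `SteleKitApplication` code, which is not shown here:

```kotlin
// Hypothetical startup step that fails the way a missing native library does.
fun initNative() {
    throw UnsatisfiedLinkError("libexample.so not found")
}

// Catches Exception only: an Error subclass passes straight through.
fun startWithExceptionCatch(init: () -> Unit): String =
    try {
        init()
        "started"
    } catch (e: Exception) {
        "degraded: ${e.message}"
    }

// Catches Throwable: Error subclasses are handled as well.
fun startWithThrowableCatch(init: () -> Unit): String =
    try {
        init()
        "started"
    } catch (e: Throwable) {
        "degraded: ${e.message}"
    }
```

Calling `startWithExceptionCatch(::initNative)` rethrows the `UnsatisfiedLinkError` to the caller, while `startWithThrowableCatch(::initNative)` returns the degraded result.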
4 changes: 4 additions & 0 deletions androidApp/src/main/AndroidManifest.xml
@@ -11,8 +11,12 @@
<!-- SAF (Storage Access Framework) grants are given at runtime via ACTION_OPEN_DOCUMENT_TREE
and stored as persistable URI permissions — no manifest storage permissions needed. -->

<!-- largeHeap: graph import/reconcile holds parse buffers plus the page-name suggestion
trie in memory at once; on 8 000+ page graphs the default per-app heap (192-256 MB
on most devices) is too tight and OutOfMemoryError kills startup. -->
<application
android:name="dev.stapler.stelekit.SteleKitApplication"
android:largeHeap="true"
android:allowBackup="false"
android:icon="@mipmap/ic_launcher"
android:roundIcon="@mipmap/ic_launcher_round"
@@ -83,7 +83,7 @@ class AndroidGraphBenchmark {

val phase3Ms = measureTime { loader.indexRemainingPages {} }.inWholeMilliseconds

- val pageCount = repoSet.pageRepository.getAllPages().first().getOrNull()?.size ?: 0
+ val pageCount = repoSet.pageRepository.getAllPagesSnapshot().getOrNull()?.size ?: 0

android.util.Log.i("ANDROID_BENCH", """{"metric":"loadPhase","phase1Ms":$phase1Ms,"phase3Ms":$phase3Ms,"pageCount":$pageCount}""")

@@ -205,7 +205,7 @@ class AndroidGraphBenchmark {
loader.indexRemainingPages {}

// Pick a journal page to edit
- val pages = repoSet.pageRepository.getAllPages().first().getOrNull() ?: emptyList()
+ val pages = repoSet.pageRepository.getAllPagesSnapshot().getOrNull() ?: emptyList()
val journalPage = pages.firstOrNull { it.isJournal } ?: pages.first()
val pageUuid = journalPage.uuid

@@ -110,6 +110,39 @@ class ExportServiceJournalRangeTest {
assertTrue(result.isLeft(), "Expected Left for empty range but got Right")
}

// ── U-EJR-04: batch with null journal date terminates without hanging ────

@Test
fun uEJR04_batchWithNullJournalDate_terminatesWithoutHanging() = runTest {
// A journal page with is_journal=true but no journal_date (corrupted/migrated data).
// Before the fix, the drain loop would never break on oldest==null — infinite scan.
val pageRepo = InMemoryPageRepository()
val blockRepo = InMemoryBlockRepository()

val badPage = Page(
uuid = PageUuid("p-null-date"),
name = "corrupted-journal",
isJournal = true,
journalDate = null, // triggers oldest==null early-exit guard
createdAt = now,
updatedAt = now,
)
pageRepo.savePage(badPage)
blockRepo.saveBlock(block("b-null", "some content", "p-null-date"))

val service = makeService()
// The call must terminate (not hang) even when the page has no journal_date.
// The result is Left because no pages fall within a valid date range.
val result = service.exportJournalRange(
from = LocalDate(2026, 1, 1),
to = LocalDate(2026, 1, 7),
formatId = "markdown",
pageRepo = pageRepo,
blockRepo = blockRepo,
)
assertTrue(result.isLeft(), "Expected Left when only null-date journal pages exist, got Right")
}

// ── U-EJR-03: pages in range with no blocks → second Left path ───────────

@Test
@@ -204,7 +204,7 @@ class VoiceNoteBlockFormatTest {
"#+BEGIN_QUOTE must not appear in inline block")

// Verify no transcript page was created (below threshold)
- val allPages = pageRepo.getAllPages().first().getOrNull().orEmpty()
+ val allPages = pageRepo.getAllPagesSnapshot().getOrNull().orEmpty()
val transcriptPages = allPages.filter { it.name.startsWith("Voice Note ") }
assertTrue(transcriptPages.isEmpty(),
"Expected no Voice Note transcript page for short (below-threshold) note")
137 changes: 90 additions & 47 deletions kmp/src/commonMain/kotlin/dev/stapler/stelekit/db/GraphLoader.kt
@@ -604,54 +604,76 @@ class GraphLoader(
backgroundIndexJob = currentCoroutineContext()[Job]
PerformanceMonitor.startTrace("indexRemainingPages")
try {
- val unloadedPages = pageRepository.getUnloadedPages().first().getOrNull() ?: emptyList()
- if (unloadedPages.isEmpty()) return
+ val total = pageRepository.countUnloadedPages().getOrNull() ?: 0L
+ if (total == 0L) return

- logger.info("Background indexing ${unloadedPages.size} pages... (${heapSummary()})")
+ logger.info("Background indexing $total pages... (${heapSummary()})")

coroutineScope {
var processed = 0
val total = unloadedPages.size

unloadedPages.chunked(10).forEach { chunk ->
val pagesToSave = mutableListOf<Page>()
val blocksToSaveByPage = mutableMapOf<PageUuid, MutableList<Block>>()
val pageUuidsToDelete = mutableSetOf<PageUuid>()

chunk.map { page ->
async(backgroundIndexDispatcher) {
if (page.uuid.value in (activePageUuids?.value ?: emptySet())) {
logger.debug("Phase 3: skipping ${page.name} — active edit session")
return@async null
var processed = 0L
// Drain in bounded batches instead of materializing every unloaded Page up
// front (8 000+ objects on a first warm start — an Android OOM contributor).
// Successfully indexed pages leave the unloaded set, so each fetch re-reads
// at a fixed limit. Pages that stay unloaded after an attempt (missing file,
// parse error, active edit session, zero-block parse) are remembered in
// `attempted` — UUID strings only, a few hundred KB worst case — and the
// offset advances past them when they are re-fetched. Termination is
// guaranteed: every iteration either attempts at least one fresh page (growing
// `attempted`, which cannot outgrow the set of unloaded rows) or advances the
// offset by a full batch of stuck rows.
val attempted = HashSet<String>()
var offset = 0
while (true) {
val batch = pageRepository
.getUnloadedPages(INDEX_BATCH_SIZE, offset)
.first().getOrNull().orEmpty()
if (batch.isEmpty()) break

val fresh = batch.filter { attempted.add(it.uuid.value) }
// Re-fetched rows we already attempted are stuck for this run — move the
// drain window past them so a fetch can never return only stuck rows.
offset += batch.size - fresh.size
if (fresh.isEmpty()) continue

fresh.chunked(10).forEach { chunk ->
val pagesToSave = mutableListOf<Page>()
val blocksToSaveByPage = mutableMapOf<PageUuid, MutableList<Block>>()
val pageUuidsToDelete = mutableSetOf<PageUuid>()

chunk.map { page ->
async(backgroundIndexDispatcher) {
if (page.uuid.value in (activePageUuids?.value ?: emptySet())) {
logger.debug("Phase 3: skipping ${page.name} — active edit session")
return@async null
}
val path = page.filePath ?: resolvePageFilePath(page.name)
if (path == null) return@async null
val content = readFileDecrypted(path) ?: return@async null
try {
parsePageWithoutSaving(path, content, ParseMode.FULL)
} catch (e: CancellationException) {
throw e
} catch (e: Exception) {
logger.warn("Failed to parse file: $path: ${e.message}")
null
}
}
val path = page.filePath ?: resolvePageFilePath(page.name)
if (path == null) return@async null
val content = readFileDecrypted(path) ?: return@async null
try {
parsePageWithoutSaving(path, content, ParseMode.FULL)
} catch (e: CancellationException) {
throw e
} catch (e: Exception) {
logger.warn("Failed to parse file: $path: ${e.message}")
null
}.awaitAll().forEach { result ->
if (result != null) {
pagesToSave.add(result.page)
if (result.blocks.isNotEmpty()) {
blocksToSaveByPage[result.page.uuid] = result.blocks.toMutableList()
}
pageUuidsToDelete.add(result.page.uuid)
}
}
}.awaitAll().forEach { result ->
if (result != null) {
pagesToSave.add(result.page)
if (result.blocks.isNotEmpty()) {
blocksToSaveByPage[result.page.uuid] = result.blocks.toMutableList()
}
pageUuidsToDelete.add(result.page.uuid)

if (pagesToSave.isNotEmpty() || pageUuidsToDelete.isNotEmpty()) {
flushChunkWritesPreemptible(pagesToSave, pageUuidsToDelete, blocksToSaveByPage)
}
}

if (pagesToSave.isNotEmpty() || pageUuidsToDelete.isNotEmpty()) {
flushChunkWritesPreemptible(pagesToSave, pageUuidsToDelete, blocksToSaveByPage)
processed += chunk.size
onProgress("Indexing pages... (${processed.coerceAtMost(total)}/$total)")
}

processed += chunk.size
onProgress("Indexing pages... ($processed/$total)")
}
}
logger.info("Background indexing complete.")
@@ -1002,13 +1024,7 @@ class GraphLoader(
}
}

- // Pre-load all existing pages in one query. Replaces one getPageByName DB call per
- // file (up to 4 000 round-trips on a warm restart) with a single bulk read whose
- // result is shared across all parallel chunks read-only.
- val allPages = pageRepository.getAllPages().first().getOrNull() ?: emptyList()
- val pagesByName = allPages.associateBy { it.name.lowercase() }
- val pagesByJournalDate = allPages.filter { it.journalDate != null }
- .associateBy { it.journalDate!! }
+ val isJournalDir = path.endsWith("/journals")

val loadedCount = coroutineScope {
var processedCount = 0
@@ -1019,6 +1035,29 @@ class GraphLoader(
async(parallelScope.coroutineContext) {
PerformanceMonitor.startTrace("processChunk")
try {
// Per-chunk bounded existence lookups (one IN query per ≤100 files)
// instead of preloading the entire pages table. The former
// getAllPages() preload materialized every Page object plus two
// full-size maps for the duration of the load — on 8 000+ page
// graphs that contributed to the Android OOM. Peak memory here is
// now O(chunk), independent of graph size.
val chunkTitles = chunk.map {
FileUtils.decodeFileName(it.fileName.stripPageExtension())
}
val pagesByName = pageRepository
.getPagesByNames(chunkTitles)
.getOrNull().orEmpty()
.associateBy { it.name.lowercase() }
val pagesByJournalDate = if (isJournalDir) {
val dates = chunkTitles.mapNotNull { JournalUtils.parseJournalDate(it) }
pageRepository.getJournalPagesByDates(dates)
.getOrNull().orEmpty()
.filter { it.journalDate != null }
.associateBy { it.journalDate!! }
} else {
emptyMap()
}

val pagesToSave = mutableListOf<Page>()
val blocksToSaveByPage = mutableMapOf<PageUuid, MutableList<Block>>()
val pageUuidsToDelete = mutableSetOf<PageUuid>()
@@ -1032,7 +1071,7 @@ class GraphLoader(
// Skip Logseq-internal file: protocol artifacts (e.g. file%3A..%2F%2F...)
if (title.startsWith("file:")) return@count false
val name = title
- val isJournalFile = path.endsWith("/journals")
+ val isJournalFile = isJournalDir
val existingPage = if (isJournalFile) {
val journalDate = JournalUtils.parseJournalDate(title)
if (journalDate != null) pagesByJournalDate[journalDate]
@@ -1134,6 +1173,10 @@ class GraphLoader(
// Timeout for the batch mtime cursor on startup. Two SAF cursor queries should
// complete well under 500ms; 2s is a conservative ceiling for slow providers.
private const val SHADOW_STARTUP_TIMEOUT_MS = 2_000L

// Phase 3 drain-batch size: bounds how many unloaded Page rows are materialized at
// once during background indexing, independent of graph size.
private const val INDEX_BATCH_SIZE = 100
}

private data class ParseResult(
@@ -376,6 +376,16 @@ object MigrationRunner {
"CREATE INDEX IF NOT EXISTS idx_measurement_annotations_image_uuid ON measurement_annotations(image_uuid)"
)
),
Migration(
name = "pages_unloaded_partial_index",
statements = listOf(
// Partial index covering only unloaded pages (is_content_loaded = 0).
// Makes selectUnloadedPagesPaginated and countUnloadedPages O(unloaded) instead
// of O(total) — on a large graph where most pages are loaded the index is small
// and both the drain-loop OFFSET scan and the COUNT(*) become index-only ops.
"CREATE INDEX IF NOT EXISTS idx_pages_unloaded ON pages(uuid) WHERE is_content_loaded = 0"
)
),
)

/**