Problem
@orama/orama, @orama/plugin-embeddings, and @huggingface/transformers are all installed as dependencies, but the semantic/vector search pipeline is not implemented. src/lib/search/orama-index.ts only sets up a basic keyword index and searchKnowledge() only does FTS5-style text matching. The full RAG (Retrieval-Augmented Generation) pipeline that these packages enable is completely missing.
Current State
src/lib/search/
  orama-index.ts   # basic keyword schema only (~60 lines)
  progressive.ts   # text search, no vector/hybrid ranking
Installed but unused: @orama/plugin-embeddings, @huggingface/transformers
Proposed Implementation
1. Configure Orama with embedding plugin
// src/lib/search/orama-index.ts
import { create, insert, search } from '@orama/orama';
import { pluginEmbeddings } from '@orama/plugin-embeddings';

export async function createOramaIndex() {
  const db = await create({
    schema: {
      id: 'string',
      title: 'string',
      content: 'string',
      tags: 'string[]',
      createdAt: 'number',
      embedding: 'vector[384]', // all-MiniLM-L6-v2 output dimension
    } as const,
    plugins: [
      pluginEmbeddings({
        embeddings: {
          defaultProperty: 'embedding',
          model: {
            modelPath: 'Xenova/all-MiniLM-L6-v2',
            document: {
              indexedProperties: ['title', 'content'],
            },
            query: {
              property: 'embedding',
            },
          },
        },
      }),
    ],
  });
  return db;
}
2. Index notes with embeddings on insert/update
// src/lib/search/index-manager.ts
import { insert } from '@orama/orama';

// OramaDB and Note are assumed to be existing project types.
export async function indexNote(db: OramaDB, note: Note) {
  await insert(db, {
    id: note.id,
    title: note.title,
    content: note.content,
    tags: note.tags ?? [],
    createdAt: note.createdAt,
    // embedding generated automatically by plugin
  });
}

// Re-inserts every note sequentially; each insert waits for its embedding.
export async function reindexAll(db: OramaDB, notes: Note[]) {
  for (const note of notes) {
    await indexNote(db, note);
  }
}
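Because embedding every note is the expensive part, a full reindexAll() on each change may be wasteful. The sketch below shows one way to work out which notes actually need indexing work by comparing what the index last saw against the current notes. It is an illustration only: the updatedAt field, the NoteStamp and ReindexPlan types, and planReindex() are hypothetical names that do not appear in the codebase described above.

```typescript
// Hypothetical minimal note shape; `updatedAt` is an assumption for this sketch.
interface NoteStamp {
  id: string;
  updatedAt: number;
}

interface ReindexPlan {
  toInsert: string[];
  toUpdate: string[];
  toRemove: string[];
}

// Compare what the index last saw (id -> updatedAt) with the current notes.
function planReindex(indexed: Map<string, number>, notes: NoteStamp[]): ReindexPlan {
  const plan: ReindexPlan = { toInsert: [], toUpdate: [], toRemove: [] };
  const seen = new Set<string>();

  for (const note of notes) {
    seen.add(note.id);
    const previous = indexed.get(note.id);
    if (previous === undefined) {
      plan.toInsert.push(note.id); // never indexed
    } else if (previous < note.updatedAt) {
      plan.toUpdate.push(note.id); // changed since it was indexed
    }
  }

  // Anything the index knows about that no longer exists must be removed.
  indexed.forEach((_stamp, id) => {
    if (!seen.has(id)) plan.toRemove.push(id);
  });

  return plan;
}

const examplePlan = planReindex(
  new Map<string, number>([['a', 1], ['b', 1], ['c', 1]]),
  [
    { id: 'a', updatedAt: 1 },
    { id: 'b', updatedAt: 2 },
    { id: 'd', updatedAt: 1 },
  ],
);
```

With a plan like this, only the ids in toInsert and toUpdate would need new embeddings, and toRemove covers deleted notes.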
3. Hybrid search (BM25 + cosine similarity)
// src/lib/search/progressive.ts
import { search } from '@orama/orama';

export async function hybridSearch(
  db: OramaDB,
  query: string,
  options: {
    limit?: number;
    hybridWeights?: { text: number; vector: number };
    tags?: string[];
  } = {}
) {
  // Default weighting favors semantic similarity over keyword (BM25) matches.
  const { limit = 10, hybridWeights = { text: 0.3, vector: 0.7 } } = options;
  const results = await search(db, {
    term: query,
    mode: 'hybrid',
    similarity: 0.7,
    limit,
    hybridWeights,
    // Only filter by tags when at least one tag was requested.
    where: options.tags?.length ? { tags: { containsAll: options.tags } } : undefined,
  });
  return results.hits.map(hit => ({
    id: hit.id,
    score: hit.score,
    document: hit.document,
  }));
}
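To make the effect of hybridWeights concrete, here is a small standalone sketch of weighted score fusion: keyword scores are scaled into the 0 to 1 range and then blended with cosine similarity using the same 0.3 / 0.7 split as above. This is not Orama's internal ranking code; cosineSimilarity(), fuseScores() and the Candidate type are hypothetical helpers written only for illustration.

```typescript
// Cosine similarity between two equal-length vectors; 0 when either is all zeros.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error('vector length mismatch');
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / Math.sqrt(normA * normB);
}

// Hypothetical candidate: a raw keyword score plus a cosine similarity in [0, 1].
interface Candidate {
  id: string;
  textScore: number;
  vectorScore: number;
}

// Scale keyword scores by the best keyword score, then blend with the weights.
function fuseScores(
  candidates: Candidate[],
  weights: { text: number; vector: number } = { text: 0.3, vector: 0.7 },
): { id: string; score: number }[] {
  const maxText = Math.max(...candidates.map((c) => c.textScore), 0);
  return candidates
    .map((c) => ({
      id: c.id,
      score:
        weights.text * (maxText > 0 ? c.textScore / maxText : 0) +
        weights.vector * c.vectorScore,
    }))
    .sort((x, y) => y.score - x.score); // highest blended score first
}

const ranked = fuseScores([
  { id: 'keyword-heavy', textScore: 10, vectorScore: 0.2 },
  { id: 'semantic-heavy', textScore: 2, vectorScore: 0.9 },
]);
```

With the default weights, the note that is semantically close but weak on keywords ranks first, which is the behavior the 0.3 / 0.7 default is meant to encourage.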
4. RAG context builder for AI harness
// src/lib/search/rag-context.ts
import { hybridSearch } from './progressive';

export async function buildRagContext(
  db: OramaDB,
  userQuery: string,
  maxChunks = 3
): Promise<string> {
  const results = await hybridSearch(db, userQuery, { limit: maxChunks });
  // No matches: return an empty string so callers can skip the context block.
  if (results.length === 0) return '';
  // Number each hit and cap its content at 500 characters.
  const context = results
    .map((r, i) => `[${i + 1}] **${r.document.title}**\n${r.document.content.slice(0, 500)}`)
    .join('\n\n');
  return `## Relevant Knowledge Base Context\n\n${context}\n\n---`;
}
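The fixed slice(0, 500) above can cut a note in the middle of a word. If that matters for prompt quality, one possible variation is to truncate at the last word boundary and mark the cut. The sketch below is a pure formatting helper under that assumption; truncateAtWord(), formatRagContext() and the RagHit type are hypothetical names, not part of the proposal.

```typescript
// Hypothetical shape of a search hit's document for this sketch.
interface RagHit {
  title: string;
  content: string;
}

// Cut text to at most maxChars, backing up to the last space when there is one.
function truncateAtWord(text: string, maxChars: number): string {
  if (text.length <= maxChars) return text;
  const cut = text.slice(0, maxChars);
  const lastSpace = cut.lastIndexOf(' ');
  return (lastSpace > 0 ? cut.slice(0, lastSpace) : cut) + '...';
}

// Same output layout as buildRagContext(), but with word-aware truncation.
function formatRagContext(hits: RagHit[], perChunkChars = 500): string {
  if (hits.length === 0) return '';
  const body = hits
    .map((h, i) => `[${i + 1}] **${h.title}**\n${truncateAtWord(h.content, perChunkChars)}`)
    .join('\n\n');
  return `## Relevant Knowledge Base Context\n\n${body}\n\n---`;
}
```

Keeping the formatting separate from the search call also makes it easy to unit test without an index.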
5. Integrate into AIHarness
Before sending the user message to the LLM, add the RAG context to the system prompt:
const ragContext = await buildRagContext(oramaDb, userMessage);
const systemPromptWithContext = ragContext
  ? `${baseSystemPrompt}\n\n${ragContext}`
  : baseSystemPrompt;
6. Persistence strategy
- Serialize/deserialize the Orama index to IndexedDB on app close/open
- Use @orama/plugin-data-persistence or a custom JSON export
- Trigger re-indexing when notes are created, updated, or deleted
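If the custom JSON export route is chosen, it may help to wrap the exported index in a small versioned envelope so that a snapshot written under an older schema or embedding model is discarded instead of loaded. The following is a minimal sketch of that idea, independent of IndexedDB and of Orama itself; SNAPSHOT_VERSION, SnapshotEnvelope, wrapSnapshot() and unwrapSnapshot() are hypothetical names, and the index payload stands for whatever the chosen export produces.

```typescript
// Hypothetical version number; bump it when the schema or embedding model changes.
const SNAPSHOT_VERSION = 1;

interface SnapshotEnvelope {
  version: number;
  savedAt: number;
  index: unknown; // the raw exported index, opaque to this helper
}

// Serialize an exported index together with version metadata.
function wrapSnapshot(index: unknown, now: number = Date.now()): string {
  const envelope: SnapshotEnvelope = { version: SNAPSHOT_VERSION, savedAt: now, index };
  return JSON.stringify(envelope);
}

// Return the stored index, or null when the snapshot is missing, corrupt,
// or written by a different version (the caller should then reindex from scratch).
function unwrapSnapshot(json: string | null): unknown {
  if (json === null) return null;
  try {
    const parsed = JSON.parse(json) as Partial<SnapshotEnvelope>;
    if (parsed.version !== SNAPSHOT_VERSION || parsed.index === undefined) return null;
    return parsed.index;
  } catch {
    return null;
  }
}
```

A null result is the signal to fall back to a full reindex, which keeps stale vectors from a previous model out of search results.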
Performance Notes
- Xenova/all-MiniLM-L6-v2 runs fully in-browser via ONNX Runtime (no API calls)
- Model download ~23MB, cached in browser after first load
- Show loading indicator on first-time model initialization
- Consider lazy-loading the embeddings worker via new Worker()
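Whichever loading mechanism is used, the model should be initialized at most once even if several searches start at the same time. Below is a small sketch of a memoized lazy loader that shares one in-flight promise and allows a retry after a failure. lazyOnce() and getEmbedder are hypothetical names, and the object returned by the loader is a stand-in for a real model or worker handle.

```typescript
// Wrap an async loader so it runs at most once; concurrent callers share the promise.
function lazyOnce<T>(load: () => Promise<T>): () => Promise<T> {
  let pending: Promise<T> | undefined;
  return () => {
    if (pending === undefined) {
      pending = load().catch((err) => {
        pending = undefined; // forget the failed attempt so a later call can retry
        throw err;
      });
    }
    return pending;
  };
}

// Stand-in loader for illustration; a real one would start the worker or load the model.
let loadCount = 0;
const getEmbedder = lazyOnce(async () => {
  loadCount += 1;
  return { dimensions: 384 };
});
```

The first call to getEmbedder() is also a natural place to show and later hide the first-time loading indicator.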
Acceptance Criteria
- Orama index configured with pluginEmbeddings and all-MiniLM-L6-v2
- hybridSearch() returns ranked results by BM25 + cosine similarity
- buildRagContext() prepends relevant notes to AI harness system prompt
- Search uses hybridSearch instead of basic keyword matching