feat: Implement Orama RAG pipeline with HuggingFace embeddings for semantic search #283

@d-oit

Description

Problem

@orama/orama, @orama/plugin-embeddings, and @huggingface/transformers are all installed as dependencies, but no semantic/vector search pipeline is implemented. src/lib/search/orama-index.ts sets up only a basic keyword index, and searchKnowledge() does only FTS5-style text matching. The full RAG (Retrieval-Augmented Generation) pipeline that these packages enable is missing.

Current State

src/lib/search/
  orama-index.ts     # basic keyword schema only (~60 lines)
  progressive.ts     # text search, no vector/hybrid ranking

Installed but unused: @orama/plugin-embeddings, @huggingface/transformers

Proposed Implementation

1. Configure Orama with embedding plugin

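// src/lib/search/orama-index.ts
import { create, insert, search } from '@orama/orama';
import { pluginEmbeddings } from '@orama/plugin-embeddings';

export async function createOramaIndex() {
  // NOTE: the `model` options below are this proposal's intended configuration.
  // The plugin's documented options are `defaultProperty` and `onInsert`
  // (backed by a TensorFlow.js encoder), so confirm that the installed version
  // accepts a custom model before relying on this shape.
  // The plugin factory is asynchronous, so it is awaited before create().
  const embeddingsPlugin = await pluginEmbeddings({
    embeddings: {
      defaultProperty: 'embedding', // must match the vector field in the schema
      model: {
        modelPath: 'Xenova/all-MiniLM-L6-v2',
        document: {
          indexedProperties: ['title', 'content'],
        },
        query: {
          property: 'embedding',
        },
      },
    },
  });

  const db = await create({
    schema: {
      id: 'string',
      title: 'string',
      content: 'string',
      tags: 'string[]',
      createdAt: 'number',
      embedding: 'vector[384]',  // all-MiniLM-L6-v2 output dimension
    } as const,
    plugins: [embeddingsPlugin],
  });
  return db;
}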

2. Index notes with embeddings on insert/update

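// src/lib/search/index-manager.ts
import { insert } from '@orama/orama';
import { createOramaIndex } from './orama-index';

// Database type derived from the factory in orama-index.ts.
type OramaDB = Awaited<ReturnType<typeof createOramaIndex>>;
// `Note` is the app's existing note model (id, title, content, tags?, createdAt).

// Adds one note to the index. insert() rejects a duplicate id, so an update
// must remove the old document before calling this.
export async function indexNote(db: OramaDB, note: Note) {
  await insert(db, {
    id: note.id,
    title: note.title,
    content: note.content,
    tags: note.tags ?? [],
    createdAt: note.createdAt,
    // embedding generated automatically by plugin
  });
}

// Indexes notes one at a time; intended for a freshly created, empty index.
export async function reindexAll(db: OramaDB, notes: Note[]) {
  for (const note of notes) {
    await indexNote(db, note);
  }
}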
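
Since the acceptance criteria ask for indexing on create, update, and delete, the change handling can be sketched separately from Orama. This is a minimal sketch under stated assumptions: `Note`, `SearchIndex`, `NoteChange`, `applyNoteChange`, and `MemoryIndex` are hypothetical names that do not exist in the codebase, and the index interface is kept synchronous for brevity (an Orama-backed version would presumably wrap and await `insert()` and `remove()`).

```typescript
// Hypothetical note shape, inferred from the fields used in indexNote().
interface Note {
  id: string;
  title: string;
  content: string;
  tags?: string[];
  createdAt: number;
}

// Hypothetical minimal index interface (assumption, not an Orama type).
interface SearchIndex {
  has(id: string): boolean;
  insert(note: Note): void;
  remove(id: string): void;
}

type NoteChange =
  | { kind: 'created' | 'updated'; note: Note }
  | { kind: 'deleted'; id: string };

// Applies one change so the index never holds two documents with the same id.
function applyNoteChange(index: SearchIndex, change: NoteChange): void {
  if (change.kind === 'deleted') {
    if (index.has(change.id)) index.remove(change.id);
    return;
  }
  // Create and update are handled alike: drop any stale copy, then insert.
  if (index.has(change.note.id)) index.remove(change.note.id);
  index.insert(change.note);
}

// In-memory stand-in used only to exercise the logic above.
class MemoryIndex implements SearchIndex {
  readonly docs = new Map<string, Note>();
  has(id: string): boolean {
    return this.docs.has(id);
  }
  insert(note: Note): void {
    if (this.docs.has(note.id)) throw new Error(`duplicate id: ${note.id}`);
    this.docs.set(note.id, note);
  }
  remove(id: string): void {
    this.docs.delete(id);
  }
}
```

Treating "updated" as remove-then-insert keeps a single code path and avoids the duplicate-id failure that a plain insert would hit on an edited note.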

3. Hybrid search (BM25 + cosine similarity)

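// src/lib/search/progressive.ts
import { search } from '@orama/orama';
import { createOramaIndex } from './orama-index';

// Database type derived from the factory in orama-index.ts.
type OramaDB = Awaited<ReturnType<typeof createOramaIndex>>;

export async function hybridSearch(
  db: OramaDB,
  query: string,
  options: {
    limit?: number;
    hybridWeights?: { text: number; vector: number };
    tags?: string[];
  } = {}
) {
  const { limit = 10, hybridWeights = { text: 0.3, vector: 0.7 } } = options;

  const results = await search(db, {
    term: query,
    mode: 'hybrid',
    similarity: 0.7, // minimum cosine similarity for a vector match
    limit,
    hybridWeights,
    // `containsAll` is documented for enum[] properties; `tags` is declared as
    // string[] in the schema, so this filter may require changing that type.
    where: options.tags?.length ? { tags: { containsAll: options.tags } } : undefined,
  });

  return results.hits.map(hit => ({
    id: hit.id,
    score: hit.score,
    document: hit.document,
  }));
}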
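
To make the `hybridWeights` and `similarity` parameters concrete, the following sketch shows one common way to blend a keyword score with cosine similarity. It illustrates the general weighted-sum technique only and is not Orama's internal ranking; `Candidate`, `cosineSimilarity`, and `rankHybrid` are hypothetical names, and normalizing text scores by the maximum is an assumption made for this example.

```typescript
interface Candidate {
  id: string;
  textScore: number;   // raw keyword score, e.g. BM25 (any non-negative scale)
  embedding: number[]; // document vector
}

interface HybridWeights {
  text: number;
  vector: number;
}

// Cosine similarity of two equal-length vectors; 0 if either has zero norm.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error('vector length mismatch');
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Ranks candidates by weights.text * normalizedText + weights.vector * similarity.
// Similarities below minSimilarity contribute nothing to the vector part.
function rankHybrid(
  candidates: Candidate[],
  queryEmbedding: number[],
  weights: HybridWeights = { text: 0.3, vector: 0.7 },
  minSimilarity = 0.7,
): { id: string; score: number }[] {
  const maxText = Math.max(0, ...candidates.map((c) => c.textScore));
  return candidates
    .map((c) => {
      const textPart = maxText > 0 ? c.textScore / maxText : 0;
      const similarity = cosineSimilarity(c.embedding, queryEmbedding);
      const vectorPart = similarity >= minSimilarity ? similarity : 0;
      return { id: c.id, score: weights.text * textPart + weights.vector * vectorPart };
    })
    .sort((x, y) => y.score - x.score);
}
```

With the default 0.3/0.7 split, a note that matches the query semantically can outrank one with a stronger keyword match; shifting the weights toward `text` reverses that preference.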

4. RAG context builder for AI harness

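// src/lib/search/rag-context.ts
import { hybridSearch } from './progressive';
import { createOramaIndex } from './orama-index';

// Database type derived from the factory in orama-index.ts.
type OramaDB = Awaited<ReturnType<typeof createOramaIndex>>;

// Returns a markdown block with up to `maxChunks` notes, or '' when nothing matches.
export async function buildRagContext(
  db: OramaDB,
  userQuery: string,
  maxChunks = 3
): Promise<string> {
  const results = await hybridSearch(db, userQuery, { limit: maxChunks });
  if (results.length === 0) return '';

  // Each note is truncated to its first 500 characters to bound prompt size.
  const context = results
    .map((r, i) => `[${i + 1}] **${r.document.title}**\n${r.document.content.slice(0, 500)}`)
    .join('\n\n');

  return `## Relevant Knowledge Base Context\n\n${context}\n\n---`;
}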
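
Because the acceptance criteria call for unit tests with mock notes, it may help to keep the formatting logic pure and separate from retrieval. The sketch below assumes hypothetical helpers `formatRagContext` and `withRagContext` and a hypothetical `RagHit` shape that mirrors what `hybridSearch()` returns; neither helper is part of the proposal itself.

```typescript
// Hypothetical hit shape mirroring the objects returned by hybridSearch().
interface RagHit {
  id: string;
  score: number;
  document: { title: string; content: string };
}

// Formats already-retrieved hits; returns '' when there is nothing to add.
function formatRagContext(hits: RagHit[], maxCharsPerNote = 500): string {
  if (hits.length === 0) return '';
  const context = hits
    .map(
      (hit, i) =>
        `[${i + 1}] **${hit.document.title}**\n${hit.document.content.slice(0, maxCharsPerNote)}`,
    )
    .join('\n\n');
  return `## Relevant Knowledge Base Context\n\n${context}\n\n---`;
}

// Adds the context after the base system prompt, or returns the base unchanged.
function withRagContext(baseSystemPrompt: string, ragContext: string): string {
  return ragContext ? `${baseSystemPrompt}\n\n${ragContext}` : baseSystemPrompt;
}
```

A test can then pass hand-written hits to `formatRagContext()` without loading the embedding model or building an index.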

5. Integrate into AIHarness

Before sending the user message to the LLM, append the RAG context to the base system prompt:

const ragContext = await buildRagContext(oramaDb, userMessage);
const systemPromptWithContext = ragContext 
  ? `${baseSystemPrompt}\n\n${ragContext}`
  : baseSystemPrompt;

6. Persistence strategy

  • Serialize/deserialize the Orama index to IndexedDB on app close/open
  • Use @orama/plugin-data-persistence or custom JSON export
  • Trigger re-indexing when notes are created, updated, or deleted
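
The persistence bullets above can be sketched as a versioned snapshot round trip. This is a minimal sketch under stated assumptions: `SnapshotStore`, `MemoryStore`, `saveSnapshot`, `loadSnapshot`, and `SNAPSHOT_VERSION` are hypothetical names, the store is synchronous for brevity (an IndexedDB-backed store would be asynchronous), and the `index` payload stands in for whatever raw data the Orama export produces.

```typescript
// Bump this when the schema or embedding model changes, so stale snapshots
// are discarded and the index is rebuilt.
const SNAPSHOT_VERSION = 1;

interface SnapshotEnvelope {
  version: number;
  savedAt: number;
  index: unknown; // raw serialized index data (assumed JSON-compatible)
}

// Hypothetical key-value store; stands in for IndexedDB in this sketch.
interface SnapshotStore {
  get(key: string): string | undefined;
  set(key: string, value: string): void;
}

class MemoryStore implements SnapshotStore {
  private readonly data = new Map<string, string>();
  get(key: string): string | undefined {
    return this.data.get(key);
  }
  set(key: string, value: string): void {
    this.data.set(key, value);
  }
}

function saveSnapshot(store: SnapshotStore, key: string, index: unknown, now = Date.now()): void {
  const envelope: SnapshotEnvelope = { version: SNAPSHOT_VERSION, savedAt: now, index };
  store.set(key, JSON.stringify(envelope));
}

// Returns the stored index data, or null when the caller should re-index
// (nothing stored, corrupt JSON, or a snapshot from another version).
function loadSnapshot(store: SnapshotStore, key: string): unknown {
  const raw = store.get(key);
  if (raw === undefined) return null;
  try {
    const parsed: unknown = JSON.parse(raw);
    if (parsed === null || typeof parsed !== 'object') return null;
    const envelope = parsed as Partial<SnapshotEnvelope>;
    if (envelope.version !== SNAPSHOT_VERSION || envelope.index === undefined) return null;
    return envelope.index;
  } catch {
    return null;
  }
}
```

Returning `null` for every failure mode gives the caller one simple rule: if no usable snapshot comes back, run a full re-index.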

Performance Notes

  • Xenova/all-MiniLM-L6-v2 runs fully in-browser via ONNX Runtime (no API calls)
  • Model download ~23MB, cached in browser after first load
  • Show loading indicator on first-time model initialization
  • Consider lazy-loading embeddings worker via new Worker()
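
The lazy-loading and loading-indicator points above can be sketched as a load-once wrapper. `createLazyLoader` and `LoadState` are hypothetical names; in the app, the `load` callback would presumably start the worker or initialize the embedding model, which this sketch leaves to the caller.

```typescript
type LoadState = 'loading' | 'ready' | 'error';

// Wraps an expensive async initializer so it runs at most once per success.
// Concurrent callers share the same promise; a failed load can be retried.
function createLazyLoader<T>(
  load: () => Promise<T>,
  onState: (state: LoadState) => void = () => {},
): () => Promise<T> {
  let pending: Promise<T> | undefined;
  return () => {
    if (pending === undefined) {
      onState('loading'); // hook for showing the first-time loading indicator
      pending = load().then(
        (value) => {
          onState('ready');
          return value;
        },
        (error: unknown) => {
          pending = undefined; // forget the failed attempt so the next call retries
          onState('error');
          throw error;
        },
      );
    }
    return pending;
  };
}
```

Nothing is downloaded until the returned function is first called, so search can trigger model initialization on demand instead of at app startup.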

Acceptance Criteria

  • Orama index configured with pluginEmbeddings and all-MiniLM-L6-v2
  • Notes indexed with embeddings on create/update/delete
  • hybridSearch() returns ranked results by BM25 + cosine similarity
  • buildRagContext() output (relevant notes) is appended to the AI harness system prompt
  • Index persisted to IndexedDB across page reloads
  • Loading indicator shown during first-time model download
  • Unit tests for indexing and search with mock notes
  • Search UI updated to use hybridSearch instead of basic keyword matching
