feat: Implement Orama RAG pipeline with HuggingFace embeddings for semantic search

## Problem

`@orama/orama`, `@orama/plugin-embeddings`, and `@huggingface/transformers` are all installed as dependencies, but the semantic/vector search pipeline is not implemented. `src/lib/search/orama-index.ts` only sets up a basic keyword index, and `searchKnowledge()` only does FTS5-style text matching. The RAG (Retrieval-Augmented Generation) pipeline that these packages enable is missing entirely.

## Current State

```
src/lib/search/
  orama-index.ts     # basic keyword schema only (~60 lines)
  progressive.ts     # text search, no vector/hybrid ranking
```

Installed but unused: `@orama/plugin-embeddings`, `@huggingface/transformers`

## Proposed Implementation

### 1. Configure Orama with embedding plugin

```ts
// src/lib/search/orama-index.ts
import { create, insert, search } from '@orama/orama';
import { pluginEmbeddings } from '@orama/plugin-embeddings';

export async function createOramaIndex() {
  const db = await create({
    schema: {
      id: 'string',
      title: 'string',
      content: 'string',
      tags: 'string[]',
      createdAt: 'number',
      embedding: 'vector[384]',  // all-MiniLM-L6-v2 output dimension
    } as const,
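    plugins: [
      // Awaited in case the plugin factory is async (harmless if it is not).
      // The `model` options below are the shape this proposal assumes; check
      // them against the installed @orama/plugin-embeddings version.
      await pluginEmbeddings({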
        embeddings: {
          defaultProperty: 'embedding',
          model: {
            modelPath: 'Xenova/all-MiniLM-L6-v2',
            document: {
              indexedProperties: ['title', 'content'],
            },
            query: {
              property: 'embedding',
            },
          },
        },
      }),
    ],
  });
  return db;
}
```

### 2. Index notes with embeddings on insert/update

```ts
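// src/lib/search/index-manager.ts
import { insert, remove } from '@orama/orama';
import type { createOramaIndex } from './orama-index';

// `Note` is the app's existing note type; its import is omitted here.
type OramaDB = Awaited<ReturnType<typeof createOramaIndex>>;

export async function indexNote(db: OramaDB, note: Note) {
  // insert() rejects an id that is already indexed, so drop any previous
  // version first (expected to be a no-op when the id is not indexed).
  await remove(db, note.id);
  await insert(db, {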
    id: note.id,
    title: note.title,
    content: note.content,
    tags: note.tags ?? [],
    createdAt: note.createdAt,
    // embedding generated automatically by plugin
  });
}

export async function reindexAll(db: OramaDB, notes: Note[]) {
  for (const note of notes) {
    await indexNote(db, note);
  }
}
```
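`reindexAll` re-embeds every note, which may be slow once the model is involved. One possible refinement is to diff the notes against what is already indexed and only touch what changed. The sketch below is a hypothetical helper, not part of Orama: `planReindex`, `NoteVersion`, and the `updatedAt` field are assumptions (the proposal only shows `createdAt`), and the `indexed` map of id to last-indexed timestamp would have to be maintained by the app.

```typescript
// Hypothetical shape: `updatedAt` is an assumed field, not shown in the proposal.
interface NoteVersion {
  id: string;
  updatedAt: number;
}

interface ReindexPlan {
  toInsert: string[];
  toUpdate: string[];
  toRemove: string[];
}

// Compare the ids/timestamps already in the index with the current notes
// and report which ids need inserting, re-embedding, or removing.
function planReindex(
  indexed: ReadonlyMap<string, number>,
  notes: readonly NoteVersion[],
): ReindexPlan {
  const plan: ReindexPlan = { toInsert: [], toUpdate: [], toRemove: [] };
  const seen = new Set<string>();

  for (const note of notes) {
    seen.add(note.id);
    const indexedAt = indexed.get(note.id);
    if (indexedAt === undefined) {
      plan.toInsert.push(note.id); // never indexed
    } else if (indexedAt < note.updatedAt) {
      plan.toUpdate.push(note.id); // edited since it was last indexed
    }
  }

  // Anything indexed but no longer present was deleted.
  for (const id of indexed.keys()) {
    if (!seen.has(id)) plan.toRemove.push(id);
  }
  return plan;
}
```

Ids in `toInsert` and `toUpdate` would go through `indexNote`, and ids in `toRemove` would be removed from the index.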

### 3. Hybrid search (BM25 + cosine similarity)

```ts
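// src/lib/search/progressive.ts
import { search } from '@orama/orama';
import type { createOramaIndex } from './orama-index';

type OramaDB = Awaited<ReturnType<typeof createOramaIndex>>;
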
export async function hybridSearch(
  db: OramaDB,
  query: string,
  options: {
    limit?: number;
    hybridWeights?: { text: number; vector: number };
    tags?: string[];
  } = {}
) {
  const { limit = 10, hybridWeights = { text: 0.3, vector: 0.7 } } = options;

  const results = await search(db, {
    term: query,
    mode: 'hybrid',
    similarity: 0.7,
    limit,
    hybridWeights,
    where: options.tags?.length ? { tags: { containsAll: options.tags } } : undefined,
  });

  return results.hits.map(hit => ({
    id: hit.id,
    score: hit.score,
    document: hit.document,
  }));
}
```
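To make the `hybridWeights` default of `{ text: 0.3, vector: 0.7 }` more concrete, here is a minimal sketch of cosine similarity and a weighted sum of two already-normalized scores. It only illustrates the idea; Orama's internal normalization and ranking may differ, and `cosineSimilarity` and `fuseScores` are hypothetical helper names.

```typescript
// Cosine similarity between two equal-length vectors (0 if either is all zeros).
function cosineSimilarity(a: readonly number[], b: readonly number[]): number {
  if (a.length !== b.length) {
    throw new Error('Vectors must have the same length');
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Weighted sum of a text score and a vector score, both assumed to be in [0, 1].
function fuseScores(
  textScore: number,
  vectorScore: number,
  weights: { text: number; vector: number } = { text: 0.3, vector: 0.7 },
): number {
  return weights.text * textScore + weights.vector * vectorScore;
}
```

With these weights, a note that matches semantically but shares few keywords can still outrank a note that matches on keywords alone.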

### 4. RAG context builder for AI harness

```ts
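// src/lib/search/rag-context.ts
import { hybridSearch } from './progressive';
import type { createOramaIndex } from './orama-index';

type OramaDB = Awaited<ReturnType<typeof createOramaIndex>>;
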
export async function buildRagContext(
  db: OramaDB,
  userQuery: string,
  maxChunks = 3
): Promise<string> {
  const results = await hybridSearch(db, userQuery, { limit: maxChunks });
  if (results.length === 0) return '';
  
  const context = results
    .map((r, i) => `[${i + 1}] **${r.document.title}**\n${r.document.content.slice(0, 500)}`)
    .join('\n\n');
  
  return `## Relevant Knowledge Base Context\n\n${context}\n\n---`;
}
```
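For unit testing, it may help to split the string formatting out of `buildRagContext` so it can be exercised with mock notes and no index. The sketch below assumes a hypothetical `formatRagContext` helper and `RagHit` type; the output format is copied from the function above.

```typescript
// Hypothetical minimal hit shape: only the fields the formatter reads.
interface RagHit {
  title: string;
  content: string;
}

// Pure formatter: same output format as buildRagContext, but with no search call.
function formatRagContext(
  hits: readonly RagHit[],
  maxCharsPerChunk = 500,
): string {
  if (hits.length === 0) return '';

  const context = hits
    .map(
      (hit, i) =>
        `[${i + 1}] **${hit.title}**\n${hit.content.slice(0, maxCharsPerChunk)}`,
    )
    .join('\n\n');

  return `## Relevant Knowledge Base Context\n\n${context}\n\n---`;
}
```

`buildRagContext` could then call `hybridSearch` and pass `results.map(r => r.document)` to this helper.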

### 5. Integrate into AIHarness

Before sending the user message to the LLM, append the RAG context to the system prompt:

```ts
const ragContext = await buildRagContext(oramaDb, userMessage);
const systemPromptWithContext = ragContext 
  ? `${baseSystemPrompt}\n\n${ragContext}`
  : baseSystemPrompt;
```

### 6. Persistence strategy

- Serialize/deserialize the Orama index to IndexedDB on app close/open
- Use `@orama/plugin-data-persistence` or custom JSON export
- Trigger re-indexing when notes are created, updated, or deleted
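One thing to watch with a persisted index is that stored embeddings are only valid for the model that produced them. A possible safeguard is to wrap the serialized index in a small envelope and discard it on mismatch, which would then fall back to a full re-index. This is a sketch under assumptions: `IndexSnapshot`, `wrapSnapshot`, and `unwrapSnapshot` are hypothetical names, and `data` stands for whatever the chosen export mechanism produces.

```typescript
// Hypothetical envelope around the exported index; not an Orama type.
interface IndexSnapshot {
  formatVersion: number;
  modelPath: string;
  data: unknown;
}

const SNAPSHOT_FORMAT_VERSION = 1;

// Serialize the exported index together with the model that produced its vectors.
function wrapSnapshot(data: unknown, modelPath: string): string {
  const snapshot: IndexSnapshot = {
    formatVersion: SNAPSHOT_FORMAT_VERSION,
    modelPath,
    data,
  };
  return JSON.stringify(snapshot);
}

// Return the stored index data, or null if the snapshot is unreadable,
// from another format version, or built with a different embedding model.
function unwrapSnapshot(json: string, modelPath: string): unknown | null {
  let parsed: unknown;
  try {
    parsed = JSON.parse(json);
  } catch {
    return null;
  }
  if (typeof parsed !== 'object' || parsed === null) return null;

  const snapshot = parsed as Partial<IndexSnapshot>;
  if (snapshot.formatVersion !== SNAPSHOT_FORMAT_VERSION) return null;
  if (snapshot.modelPath !== modelPath) return null;
  return snapshot.data ?? null;
}
```

A `null` result on app open would mean "rebuild from the notes" rather than "load from IndexedDB".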

## Performance Notes

- `Xenova/all-MiniLM-L6-v2` runs fully in-browser via ONNX Runtime (no API calls)
- Model download ~23MB, cached in browser after first load
- Show loading indicator on first-time model initialization
- Consider lazy-loading embeddings worker via `new Worker()`
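Whichever loading strategy is chosen, the model should only be initialized once even if several searches start at the same time. A minimal sketch of that, assuming a hypothetical `lazyOnce` helper wrapping whatever function actually loads the model or starts the worker:

```typescript
// Wrap an async loader so it runs at most once; concurrent callers share
// the same in-flight promise. A failed load is forgotten so it can be retried.
function lazyOnce<T>(load: () => Promise<T>): () => Promise<T> {
  let pending: Promise<T> | undefined;

  return () => {
    if (!pending) {
      pending = load().catch((err: unknown) => {
        pending = undefined; // allow a retry after a failed download
        throw err;
      });
    }
    return pending;
  };
}
```

The loading indicator can be tied to the returned promise: show it before awaiting the first call and hide it once the promise settles.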

## Acceptance Criteria

- [ ] Orama index configured with `pluginEmbeddings` and `all-MiniLM-L6-v2`
- [ ] Notes indexed with embeddings on create/update and removed from the index on delete
- [ ] `hybridSearch()` returns ranked results by BM25 + cosine similarity
- [ ] `buildRagContext()` output is added to the AI harness system prompt before each LLM call
- [ ] Index persisted to IndexedDB across page reloads
- [ ] Loading indicator shown during first-time model download
- [ ] Unit tests for indexing and search with mock notes
- [ ] Search UI updated to use `hybridSearch` instead of basic keyword matching
