diff --git a/.agents/skills/constructive-blueprints/references/blueprint-definition-format.md b/.agents/skills/constructive-blueprints/references/blueprint-definition-format.md
index 9e023d6..22d2e61 100644
--- a/.agents/skills/constructive-blueprints/references/blueprint-definition-format.md
+++ b/.agents/skills/constructive-blueprints/references/blueprint-definition-format.md
@@ -402,7 +402,7 @@ See [realtime-subscriptions.md](./realtime-subscriptions.md) for the full guide
 | `SearchUnified` | Orchestrates BM25 + trigram + FTS + composite field in one declaration | `source_fields` (optional, creates DataCompositeField first), `bm25` (sub-config), `trgm` (sub-config), `fts` (sub-config), `boost_recency` (optional `{"field": "updated_at"}`) |
 | `SearchVector` | `vector(N)` column + HNSW/IVFFlat index + stale tracking + job enqueue | `field_name` (default `'embedding'`), `dimensions` (default `768`), `index_method` (`'hnsw'`\|`'ivfflat'`), `metric` (`'cosine'`\|`'l2'`\|`'ip'`), `include_updated_at` (default `true`), `enqueue_job` (default `true`), `job_task_name` (default `'generate_embedding'`), `source_fields` (optional), `index_options` (optional), `chunks_config` (optional: `content_field_name`, `chunk_size`, `chunk_overlap`, `chunk_strategy`, `enqueue_chunking_job`, `chunking_task_name`) — see [`constructive-agents`](../../constructive-agents/SKILL.md) |
 | `SearchFullText` | `tsvector` column + GIN index + auto-update trigger | `field_name` (default `'search'`), `source_fields` (array of `{"field", "weight", "lang"}`), `lang_column` (optional — column name containing a `regconfig` value for dynamic per-row language stemming, e.g. `'lang_code'`), `search_score_weight` (default `1.0`) |
-| `SearchBm25` | BM25 (pg_search/ParadeDB) index on existing text field | `field_name` (required — must already exist), `text_config` (default `'english'`), `search_score_weight` (default `1.0`), `k1` (optional BM25 tuning), `b` (optional BM25 tuning) |
+| `SearchBm25` | BM25 (pg_textsearch) index on existing text field | `field_name` (required — must already exist), `text_config` (default `'english'`), `search_score_weight` (default `1.0`), `k1` (optional BM25 tuning), `b` (optional BM25 tuning) |
 | `SearchTrgm` | GIN trigram indexes on existing fields | `fields` (required, array of field names — must already exist). Sets `@trgmSearch` smart tag |
 | `SearchSpatial` | PostGIS `geometry`/`geography` column + GiST index | `field_name` (default `'geom'`), `geometry_type` (default `'Point'`), `srid` (default `4326`), `dimension` (default `2`), `use_geography` (default `false`), `index_method` (`'gist'`\|`'spgist'`) |
 | `SearchSpatialAggregate` | Materialized aggregate geometry on parent table + auto-update triggers | `field_name` (default `'geom_aggregate'`), `source_table_id` (required), `source_geom_field` (default `'geom'`), `source_fk_field` (optional), `aggregate_function` (default `'union'` — also `'collect'`, `'convex_hull'`, `'concave_hull'`), `geometry_type` (default `'MultiPolygon'`), `srid`, `dimension`, `use_geography`, `index_method` |
diff --git a/.agents/skills/constructive-features/SKILL.md b/.agents/skills/constructive-features/SKILL.md
index 664c3e5..69a3db0 100644
--- a/.agents/skills/constructive-features/SKILL.md
+++ b/.agents/skills/constructive-features/SKILL.md
@@ -140,7 +140,7 @@ When a feature is gated by a module, installing / omitting the module from a pre
 |---|---|---|---|
 | SearchUnified (orchestrated multi-algorithm) | `SearchUnified` blueprint node | — | [`constructive-agents`](../constructive-agents/SKILL.md) + [`constructive-search`](../constructive-search/SKILL.md) |
 | SearchFullText (tsvector + GIN) | `SearchFullText` blueprint node | — | [`constructive-platform`](../constructive-blueprints/references/blueprint-definition-format.md) |
-| SearchBm25 (ParadeDB / pg_search) | `SearchBm25` blueprint node | — | [`constructive-platform`](../constructive-blueprints/references/blueprint-definition-format.md) |
+| SearchBm25 (pg_textsearch) | `SearchBm25` blueprint node | — | [`constructive-platform`](../constructive-blueprints/references/blueprint-definition-format.md) |
 | SearchTrgm (trigram fuzzy) | `SearchTrgm` blueprint node | — | [`constructive-platform`](../constructive-blueprints/references/blueprint-definition-format.md) |
 | SearchVector (pgvector embeddings) | `SearchVector` blueprint node | — | [`constructive-agents`](../constructive-agents/SKILL.md) |
 | SearchSpatial (PostGIS geometry) | `SearchSpatial` blueprint node | — | [`constructive-platform`](../constructive-blueprints/references/blueprint-definition-format.md) |
diff --git a/.agents/skills/constructive-search/SKILL.md b/.agents/skills/constructive-search/SKILL.md
index d27cb8e..85cc7db 100644
--- a/.agents/skills/constructive-search/SKILL.md
+++ b/.agents/skills/constructive-search/SKILL.md
@@ -25,7 +25,7 @@ Use this skill when:
 | Strategy | Best For | Technology | Score |
 |----------|----------|------------|-------|
 | **TSVector** | Keyword search with stemming | PostgreSQL `tsvector` + GIN | Higher = better |
-| **BM25** | Relevance-ranked text search | ParadeDB `pg_search` | Higher = better |
+| **BM25** | Relevance-ranked text search | `pg_textsearch` (`<@>` operator) | Higher = better |
 | **Trigram** | Fuzzy / typo-tolerant matching | `pg_trgm` extension | Lower = better (distance) |
 | **pgvector** | Semantic / embedding similarity | `pgvector` HNSW | Lower = better (distance) |
 | **PostGIS** | Spatial / geographic search | `postgis` extension | Lower = better (distance) |
@@ -88,7 +88,7 @@ For tables needing only full-text search:
 Unified PostGraphile v5 search plugin that consolidates all strategies into a single adapter-based architecture. Each strategy is a `SearchAdapter`:
 
 - `TsvectorAdapter` — PostgreSQL full-text search
-- `Bm25Adapter` — ParadeDB BM25 ranking
+- `Bm25Adapter` — pg_textsearch BM25 ranking
 - `TrgmAdapter` — pg_trgm fuzzy matching
 - `PgvectorAdapter` — HNSW vector similarity
 - `PostgisAdapter` — spatial distance queries
diff --git a/features.md b/features.md
index dc2c83e..cccbdbe 100644
--- a/features.md
+++ b/features.md
@@ -271,11 +271,11 @@ Six search strategies, from keyword matching to semantic vector similarity, unif
 | Strategy | Technology | Best For |
 |----------|------------|----------|
 | **Full-text (tsvector)** | PostgreSQL `tsvector` + GIN index | Keyword search with language-aware stemming |
-| **BM25** | ParadeDB `pg_search` extension | Relevance-ranked full-text retrieval |
+| **BM25** | `pg_textsearch` extension (BM25 scoring via `<@>` operator) | Relevance-ranked full-text retrieval |
 | **Trigram** | `pg_trgm` extension + GIN index | Fuzzy matching, typo tolerance, autocomplete |
 | **Vector (pgvector)** | `pgvector` extension + HNSW index | Semantic similarity, embeddings, RAG |
 | **Spatial (PostGIS)** | `postgis` extension + GiST index | Geographic proximity, geofencing, spatial containment |
-| **Unified** | Composite of all above | Fan-out a single query across multiple algorithms with normalized scoring |
+| **Unified** | Composite of all above | Fan-out a single query across multiple algorithms with RRF (Reciprocal Rank Fusion) scoring |
 
 ### Vector Search Details
 
@@ -300,7 +300,7 @@ Custom PostgreSQL text search configurations are also supported.
 
 ### Unified Search
 
-`SearchUnified` orchestrates multiple algorithms in a single declaration — embedding + BM25 + optional full-text + optional trigram. Results are normalized to a 0–1 `searchScore` and accessible via a single `unifiedSearch` filter.
+`SearchUnified` orchestrates multiple algorithms in a single declaration — embedding + BM25 + optional full-text + optional trigram. Results are fused via Reciprocal Rank Fusion (RRF) — rank-based scoring that handles incompatible score scales (e.g. BM25 unbounded negatives vs tsvector [0,1]) by comparing rank positions, not raw scores. The composite `searchScore` (0–1) and `unifiedSearch` filter provide a single API for cross-algorithm search.
 
 ---
 
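
The Reciprocal Rank Fusion step described in the `features.md` change can be illustrated with a minimal Python sketch. This is not the plugin's implementation: the function names, the constant `k = 60` (a common default for RRF), and the divide-by-maximum step used to bring the fused score into 0–1 are all assumptions made for illustration. The diff only states that fusion is rank-based and that `searchScore` is 0–1.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse best-first ranked id lists (one per algorithm) into one score per id.

    Only rank positions are used, so raw scores on incompatible scales
    never have to be compared. `k` is an assumed smoothing constant.
    """
    scores = defaultdict(float)
    for ranked_ids in rankings.values():
        for rank, row_id in enumerate(ranked_ids, start=1):
            scores[row_id] += 1.0 / (k + rank)
    return dict(scores)


def normalize(scores):
    """Scale fused scores into 0-1 by dividing by the best score (assumed scheme)."""
    top = max(scores.values(), default=0.0)
    return {row_id: (s / top if top else 0.0) for row_id, s in scores.items()}


# Hypothetical per-algorithm results, each already sorted best-first.
rankings = {
    "bm25": ["a", "b", "c"],
    "tsvector": ["b", "a", "d"],
    "pgvector": ["b", "c", "a"],
}

fused = normalize(reciprocal_rank_fusion(rankings))
ordering = sorted(fused, key=fused.get, reverse=True)
print(ordering)  # ['b', 'a', 'c', 'd']
```

Row `b` wins because it ranks first or second in every list, even though no raw scores were compared. A row returned by only one algorithm (`d`) still receives a score, just a smaller one.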