Retrieval via VectorDB — a PromptQL Retrieval Bridge

Natural-language analytics over a support-ticket corpus, where the LLM plans and a vector index retrieves. Tokens are spent on reasoning — never on reading the haystack.

Why

Agents that answer questions over operational data usually do one of two expensive things:

RAG-stuff the context — shove retrieved text into the model and hope, or
Wander the database — let the LLM probe with exploratory SQL (SELECT DISTINCT component, ILIKE '%cert%', repeat), burning a model round-trip per guess.

This project takes a third path. Hasura PromptQL turns a question into a program (a query plan). That program gets exactly one retrieval primitive — search_documents — which is guaranteed to return the right candidate rows in one call. The corpus text never passes through the model: results flow into the program's variables and artifacts, and only the top-k evidence the model needs for its narrative ever reaches the context.

The LLM decides what to look for.
The vector index decides which rows match.
SQL does the math.

Architecture

 ┌────────────────────────────┐
 │  PromptQL playground (LLM) │   writes a query plan (Python)
 └─────────────┬──────────────┘
               │ executes plan
 ┌─────────────▼──────────────┐
 │     Hasura DDN engine      │   one governed API over both sources
 └──────┬──────────────┬──────┘
        │              │
 ┌──────▼─────┐  ┌─────▼─────────────────────┐
 │ Postgres   │  │ search connector (Python) │
 │ connector  │  │   RetrievalBridge         │
 │ tickets,   │  │   vector ANN + BM25,      │
 │ accounts   │  │   reciprocal-rank fusion  │
 └──────┬─────┘  └─────┬─────────────────────┘
        │              │
 ┌──────▼──────────────▼──────┐
 │  Postgres + pgvector       │   swappable: pgvector | LanceDB | turbopuffer
 └────────────────────────────┘

The retrieval bridge

search_documents is a single hybrid-retrieval command:

Vector similarity finds tickets by meaning ("TLS problems" matches "certificate expired on edge").
BM25 keyword match catches exact tokens embeddings smear out (ERR_DIM_384, ticket ids, plan-tier names).
Reciprocal-rank fusion blends both rankings client-side, so the backend is swappable: RETRIEVAL_BRIDGE_BACKEND selects pgvector, LanceDB, or turbopuffer — PromptQL never learns which one answered.

Each hit returns flat scalar fields including the ticket id, which joins straight back to the Postgres accounts model — so a fuzzy semantic query can flow into an exact revenue aggregation in one plan.

Run locally

Prereqs: Docker, the DDN CLI (ddn auth login), and a Postgres with pgvector holding the seeded retrieval_bridge corpus (tables: tickets, accounts, documents) on localhost:5432.

cp .env.example .env        # or edit .env: JDBC URL + pgvector DSN
ddn run docker-start        # builds and starts engine, connectors, playground
ddn console --local         # opens the PromptQL chat UI

Endpoints: engine GraphQL :3280, PromptQL playground :3282, search connector :8787, pg connector :8788.

Questions to try

What are the 10 most common ticket problems I have currently?
Find past incidents that look like TLS or certificate problems.
Which enterprise accounts were affected by certificate-related incidents, and what's the total monthly revenue at risk?

That last one is the point of the project: semantic search → join to accounts → SUM(monthly_revenue), in a single deterministic plan, with the LLM never seeing more than the top-k hits.

License

MIT

Name		Name	Last commit message	Last commit date
Latest commit History 71 Commits
.devcontainer		.devcontainer
.hasura		.hasura
.vscode		.vscode
app		app
docs		docs
engine		engine
globals		globals
promptql-playground-patch		promptql-playground-patch
scripts		scripts
.env.example		.env.example
.gitattributes		.gitattributes
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
compose.yaml		compose.yaml
hasura.yaml		hasura.yaml
otel-collector-config.yaml		otel-collector-config.yaml
start-demo.sh		start-demo.sh
supergraph.yaml		supergraph.yaml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Retrieval via VectorDB — a PromptQL Retrieval Bridge

Why

Architecture

The retrieval bridge

Run locally

Questions to try

License

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Retrieval via VectorDB — a PromptQL Retrieval Bridge

Why

Architecture

The retrieval bridge

Run locally

Questions to try

License

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages