SwarnDB

The vector database that thinks in graphs.

What is SwarnDB

SwarnDB is a high-performance vector database written in Rust that combines HNSW and IVF+PQ indexing with a virtual graph layer and 15+ built-in vector math operations. Unlike traditional vector databases that stop at nearest-neighbor search, SwarnDB automatically computes relationships between vectors and exposes them as a traversable graph.

One engine, three capabilities: vector search, graph traversal, and vector mathematics.

Platform notes: CI does not build a macOS Intel (x86_64) wheel, so the macOS wheel currently targets Apple Silicon only. Intel-Mac users can run the manual release script on an x86_64 macOS host or wait for native support.

CI does not build a Windows ARM64 wheel either, so the Windows wheel currently targets x86_64 only. Windows on ARM hosts can run the x86_64 wheel under Windows' built-in x86 emulation or wait for native support.

Why SwarnDB

Vector search + graph traversal in one engine. The virtual graph layer computes nearest-neighbor edges and threshold-filtered relationships automatically. Query vectors, then traverse their connections; no external graph database required.
15+ vector math operations built in. Ghost vectors, cone search, SLERP interpolation, k-means, PCA, maximal marginal relevance, centroid computation, vector drift detection, and more.
Billion-scale without compromise. IVF + HNSW + product quantization keeps memory bounded while maintaining high recall on datasets with hundreds of millions of vectors.
Rust-native performance with SIMD acceleration. AVX2, SSE4.1, NEON, and scalar fallback. Zero-copy mmap, arena allocators, DashMap-based sharded concurrent maps, and fine-grained HNSW locking.
Dual API: gRPC + REST. High-throughput gRPC for production pipelines. REST for rapid prototyping, debugging, and curl-friendly workflows.
File-based bulk ingestion for large loads. Stage vectors as a .npy or flat .f32 file on a path the server can read, then call bulk_insert_from_path. The server reads the file via memory mapping, so working memory during the load is bounded by the index being built rather than by the input file size.
Fast restart and transparent crash recovery. Plain HNSW collections become queryable within seconds of the server opening its ports. Multi-collection databases load collections in parallel during startup. Crash recovery via incremental delta replay or full write-ahead log replay happens automatically, and is observable through dedicated readiness, recovery, and persistence endpoints suited to Kubernetes orchestration.

Performance

The table below reports search throughput, recall, and latency on DBpedia 1M (1536-dimensional float32 vectors) with cosine distance and default HNSW parameters (M=16, ef_construction=200). Measurements were taken on a 32-core, 64 GB host with 8 concurrent searcher threads, using 1,000 queries per ef_search setting and averaging across 3 iterations:

ef_search	QPS	Recall@10	p50 (ms)	p95 (ms)	p99 (ms)
25	2,398	0.9816	3.16	4.91	6.06
50	2,214	0.9894	3.33	5.26	6.77
100	1,801	0.9921	4.16	6.85	8.02
200	1,233	0.9935	6.18	10.19	12.26
400	760	0.9960	10.00	16.83	20.48
800	437	0.9974	17.42	30.43	35.90

Reproduce with python benchmark/qps_vs_recall.py --workers 8 --n-queries 1000 --iterations 3 --ef-search-list 25,50,100,200,400,800.

For the full benchmark page (worker saturation, ingestion via bulk_insert_from_path, restart and recovery timings, memory behavior, reproduction recipes), see Benchmarks.
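
The table can also be read as a lookup when tuning ef_search per query. As a small illustration, the sketch below picks the highest-throughput setting that still meets a recall target and an optional p99 budget. The helper name pick_ef_search is hypothetical, and the rows are copied from the table above, so the answers only hold for that dataset and host.

```python
# Rows copied from the benchmark table above: (ef_search, QPS, recall@10, p99 ms).
BENCH = [
    (25, 2398, 0.9816, 6.06),
    (50, 2214, 0.9894, 6.77),
    (100, 1801, 0.9921, 8.02),
    (200, 1233, 0.9935, 12.26),
    (400, 760, 0.9960, 20.48),
    (800, 437, 0.9974, 35.90),
]


def pick_ef_search(min_recall, max_p99_ms=None):
    """Return the benchmark row with the highest QPS that meets the targets.

    Hypothetical helper for illustration; returns None if no row qualifies.
    """
    candidates = [
        row for row in BENCH
        if row[2] >= min_recall and (max_p99_ms is None or row[3] <= max_p99_ms)
    ]
    return max(candidates, key=lambda row: row[1]) if candidates else None


print(pick_ef_search(0.99))       # (100, 1801, 0.9921, 8.02)
print(pick_ef_search(0.995, 15))  # None: no row has recall >= 0.995 and p99 <= 15 ms
```

On other hardware or data the same selection logic applies, but the rows should come from your own run of the benchmark script.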

Quick Start

Pull and run from Docker Hub:

docker run -d -p 8080:8080 -p 50051:50051 sarthiai/swarndb

Verify it is running:

curl http://localhost:8080/health

See Docker Guide for persistence, configuration, and Docker Compose setup.

Create Your First Collection and Search

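1. Create a collection. The sample vectors in the next steps have four values, so the collection is created with dimension 4; use your embedding model's dimension (for example 384) for real data:

curl -X POST http://localhost:8080/api/v1/collections \
  -H "Content-Type: application/json" \
  -d '{
    "name": "articles",
    "dimension": 4,
    "distance_metric": "cosine"
  }'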

2. Insert vectors:

curl -X POST http://localhost:8080/api/v1/collections/articles/vectors \
  -H "Content-Type: application/json" \
  -d '{
    "id": 1,
    "values": [0.1, 0.2, 0.3, 0.4],
    "metadata": {"topic": "physics", "year": 2024}
  }'

3. Search:

curl -X POST http://localhost:8080/api/v1/collections/articles/search \
  -H "Content-Type: application/json" \
  -d '{
    "query": [0.1, 0.2, 0.3, 0.4],
    "k": 10
  }'
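
The same REST calls can be issued from Python using only the standard library. The sketch below is a minimal illustration rather than an official client: the paths and JSON fields are taken from the curl commands above, the helper names build_request and send are hypothetical, and it assumes the server answers with a JSON body.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8080/api/v1"  # REST port from the Quick Start


def build_request(path, payload):
    """Build a JSON POST request for a REST path (hypothetical helper)."""
    return urllib.request.Request(
        url=f"{BASE_URL}{path}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=10.0):
    """Send the request; assumes the server replies with a JSON body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Same payload as the search step above; call send(search) against a running server.
search = build_request(
    "/collections/articles/search",
    {"query": [0.1, 0.2, 0.3, 0.4], "k": 10},
)
print(search.get_method(), search.full_url)
```

Building the request separately from sending it keeps the payload easy to inspect or log before it goes over the wire.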

Python SDK

pip install swarndb

from swarndb import SwarnDBClient

with SwarnDBClient(host="localhost", port=50051) as client:
    # Create a collection
    client.collections.create("articles", dimension=384, distance_metric="cosine")

    # Insert vectors
    client.vectors.insert("articles", vector=[0.1, 0.2, ...], metadata={"topic": "physics"})
    client.vectors.insert("articles", vector=[0.3, 0.1, ...], metadata={"topic": "math"})
    client.vectors.insert("articles", vector=[0.2, 0.4, ...], metadata={"topic": "physics"})

    # Search
    results = client.search.query("articles", vector=[0.1, 0.2, ...], k=10)
    for r in results.results:
        print(r.id, r.score)  # distance score (lower = more similar)

    # Graph: set a similarity threshold, then traverse relationships
    client.graph.set_threshold("articles", threshold=0.85)
    client.collections.optimize("articles")
    edges = client.graph.get_related("articles", vector_id=1)
    for edge in edges:
        print(edge.target_id, edge.similarity)

    # Search with graph-enriched results
    results = client.search.query(
        "articles",
        vector=[0.1, 0.2, ...],
        k=10,
        include_graph=True,
        graph_threshold=0.85,
    )

Async support is available via AsyncSwarnDBClient with the same API surface.

Architecture

SwarnDB is organized as seven Rust crates with clean dependency boundaries:

Crate	Role
`vf-core`	Core types, distance functions, SIMD kernels
`vf-storage`	WAL, segment management, memory-mapped I/O, collections
`vf-index`	HNSW and brute-force index implementations
`vf-query`	Filter evaluation, query execution, batch processing
`vf-quantization`	Scalar, product, and binary quantization; IVF partitioning
`vf-graph`	Virtual relationship graph, traversal algorithms
`vf-server`	gRPC and REST servers, authentication, health checks

Key Capabilities

Ingestion

Single insert for one-at-a-time writes via gRPC or REST
Streaming bulk insert with batched gRPC streams, configurable batch lock size, write-ahead log flush interval, and optional parallel HNSW construction
File-based bulk insert via bulk_insert_from_path: the server reads a .npy or flat .f32 file from any path it can read and ingests directly from the kernel page cache, without copying the payload through gRPC
Deferred indexing during bulk loads, finalized by a single optimize() call that rebuilds the HNSW index and the metadata index, with the virtual graph rebuilt on the same call when rebuild_graph=true is passed
Bulk insert checkpoints and resume via per-batch checkpoints and an opaque resume_token returned in the bulk-insert response, so interrupted loads can pick up from the last committed batch
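
To stage data for file-based ingestion, the vectors first have to be written to disk. The sketch below writes a flat float32 file with the standard library only. It assumes that "flat .f32" means headerless, row-major, little-endian float32 values with the dimension supplied separately; that layout is an assumption, not something stated here, so check it against the ingestion documentation. The helpers write_f32 and read_f32 are hypothetical, and the sketch does not call bulk_insert_from_path because its signature is not shown in this README.

```python
import os
import struct
import sys
import tempfile
from array import array


def write_f32(path, vectors):
    """Write vectors as headerless little-endian float32, row after row.

    The layout is an assumption about what a "flat .f32" file means.
    """
    dim = len(vectors[0])
    with open(path, "wb") as f:
        for vec in vectors:
            if len(vec) != dim:
                raise ValueError("all vectors must share one dimension")
            f.write(struct.pack(f"<{dim}f", *vec))
    return dim


def read_f32(path, dim):
    """Read the file back into a list of rows, as a sanity check."""
    flat = array("f")
    with open(path, "rb") as f:
        flat.frombytes(f.read())
    if sys.byteorder == "big":
        flat.byteswap()  # the file is little-endian
    return [flat[i:i + dim].tolist() for i in range(0, len(flat), dim)]


path = os.path.join(tempfile.mkdtemp(), "articles.f32")
dim = write_f32(path, [[0.5, 0.25, 0.0, 1.0], [1.0, 0.0, -0.5, 2.0]])
print(os.path.getsize(path))  # 2 vectors * 4 dims * 4 bytes = 32
print(read_f32(path, dim))
```

The file size check (vectors × dimension × 4 bytes) is a cheap way to catch a dimension mismatch before asking the server to ingest the file.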

Restart and Recovery

Fast restart for plain HNSW collections, queryable within seconds of the server opening its ports
Parallel collection load at startup, so a database with many collections comes up in parallel rather than serially
Incremental delta replay or full write-ahead log replay on unclean shutdown, applied transparently before traffic resumes
Operational endpoints for orchestration: Kubernetes-style /healthz, /readyz, /startupz; a global /recovery_status; a per-collection GET /api/v1/collections/{collection}/persistence_status; and Prometheus metrics at /metrics
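
Outside Kubernetes, a deployment script can poll the readiness endpoint before sending traffic. The following is a minimal sketch under stated assumptions: /readyz is taken from the list above, but treating HTTP 200 as "ready" follows the usual Kubernetes probe convention rather than anything confirmed here, and the helper names http_probe and wait_until_ready are hypothetical.

```python
import time
import urllib.error
import urllib.request


def http_probe(url, timeout=2.0):
    """Return True if GET url answers with HTTP 200, False otherwise."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        return False


def wait_until_ready(probe, attempts=30, interval_s=1.0, sleep=time.sleep):
    """Poll probe() until it returns True; return the attempt number or None."""
    for attempt in range(1, attempts + 1):
        if probe():
            return attempt
        sleep(interval_s)
    return None


# Against a live server (assumed REST port 8080):
#   wait_until_ready(lambda: http_probe("http://localhost:8080/readyz"))

# Offline demonstration with a fake probe that succeeds on the third call.
answers = iter([False, False, True])
print(wait_until_ready(lambda: next(answers), sleep=lambda s: None))  # 3
```

Passing the probe in as a callable keeps the retry loop testable without a running server.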

Vector Operations

HNSW index with configurable ef_construction, ef_search, and M parameters
IVF + Product Quantization for billion-scale datasets with bounded memory
Batch search with multi-query execution and shared overhead
Pre-filtering with adaptive index selection (B-tree, hash, bitmap) for metadata-filtered queries
Per-query ef_search to tune recall/latency tradeoff at query time
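
To see why product quantization matters for bounded memory, it helps to work through the arithmetic. The sketch below compares raw float32 storage with PQ code storage for one billion 1536-dimensional vectors. The choice of 96 subquantizers with 8-bit codes is a hypothetical setting picked for illustration, not a SwarnDB default, and the estimate ignores codebooks, IVF lists, and graph overhead.

```python
def raw_bytes(n_vectors, dim, bytes_per_value=4):
    """Memory for uncompressed float32 vectors."""
    return n_vectors * dim * bytes_per_value


def pq_code_bytes(n_vectors, n_subquantizers, bits_per_code=8):
    """Memory for PQ codes only (codebooks and index overhead excluded)."""
    return n_vectors * n_subquantizers * bits_per_code // 8


n, dim = 1_000_000_000, 1536
m = 96  # hypothetical subquantizer count: 16 dimensions per sub-vector

raw = raw_bytes(n, dim)
pq = pq_code_bytes(n, m)
print(f"raw float32: {raw / 1e12:.2f} TB")  # 6.14 TB
print(f"PQ codes:    {pq / 1e9:.0f} GB")     # 96 GB
print(f"compression: {raw // pq}x")          # 64x
```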

Virtual Graph Layer

Automatic relationship computation from HNSW structure with configurable similarity thresholds
Graph traversal via BFS/DFS across vector relationships for multi-hop discovery
Threshold-based filtering with per-collection, per-query, and per-vector precedence
Graph-enriched search where results are automatically annotated with related vectors and edge weights
Deferred graph mode for batch inserts that defer graph computation until optimize() is called
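
The idea behind threshold-filtered edges and multi-hop traversal can be shown on a toy dataset. The sketch below is a conceptual illustration only: it builds edges by brute-force cosine similarity, whereas the text above says SwarnDB derives them from the HNSW structure, and the function names build_edges and bfs are hypothetical.

```python
import math
from collections import deque


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def build_edges(vectors, threshold):
    """Connect every pair whose cosine similarity is at least threshold.

    Brute force for clarity; not how a production index would do it.
    """
    ids = sorted(vectors)
    edges = {i: [] for i in ids}
    for pos, i in enumerate(ids):
        for j in ids[pos + 1:]:
            if cosine(vectors[i], vectors[j]) >= threshold:
                edges[i].append(j)
                edges[j].append(i)
    return edges


def bfs(edges, start, max_hops):
    """Return {vector_id: hop_count} for everything within max_hops of start."""
    hops = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if hops[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbor in edges[node]:
            if neighbor not in hops:
                hops[neighbor] = hops[node] + 1
                queue.append(neighbor)
    return hops


vectors = {
    1: [1.0, 0.0],
    2: [0.9, 0.3],
    3: [0.6, 0.6],
    4: [0.3, 0.9],
}
edges = build_edges(vectors, threshold=0.85)
print(edges)                             # {1: [2], 2: [1, 3], 3: [2, 4], 4: [3]}
print(bfs(edges, start=1, max_hops=2))   # {1: 0, 2: 1, 3: 2}
```

Vector 4 is not similar enough to vector 1 to be a direct neighbor, but it becomes reachable once the hop limit is raised to 3, which is the kind of multi-hop discovery a plain nearest-neighbor query would miss.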

Math Engine

15+ vector math operations available through both gRPC and REST APIs:

Operation	Description
Ghost vectors	Synthetic vectors representing absent concepts in a space
Cone search	Angular proximity search within a cone aperture
SLERP interpolation	Spherical linear interpolation between vectors
Centroid computation	Weighted and unweighted centroids of vector sets
Vector drift detection	Track how vector representations change over time
K-means clustering	Partition vectors into k clusters
PCA	Dimensionality reduction via principal component analysis
Analogy completion	Vector arithmetic for analogy tasks (A:B :: C:?)
Maximal marginal relevance	Diversity-aware result re-ranking
Vector normalization	L2 normalization for angular similarity
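
Two of these operations are compact enough to sketch in plain Python. The code below shows the textbook formulations of SLERP and maximal marginal relevance; it is not SwarnDB's implementation, the function names and the lam parameter are hypothetical, and the SLERP sketch does not handle antiparallel inputs.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def normalize(v):
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


def slerp(a, b, t):
    """Spherical linear interpolation between two vectors, at fraction t."""
    a, b = normalize(a), normalize(b)
    theta = math.acos(max(-1.0, min(1.0, dot(a, b))))
    if theta < 1e-9:  # nearly parallel: nothing to interpolate
        return a
    wa = math.sin((1 - t) * theta) / math.sin(theta)
    wb = math.sin(t * theta) / math.sin(theta)
    return [wa * x + wb * y for x, y in zip(a, b)]


def mmr(query, candidates, k, lam=0.5):
    """Maximal marginal relevance over {id: vector}; returns k ids in pick order."""
    q = normalize(query)
    unit = {i: normalize(v) for i, v in candidates.items()}
    selected = []
    while len(selected) < min(k, len(unit)):
        def score(i):
            # Penalize similarity to anything already selected.
            redundancy = max((dot(unit[i], unit[s]) for s in selected), default=0.0)
            return lam * dot(q, unit[i]) - (1 - lam) * redundancy
        best = max((i for i in unit if i not in selected), key=score)
        selected.append(best)
    return selected


mid = slerp([1.0, 0.0], [0.0, 1.0], 0.5)
print([round(x, 4) for x in mid])  # [0.7071, 0.7071]

docs = {"a": [1.0, 0.0], "b": [0.99, 0.1], "c": [0.6, 0.8]}
print(mmr([0.9, 0.3], docs, k=2, lam=0.5))  # ['b', 'c']
print(mmr([0.9, 0.3], docs, k=2, lam=1.0))  # ['b', 'a']
```

With lam=1.0 the ranking is pure relevance, so the near-duplicate "a" follows "b"; with lam=0.5 the redundancy penalty promotes the more distinct "c" instead.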

SIMD Acceleration

All distance computations are SIMD-accelerated with runtime dispatch:

Instruction Set	Platform	Width
AVX2	x86_64	256-bit
SSE4.1	x86_64	128-bit
NEON	ARM / Apple Silicon	128-bit
Scalar	All platforms	Portable fallback

Specialized kernels include fused cosine distance (dot product + norms in a single pass), batched multi-vector distance computation, and SIMD gather for PQ distance table lookups.
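
The fused cosine kernel is easiest to understand in scalar form. The sketch below accumulates the dot product and both squared norms in a single loop, which is the access pattern a SIMD kernel would vectorize. It assumes cosine distance is defined as 1 minus cosine similarity, and the zero-vector convention is a choice made for this example rather than documented behavior.

```python
import math


def fused_cosine_distance(a, b):
    """Cosine distance with dot product and both squared norms in one loop."""
    dot = norm_a = norm_b = 0.0
    for x, y in zip(a, b):
        dot += x * y
        norm_a += x * x
        norm_b += y * y
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0  # convention chosen here for zero vectors
    return 1.0 - dot / math.sqrt(norm_a * norm_b)


print(fused_cosine_distance([1.0, 0.0], [0.0, 1.0]))  # 1.0
print(fused_cosine_distance([3.0, 4.0], [3.0, 4.0]))  # 0.0
```

A single pass reads each element once instead of three times, which matters when the vectors are large enough that memory bandwidth dominates.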

Configuration

All configuration is via environment variables. See .env.example for the full list.

Variable	Default	Description
`SWARNDB_HOST`	`0.0.0.0`	Bind address
`SWARNDB_GRPC_PORT`	`50051`	gRPC listener port
`SWARNDB_REST_PORT`	`8080`	REST listener port
`SWARNDB_DATA_DIR`	`./data`	Data storage directory
`SWARNDB_LOG_LEVEL`	`info`	Log verbosity (`trace`, `debug`, `info`, `warn`, `error`)
`SWARNDB_API_KEYS`	(empty)	Comma-separated API keys; empty disables auth
`SWARNDB_MAX_CONNECTIONS`	`1000`	Maximum concurrent connections
`SWARNDB_REQUEST_TIMEOUT_MS`	`10000`	Request timeout in milliseconds
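
When running the Docker image, these variables can be passed with docker's -e flag. The sketch below composes such a command in Python; it assumes the published image reads the same variables as the table above, the helper name docker_run_command is hypothetical, and the API keys are placeholders.

```python
import shlex


def docker_run_command(image="sarthiai/swarndb", env=None, rest_port=8080, grpc_port=50051):
    """Compose a `docker run` command that publishes both ports and sets env vars."""
    cmd = ["docker", "run", "-d",
           "-p", f"{rest_port}:{rest_port}",
           "-p", f"{grpc_port}:{grpc_port}"]
    for name, value in sorted((env or {}).items()):
        cmd += ["-e", f"{name}={value}"]
    cmd.append(image)
    return cmd


cmd = docker_run_command(env={
    "SWARNDB_LOG_LEVEL": "debug",
    "SWARNDB_API_KEYS": "key-one,key-two",  # placeholder keys
})
print(shlex.join(cmd))
```

Keeping the command as a list until the final join avoids quoting mistakes if a value ever contains spaces.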

API Reference

SwarnDB exposes dual API surfaces: gRPC on port 50051 and REST on port 8080.

Operation	gRPC Service	REST Endpoint
Collection CRUD	`CollectionService`	`POST/GET/DELETE /api/v1/collections`
Vector CRUD	`VectorService`	`POST/GET/DELETE /api/v1/collections/{id}/vectors`
Search	`SearchService`	`POST /api/v1/collections/{id}/search`
Batch search	`SearchService`	`POST /api/v1/search/batch`
Graph operations	`GraphService`	`POST/GET /api/v1/collections/{id}/graph/*`
Math operations	`MathService`	`POST /api/v1/collections/{id}/math/*`
Health / Readiness	`HealthService`	`GET /health`, `GET /ready`

For complete API documentation, see API Reference.

Documentation

Guide	Description
Getting Started	Installation, first steps, basic usage
Core Concepts	Collections, vectors, metadata, indexing
API Reference	Complete gRPC and REST API documentation
Python SDK	SDK installation, client usage, async support
Virtual Graph	Graph layer concepts, traversal, thresholds
Vector Math	All 15+ math operations with examples
Docker Guide	Docker setup, persistence, Compose, and building from source
Configuration	Environment variables and tuning guide
Deployment	Docker, Kubernetes, and Helm deployment
Benchmarks	Reference workloads, hardware, measured numbers, reproduction recipes

Issues and Feedback

Found a bug or have a feature request? Open an issue on GitHub Issues.

License

SwarnDB is licensed under the Elastic License 2.0 (ELv2).

The SwarnDB project is designed, developed and maintained by Chirotpal
