The vector database that thinks in graphs.
SwarnDB is a high-performance vector database written in Rust that combines HNSW and IVF+PQ indexing with a virtual graph layer and 15+ built-in vector math operations. Unlike traditional vector databases that stop at nearest-neighbor search, SwarnDB automatically computes relationships between vectors and exposes them as a traversable graph.
One engine, three capabilities: vector search, graph traversal, and vector mathematics.
- macOS Intel (x86_64) is not built by CI. Apple Silicon Macs only for the macOS wheel today. Intel-Mac users can run the manual release script on an x86_64 macOS host or wait for native support.
- Windows ARM64 is not built by CI. Windows x86_64 only for the Windows wheel today. Windows on ARM hosts can run the x86_64 wheel under Windows' built-in x86 emulation, or wait for native support.
-
Vector search + graph traversal in one engine. The virtual graph layer computes nearest-neighbor edges and threshold-filtered relationships automatically. Query vectors, then traverse their connections; no external graph database required.
-
15+ vector math operations built in. Ghost vectors, cone search, SLERP interpolation, k-means, PCA, maximal marginal relevance, centroid computation, vector drift detection, and more.
-
Billion-scale without compromise. IVF + HNSW + product quantization keeps memory bounded while maintaining high recall on datasets with hundreds of millions of vectors.
-
Rust-native performance with SIMD acceleration. AVX2, SSE4.1, NEON, and scalar fallback. Zero-copy mmap, arena allocators, DashMap lock-free concurrency, and fine-grained HNSW locking.
-
Dual API: gRPC + REST. High-throughput gRPC for production pipelines. REST for rapid prototyping, debugging, and curl-friendly workflows.
-
File-based bulk ingestion for large loads. Stage vectors as a
.npyor flat.f32file on a path the server can read, then callbulk_insert_from_path. The server reads the file via memory mapping, so working memory during the load is bounded by the index being built rather than by the input file size. -
Fast restart and transparent crash recovery. Plain HNSW collections become queryable within seconds of the server opening its ports. Multi-collection databases load collections in parallel during startup. Crash recovery via incremental delta replay or full write-ahead log replay happens automatically, and is observable through dedicated readiness, recovery, and persistence endpoints suited to Kubernetes orchestration.
Search throughput, recall, and latency on DBpedia 1M (1536 dim float32) with cosine distance and default HNSW parameters (M=16, ef_construction=200), measured on a 32-core, 64 GB host with 8 concurrent searcher threads, 1,000 queries per ef_search setting averaged across 3 iterations:
| ef_search | QPS | Recall@10 | p50 (ms) | p95 (ms) | p99 (ms) |
|---|---|---|---|---|---|
| 25 | 2,398 | 0.9816 | 3.16 | 4.91 | 6.06 |
| 50 | 2,214 | 0.9894 | 3.33 | 5.26 | 6.77 |
| 100 | 1,801 | 0.9921 | 4.16 | 6.85 | 8.02 |
| 200 | 1,233 | 0.9935 | 6.18 | 10.19 | 12.26 |
| 400 | 760 | 0.9960 | 10.00 | 16.83 | 20.48 |
| 800 | 437 | 0.9974 | 17.42 | 30.43 | 35.90 |
Reproduce with python benchmark/qps_vs_recall.py --workers 8 --n-queries 1000 --iterations 3 --ef-search-list 25,50,100,200,400,800.
For the full benchmark page (worker saturation, ingestion via bulk_insert_from_path, restart and recovery timings, memory behavior, reproduction recipes), see Benchmarks.
Pull and run from Docker Hub:
docker run -d -p 8080:8080 -p 50051:50051 sarthiai/swarndbVerify it is running:
curl http://localhost:8080/healthSee Docker Guide for persistence, configuration, and Docker Compose setup.
1. Create a collection:
curl -X POST http://localhost:8080/api/v1/collections \
-H "Content-Type: application/json" \
-d '{
"name": "articles",
"dimension": 384,
"distance_metric": "cosine"
}'2. Insert vectors:
curl -X POST http://localhost:8080/api/v1/collections/articles/vectors \
-H "Content-Type: application/json" \
-d '{
"id": 1,
"values": [0.1, 0.2, 0.3, 0.4],
"metadata": {"topic": "physics", "year": 2024}
}'3. Search:
curl -X POST http://localhost:8080/api/v1/collections/articles/search \
-H "Content-Type: application/json" \
-d '{
"query": [0.1, 0.2, 0.3, 0.4],
"k": 10
}'pip install swarndbfrom swarndb import SwarnDBClient
with SwarnDBClient(host="localhost", port=50051) as client:
# Create a collection
client.collections.create("articles", dimension=384, distance_metric="cosine")
# Insert vectors
client.vectors.insert("articles", vector=[0.1, 0.2, ...], metadata={"topic": "physics"})
client.vectors.insert("articles", vector=[0.3, 0.1, ...], metadata={"topic": "math"})
client.vectors.insert("articles", vector=[0.2, 0.4, ...], metadata={"topic": "physics"})
# Search
results = client.search.query("articles", vector=[0.1, 0.2, ...], k=10)
for r in results.results:
print(r.id, r.score) # distance score (lower = more similar)
# Graph: set a similarity threshold, then traverse relationships
client.graph.set_threshold("articles", threshold=0.85)
client.collections.optimize("articles")
edges = client.graph.get_related("articles", vector_id=1)
for edge in edges:
print(edge.target_id, edge.similarity)
# Search with graph-enriched results
results = client.search.query(
"articles",
vector=[0.1, 0.2, ...],
k=10,
include_graph=True,
graph_threshold=0.85,
)Async support is available via AsyncSwarnDBClient with the same API surface.
SwarnDB is organized as seven Rust crates with clean dependency boundaries:
| Crate | Role |
|---|---|
vf-core |
Core types, distance functions, SIMD kernels |
vf-storage |
WAL, segment management, memory-mapped I/O, collections |
vf-index |
HNSW and brute-force index implementations |
vf-query |
Filter evaluation, query execution, batch processing |
vf-quantization |
Scalar, product, and binary quantization; IVF partitioning |
vf-graph |
Virtual relationship graph, traversal algorithms |
vf-server |
gRPC and REST servers, authentication, health checks |
- Single insert for one-at-a-time writes via gRPC or REST
- Streaming bulk insert with batched gRPC streams, configurable batch lock size, write-ahead log flush interval, and optional parallel HNSW construction
- File-based bulk insert via
bulk_insert_from_path: the server reads a.npyor flat.f32file from any path it can read and ingests directly from the kernel page cache, without copying the payload through gRPC - Deferred indexing during bulk loads, finalized by a single
optimize()call that rebuilds the HNSW index and the metadata index, with the virtual graph rebuilt on the same call whenrebuild_graph=trueis passed - Bulk insert checkpoints and resume via per-batch checkpoints and an opaque
resume_tokenreturned in the bulk-insert response, so interrupted loads can pick up from the last committed batch
- Fast restart for plain HNSW collections, queryable within seconds of the server opening its ports
- Parallel collection load at startup, so a database with many collections comes up in parallel rather than serially
- Incremental delta replay or full write-ahead log replay on unclean shutdown, applied transparently before traffic resumes
- Operational endpoints for orchestration: Kubernetes-style
/healthz,/readyz,/startupz; a global/recovery_status; a per-collectionGET /api/v1/collections/{collection}/persistence_status; and Prometheus metrics at/metrics
- HNSW index with configurable
ef_construction,ef_search, andMparameters - IVF + Product Quantization for billion-scale datasets with bounded memory
- Batch search with multi-query execution and shared overhead
- Pre-filtering with adaptive index selection (B-tree, hash, bitmap) for metadata-filtered queries
- Per-query ef_search to tune recall/latency tradeoff at query time
- Automatic relationship computation from HNSW structure with configurable similarity thresholds
- Graph traversal via BFS/DFS across vector relationships for multi-hop discovery
- Threshold-based filtering with per-collection, per-query, and per-vector precedence
- Graph-enriched search where results are automatically annotated with related vectors and edge weights
- Deferred graph mode for batch inserts that defer graph computation until
optimize()is called
15+ vector math operations available through both gRPC and REST APIs:
| Operation | Description |
|---|---|
| Ghost vectors | Synthetic vectors representing absent concepts in a space |
| Cone search | Angular proximity search within a cone aperture |
| SLERP interpolation | Spherical linear interpolation between vectors |
| Centroid computation | Weighted and unweighted centroids of vector sets |
| Vector drift detection | Track how vector representations change over time |
| K-means clustering | Partition vectors into k clusters |
| PCA | Dimensionality reduction via principal component analysis |
| Analogy completion | Vector arithmetic for analogy tasks (A:B :: C:?) |
| Maximal marginal relevance | Diversity-aware result re-ranking |
| Vector normalization | L2 normalization for angular similarity |
All distance computations are SIMD-accelerated with runtime dispatch:
| Instruction Set | Platform | Width |
|---|---|---|
| AVX2 | x86_64 | 256-bit |
| SSE4.1 | x86_64 | 128-bit |
| NEON | ARM / Apple Silicon | 128-bit |
| Scalar | All platforms | Portable fallback |
Specialized kernels include fused cosine distance (dot product + norms in a single pass), batched multi-vector distance computation, and SIMD gather for PQ distance table lookups.
All configuration is via environment variables. See .env.example for the full list.
| Variable | Default | Description |
|---|---|---|
SWARNDB_HOST |
0.0.0.0 |
Bind address |
SWARNDB_GRPC_PORT |
50051 |
gRPC listener port |
SWARNDB_REST_PORT |
8080 |
REST listener port |
SWARNDB_DATA_DIR |
./data |
Data storage directory |
SWARNDB_LOG_LEVEL |
info |
Log verbosity (trace, debug, info, warn, error) |
SWARNDB_API_KEYS |
(empty) | Comma-separated API keys; empty disables auth |
SWARNDB_MAX_CONNECTIONS |
1000 |
Maximum concurrent connections |
SWARNDB_REQUEST_TIMEOUT_MS |
10000 |
Request timeout in milliseconds |
SwarnDB exposes dual API surfaces: gRPC on port 50051 and REST on port 8080.
| Operation | gRPC Service | REST Endpoint |
|---|---|---|
| Collection CRUD | CollectionService |
POST/GET/DELETE /api/v1/collections |
| Vector CRUD | VectorService |
POST/GET/DELETE /api/v1/collections/{id}/vectors |
| Search | SearchService |
POST /api/v1/collections/{id}/search |
| Batch search | SearchService |
POST /api/v1/search/batch |
| Graph operations | GraphService |
POST/GET /api/v1/collections/{id}/graph/* |
| Math operations | MathService |
POST /api/v1/collections/{id}/math/* |
| Health / Readiness | HealthService |
GET /health, GET /ready |
For complete API documentation, see API Reference.
| Guide | Description |
|---|---|
| Getting Started | Installation, first steps, basic usage |
| Core Concepts | Collections, vectors, metadata, indexing |
| API Reference | Complete gRPC and REST API documentation |
| Python SDK | SDK installation, client usage, async support |
| Virtual Graph | Graph layer concepts, traversal, thresholds |
| Vector Math | All 15+ math operations with examples |
| Docker Guide | Docker setup, persistence, Compose, and building from source |
| Configuration | Environment variables and tuning guide |
| Deployment | Docker, Kubernetes, and Helm deployment |
| Benchmarks | Reference workloads, hardware, measured numbers, reproduction recipes |
Found a bug or have a feature request? Open an issue on GitHub Issues.
SwarnDB is licensed under the Elastic License 2.0 (ELv2).
The SwarnDB project is designed, developed and maintained by Chirotpal
