Skip to content

fix: search termination and result set bounded by k instead of efSearch#22

Open
suykerbuyk wants to merge 3 commits into
coder:mainfrom
suykerbuyk:fix/search-algorithm
Open

fix: search termination and result set bounded by k instead of efSearch#22
suykerbuyk wants to merge 3 commits into
coder:mainfrom
suykerbuyk:fix/search-algorithm

Conversation

@suykerbuyk
Copy link
Copy Markdown
Contributor

Summary

Two issues in layerNode.search():

  1. Termination too aggressive — stopped when no neighbor beat result.Min() (best result). Per HNSW Algorithm 5, should stop when best candidate > result.Max() (worst result). Explored too little of the graph.

  2. Result set bounded by k, not efSearch — with k=3 and efSearch=100, only 3 candidates were tracked, causing the distance threshold to converge to a local minimum within a few iterations.

Fix: bound the result set by efSearch during exploration, use the correct HNSW termination condition, and trim to k after search completes.

Depends on #20 and #21.

Test plan

  • Added TestGraph_SearchFindsCorrectNearest — verifies search returns true nearest neighbors with small k
  • go test ./... and go vet ./... pass

John Suykerbuyk added 3 commits April 11, 2026 12:26
In a binary min-heap, the last array element is not necessarily the
maximum. Max() must scan the leaf nodes (indices n/2..n-1) to find
the true maximum. PopLast() used Remove(Len()-1) which removed an
arbitrary element instead of the worst.

This caused incorrect evictions during neighbor selection and search
result trimming, degrading graph quality.
…ance

replenish() is called after neighbor eviction and node deletion to
restore connectivity. It unconditionally used CosineDistance, ignoring
the graph's configured distance function. This corrupted the graph
topology for any non-Cosine metric.

Thread DistanceFunc through replenish() and isolate() so the correct
distance function is always used.
Two issues in layerNode.search():

1. Termination was based on whether any neighbor improved result.Min()
   (the best result). Per HNSW Algorithm 5, termination should occur
   when the best remaining candidate is farther than result.Max() (the
   worst result). The old condition stopped exploration too early.

2. The result set was bounded by k (number of results to return)
   instead of efSearch (the exploration beam width). With k=3 and
   efSearch=100, only 3 candidates were tracked, causing the distance
   threshold to converge to a local minimum within a few iterations.

Fix: bound the result set by efSearch during exploration, use the
correct HNSW termination condition, and trim to k after search
completes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant