Vector search isn’t that hard - Think about maps!

Nearest neighbors in 768 dimensions is like nearest neighbors in 2 dimensions.

Then solve “Find the 10 closest addresses’s to Doug”. Really there’s two systems humans have used to organize addresses:

  • Postal codes - clumping a constant set of addresses into a single grouping. Know Doug’s zip code? Well scan through the addresses in Doug’s zip code to find the nearest to him
  • Streets - connecting all the addresses together in a network. Know Doug’s street? Then walk down the streets to gather the other addresses nearby

The former approximates cluster-based retrieval methods like simple IVF files or SPFresh. The latter approximates a graph like HNSW connecting a vector to its neighbors, traversing to find top N closest neighbors.

-Doug

This is part of Doug’s Daily Search tips - subscribe here


Enjoy softwaredoug in training course form!

Starting May 18!

Signup here - http://maven.com/softwaredoug/cheat-at-search

I hope you join me at Cheat at Search with Agents to learn use agents in search. build better RAG and use LLMs in query understanding.

Doug Turnbull

More from Doug
Twitter | LinkedIn | Newsletter | Bsky