Vector search isn’t that hard - Think about maps!
Nearest neighbors in 768 dimensions is like nearest neighbors in 2 dimensions.
Then solve “Find the 10 closest addresses to Doug”. Really, there are two systems humans have used to organize addresses:
- Postal codes - clumping a fixed set of addresses into a single grouping. Know Doug’s zip code? Then scan through the addresses in Doug’s zip code to find the ones nearest to him.
- Streets - connecting all the addresses together in a network. Know Doug’s street? Then walk down the streets to gather the other addresses nearby.
The former resembles cluster-based retrieval methods like a simple IVF (inverted file) index or SPFresh. The latter resembles a graph index like HNSW, which connects each vector to its neighbors and traverses those links to find the top N closest.
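To make the map analogy concrete, here is a toy sketch of both ideas in 2 dimensions. Treat it as an illustration under assumptions, not how any real library does it: the function names (`build_zip_codes`, `search_streets`, and so on) are made up for this post, the "zip codes" use randomly picked centers where a real IVF index would typically train centroids with k-means, and the "streets" are a flat nearest-neighbor graph without HNSW's hierarchy of layers.

```python
import heapq
import math
import random


def brute_force(points, query, k):
    """Exact answer: measure the distance to every address."""
    return sorted(range(len(points)), key=lambda i: math.dist(points[i], query))[:k]


# --- "Postal codes": cluster-based retrieval (IVF-style) ---
def build_zip_codes(points, n_zips, rng):
    """Pick n_zips addresses as zip code centers, assign every address to its nearest center."""
    centers = rng.sample(points, n_zips)
    zips = {z: [] for z in range(n_zips)}
    for i, p in enumerate(points):
        nearest = min(range(n_zips), key=lambda z: math.dist(p, centers[z]))
        zips[nearest].append(i)
    return centers, zips


def search_zip_codes(points, centers, zips, query, k, n_probe=1):
    """Scan only the n_probe zip codes whose centers are closest to the query."""
    by_center = sorted(range(len(centers)), key=lambda z: math.dist(query, centers[z]))
    candidates = [i for z in by_center[:n_probe] for i in zips[z]]
    return sorted(candidates, key=lambda i: math.dist(points[i], query))[:k]


# --- "Streets": graph-based retrieval (HNSW-like, single layer) ---
def build_streets(points, n_links):
    """Link every address to its n_links nearest addresses (brute force, fine for a toy)."""
    graph = {}
    for i, p in enumerate(points):
        order = sorted(range(len(points)), key=lambda j: math.dist(p, points[j]))
        graph[i] = order[1:n_links + 1]  # order[0] is the address itself
    return graph


def search_streets(points, graph, query, k, start=0, ef=50):
    """Best-first walk along the links, remembering the ef closest addresses seen so far."""
    d_start = math.dist(points[start], query)
    visited = {start}
    frontier = [(d_start, start)]  # min-heap: closest unexplored address first
    best = [(-d_start, start)]     # max-heap via negated distance: worst kept address on top
    while frontier:
        d, node = heapq.heappop(frontier)
        if len(best) >= ef and d > -best[0][0]:
            break  # nothing left to explore is closer than what we already hold
        for neighbor in graph[node]:
            if neighbor in visited:
                continue
            visited.add(neighbor)
            d_n = math.dist(points[neighbor], query)
            if len(best) < ef or d_n < -best[0][0]:
                heapq.heappush(frontier, (d_n, neighbor))
                heapq.heappush(best, (-d_n, neighbor))
                if len(best) > ef:
                    heapq.heappop(best)  # drop the farthest kept address
    ranked = sorted((-neg_d, i) for neg_d, i in best)
    return [i for _, i in ranked[:k]]


# Usage: 500 random "addresses" on a unit-square map, with Doug in the middle.
rng = random.Random(42)
addresses = [(rng.random(), rng.random()) for _ in range(500)]
doug = (0.5, 0.5)

centers, zips = build_zip_codes(addresses, n_zips=20, rng=rng)
streets = build_streets(addresses, n_links=8)

exact = set(brute_force(addresses, doug, 10))
via_zip = set(search_zip_codes(addresses, centers, zips, doug, 10, n_probe=2))
via_streets = set(search_streets(addresses, streets, doug, 10))
print("zip code overlap with exact:", len(exact & via_zip), "of 10")
print("street walk overlap with exact:", len(exact & via_streets), "of 10")
```

Both searches are approximate: the zip code scan misses neighbors that sit just across a zip boundary unless you raise `n_probe`, and the street walk depends on where you start and how many candidates (`ef`) you keep. Swap the 2D tuples for 768-dimensional vectors and the same code runs unchanged, which is the whole point of the analogy.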
-Doug
This is part of Doug’s Daily Search tips - subscribe here