You may know ANN Benchmarks - it’s a leaderboard of vector search algorithms. It’s referenced a lot by companies when choosing a vector system.
But let’s look at ANN Benchmarks - it measures:
- Recall
- Latency
What does it NOT measure?
- The impact of incremental updates on search latency
- Sharding and replication
- Reliability
- Consistency / availability of updates
- Filtering performance
- Memory usage
- Recall on YOUR embeddings
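
That last item is one you can check yourself. Here is a minimal sketch, assuming dot-product similarity and a corpus small enough to brute-force for ground truth. The `ann_search(query, k)` callable is hypothetical: it stands in for whatever vector system you are evaluating and is not any particular library's API.

```python
import heapq


def dot(a, b):
    """Dot product of two equal-length vectors (assumed similarity measure)."""
    return sum(x * y for x, y in zip(a, b))


def exact_top_k(query, corpus, k):
    """Brute-force ground truth: ids of the k corpus vectors most similar to query."""
    return heapq.nlargest(k, range(len(corpus)), key=lambda i: dot(query, corpus[i]))


def recall_at_k(approx_ids, exact_ids):
    """Fraction of the true top-k ids that the approximate search also returned."""
    exact = set(exact_ids)
    return len(exact & set(approx_ids)) / len(exact)


def mean_recall(queries, corpus, ann_search, k=10):
    """Average recall@k over a sample of YOUR queries against YOUR corpus.

    ann_search is a hypothetical callable (query, k) -> list of corpus ids,
    wrapping the system under test.
    """
    scores = [
        recall_at_k(ann_search(q, k), exact_top_k(q, corpus, k))
        for q in queries
    ]
    return sum(scores) / len(scores)
```

Run it over a sample of real queries and real document embeddings. The number you get can differ from what a public leaderboard reports on its own datasets.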
Depending on YOUR problem, you may choose an ANN Benchmarks loser if, say, you care about fast updates, performant filters, or a low memory footprint.
All that to say, there’s nothing wrong with ANN Benchmarks, but YOUR problem almost certainly has far more dimensions than just maximizing recall at a given latency.
When you choose a vector database - or really anything - don’t just look at the topline public benchmark. You need to discover your product requirements. Real production problems transcend a few easily benchmarkable metrics.
-Doug
This is part of Doug’s Daily Search tips - subscribe here