Comment by ashvardanian

5 months ago

The question is how many VMs you use, and what kind. It greatly affects performance :)

I run a lot of search-related benchmarks (https://github.com/ashvardanian) and am curious whether you’ve compared against other engines on the same hardware setup, tracking recall, NDCG, indexing speed, and query speed.
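
For context on the recall metric mentioned above, here is a minimal sketch of how recall@k is commonly computed against exact nearest-neighbor ground truth. The function name and the sample IDs are illustrative assumptions, not taken from any engine's benchmark code.

```python
from typing import Sequence


def recall_at_k(
    retrieved: Sequence[Sequence[int]],
    ground_truth: Sequence[Sequence[int]],
    k: int,
) -> float:
    """Average fraction of the true top-k neighbors found in the returned top-k.

    retrieved[i] and ground_truth[i] are ranked neighbor IDs for query i.
    """
    if len(retrieved) != len(ground_truth) or not ground_truth or k <= 0:
        raise ValueError("need matching, non-empty result lists and k > 0")
    hits = 0
    for got, expected in zip(retrieved, ground_truth):
        # Count how many of the exact top-k IDs appear in the approximate top-k.
        hits += len(set(got[:k]) & set(expected[:k]))
    return hits / (len(ground_truth) * k)


# Hypothetical results for two queries (IDs are made up for illustration).
approx = [[1, 2, 3], [4, 5, 6]]
exact = [[1, 3, 9], [4, 5, 6]]
score = recall_at_k(approx, exact, k=3)
```

Comparing engines on the same hardware then comes down to reporting this number alongside indexing and query throughput for each engine.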

talipozturk 5 months ago

We shard the data and index on about 6 x n2-standard-96 spot instances, so the total cost of indexing the entire deep1b is less than $12. We are working to bring that down to $6. We separate indexing and query VMs, and for queries we use dedicated VMs. USearch numbers look great and are better than ours if you run the query and indexing on the same VM/node. We believe a distributed, task-oriented design is the right way to handle vector search for thousands of tenants with datasets of different sizes. Data ingest is also a separate task for us, so Ingest, Index, and Query are all handled by different clusters of VMs.
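
As a rough illustration of the cost arithmetic above, the sketch below works backward from the stated budget and VM count. Only the $12 budget, the $6 target, and the 6 VMs come from the comment; the hourly spot price is a placeholder assumption, not a real GCP quote, and the function names are hypothetical.

```python
def budget_per_vm(total_budget_usd: float, n_vms: int) -> float:
    """Split a total indexing budget evenly across the VMs."""
    return total_budget_usd / n_vms


def max_hours_per_vm(
    total_budget_usd: float, n_vms: int, spot_price_per_vm_hour: float
) -> float:
    """Longest each VM can run before the total budget is exceeded."""
    return total_budget_usd / (n_vms * spot_price_per_vm_hour)


N_VMS = 6                  # from the comment: about 6 x n2-standard-96
CURRENT_BUDGET_USD = 12.0  # from the comment: indexing costs less than $12
TARGET_BUDGET_USD = 6.0    # from the comment: the stated goal
ASSUMED_SPOT_PRICE = 1.0   # placeholder USD per VM-hour, NOT a real price

per_vm_now = budget_per_vm(CURRENT_BUDGET_USD, N_VMS)    # 2.0 USD per VM
per_vm_goal = budget_per_vm(TARGET_BUDGET_USD, N_VMS)    # 1.0 USD per VM
hours_now = max_hours_per_vm(CURRENT_BUDGET_USD, N_VMS, ASSUMED_SPOT_PRICE)
```

Substituting the actual spot price for the region in use gives the wall-clock indexing window that the budget allows.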