Comment by Xyra
2 hours ago
I'm using Voyage-3.5-lite at halfvec(2048), which with my limited research, seems to be one of the best embedding models. There's semi-sophisticated (breaking on paragraphs, sentences) ~300 token chunking.
When Claude is using our embed endpoint to embed arbitrary text as a search vector, it should work pretty well cross-domains. One can also use compositions of centroids (averages) of vectors in our database, as search vectors.
No comments yet
Contribute on Hacker News ↗