Comment by frenchmajesty
4 months ago
OP here. I agree with you. For production use we use VoyageAI which is usually 2x faster than OpenAI at similar quality levels (p90 is < 200ms) but we're looking at spinning up a local embedding in our cloud environment, that would make p95 < 100ms and make cost negligible as well.
No comments yet
Contribute on Hacker News ↗