
Comment by morkalork

2 days ago

Question for other GCP users: how are you finding Google's aggressive deprecation of older embedding models? It feels like you have to pay to re-run your data through a new model every 12 months.

Do you know of many LLM-based apps that don't need to re-run their fine-tunes or embeddings at least annually anyway, to pick up improvements or new features? Things are moving so fast that "every 12 months" seems kinda slow...

This is precisely the risk I’ve been wondering about with vectorization. I’ve considered that an open-source model might be valuable here, since you could always find someone to host it for you and control the deprecation rate yourself.
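
Something like this is what I have in mind: a minimal sketch with sentence-transformers, where the model name is just an illustration and you'd pin whatever open-source model you actually pick, so nothing gets deprecated out from under you.

```python
# Sketch: self-hosted open-source embeddings with a pinned model,
# so the deprecation schedule is under your control, not the provider's.
from sentence_transformers import SentenceTransformer

# Example model only; pinning a specific revision keeps the weights
# stable across re-deploys (use a commit hash in practice).
model = SentenceTransformer(
    "sentence-transformers/all-MiniLM-L6-v2",
    revision="main",
)

docs = [
    "Managed embedding models get deprecated on the provider's schedule.",
    "A self-hosted model only changes when you decide to change it.",
]

embeddings = model.encode(docs, normalize_embeddings=True)
print(embeddings.shape)  # (2, 384) for this particular model
```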

My embedding costs are so small compared to inference that I generally don't notice.

But am I crazy or did the pre-production version of gemini-embedding-001 have a much larger max context length?

Edit: It seems like it did? 8k -> 2k? Huge downgrade if true; I was really excited about the experimental model reaching GA before I saw that.
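
If it really did drop to ~2k tokens, the practical workaround is chunking longer documents before embedding and storing one vector per chunk. Rough sketch only: the whitespace "tokenizer" and `embed_chunk` below are stand-ins, not the actual Gemini API.

```python
# Hypothetical sketch: fit long documents into a smaller embedding context
# window by splitting them into chunks and embedding each chunk separately.
from typing import Callable, List

MAX_TOKENS = 2048  # the GA limit being discussed; the experimental model reportedly allowed ~8k


def chunk_by_tokens(text: str, max_tokens: int = MAX_TOKENS) -> List[str]:
    """Naive whitespace token count purely for illustration; a real
    pipeline would use the provider's tokenizer to count accurately."""
    words = text.split()
    return [
        " ".join(words[i : i + max_tokens])
        for i in range(0, len(words), max_tokens)
    ]


def embed_document(
    text: str, embed_chunk: Callable[[str], List[float]]
) -> List[List[float]]:
    """embed_chunk is a placeholder for whatever embedding call you use
    (Gemini, a self-hosted model, etc.); returns one vector per chunk."""
    return [embed_chunk(chunk) for chunk in chunk_by_tokens(text)]
```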