Comment by stillpointlab
2 days ago
I'm reading the docs and it does not appear Google keeps these embeddings at all. I send some text to them, they return the embedding for that text at the size I specified.
So the flow is something like:
1. Have a text doc (or library of docs)
2. Chunk it into small pieces
3. Send each chunk to <provider> and get an embedding vector of some size back
4. Use the embedding to:
4a. Semantic search / RAG: put the embeddings in a vector DB and do some similarity search on the embedding. The ultimate output is the source chunk
4b. Run a cluster algorithm on the embedding to generate some kind of graph representation of my data
4c. Run a classifier algorithm on the embedding to allow me to classify new data
5. Crucially, the output of every branch of step 4 is text
6. Send that text to an LLM
At no point is the embedding directly in the model's memory.
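The flow above can be sketched end to end. This is a minimal illustration, not real provider code: `embed` is a hypothetical stand-in for the embedding API call (which would take your chunk text and a requested dimensionality and return a vector), and the "vector DB" is just a numpy matrix searched by cosine similarity. Note that the search result is the source chunk text, which is what you would eventually send to the LLM.

```python
import numpy as np

# Hypothetical stand-in for the provider's embedding API. In practice you
# would send the chunk text over HTTP and get a vector of the requested
# size back; here we fake a deterministic unit vector per input string.
def embed(text: str, dim: int = 8) -> np.ndarray:
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    v = rng.standard_normal(dim)
    return v / np.linalg.norm(v)

# Step 2: chunk a document into small pieces (naive fixed-size split).
def chunk(doc: str, size: int = 40) -> list[str]:
    return [doc[i:i + size] for i in range(0, len(doc), size)]

# Steps 3-4a: embed each chunk, then answer a query by cosine similarity.
# The output is the source chunk *text*, not the vector.
def search(chunks: list[str], query: str, k: int = 1) -> list[str]:
    matrix = np.stack([embed(c) for c in chunks])  # the "vector DB"
    scores = matrix @ embed(query)                 # cosine sim (unit vectors)
    top = np.argsort(scores)[::-1][:k]
    return [chunks[i] for i in top]

doc = "Embeddings map text to vectors. Vectors support similarity search."
pieces = chunk(doc)
best = search(pieces, pieces[0])  # a chunk's best match is itself
print(best)
```

A real pipeline would swap `embed` for the provider call and the matrix for a proper vector store, but the shape of the data flow (text in, vector out, similarity search, source text back) is the same.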