Comment by jnnnthnn
1 year ago
Thanks for trying it out!
Agreed it could be faster for uncached queries. The embeddings retrieval itself is actually pretty fast (it uses pgvector). However, I found that having an LLM rerank the results and generate summaries related to the search query made results more useful, and that step accounts for much of the latency.
Maybe I should make that a user-customizable setting!
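For anyone curious what such a setting might look like, here is a minimal sketch of the two-stage flow described above, not the actual implementation. Everything in it is an assumption for illustration: the in-memory cosine search stands in for pgvector, the `reranker` callback stands in for the LLM call, and the names `search`, `DOCS`, and `reranker` are hypothetical.

```python
import math
from typing import Callable, Optional

Vector = list[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(
    query_vec: Vector,
    docs: dict[str, Vector],
    top_k: int = 3,
    reranker: Optional[Callable[[list[str]], list[str]]] = None,
) -> list[str]:
    # Stage 1: fast vector retrieval (an in-memory stand-in for pgvector).
    candidates = sorted(
        docs, key=lambda doc_id: cosine(query_vec, docs[doc_id]), reverse=True
    )[:top_k]
    # Stage 2: optional, slower rerank (a stand-in for the LLM step).
    # Passing reranker=None is the hypothetical "fast mode" setting.
    if reranker is None:
        return candidates
    return reranker(candidates)


# Hypothetical toy data: document id -> embedding.
DOCS = {"a": [1.0, 0.0], "b": [0.7, 0.7], "c": [0.0, 1.0]}

fast = search([1.0, 0.1], DOCS, top_k=2)
slow = search(
    [1.0, 0.1], DOCS, top_k=2, reranker=lambda ids: sorted(ids, reverse=True)
)
```

With the rerank step behind an optional argument, a user who cares more about latency than ordering quality can skip stage 2 entirely.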
You can do all of that in a single SQL query: use pgml.embed() for the embeddings, pgml.train() a custom reranker with XGBoost, then pgml.predict() the conversion score of each search result based on click-through rate or another objective.
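To make the "conversion score" idea concrete, here is a rough Python sketch of reranking candidates by a click-through-rate score. It is not PostgresML and not XGBoost: a hand-written smoothed CTR stands in for the trained model, and the function names, the `STATS` data, and the prior values are all hypothetical.

```python
def smoothed_ctr(
    clicks: int,
    impressions: int,
    prior_clicks: float = 1.0,
    prior_impressions: float = 20.0,
) -> float:
    """CTR pulled toward a prior so rarely shown results are not over-ranked."""
    return (clicks + prior_clicks) / (impressions + prior_impressions)


def rerank_by_ctr(
    candidates: list[str], stats: dict[str, tuple[int, int]]
) -> list[str]:
    """Order candidates by smoothed CTR, highest first.

    stats maps a result id to (clicks, impressions); unseen ids count as (0, 0).
    """
    return sorted(
        candidates,
        key=lambda doc_id: smoothed_ctr(*stats.get(doc_id, (0, 0))),
        reverse=True,
    )


# Hypothetical click logs: result id -> (clicks, impressions).
STATS = {"a": (2, 100), "b": (30, 200), "c": (1, 1)}

ranked = rerank_by_ctr(["a", "b", "c", "d"], STATS)
```

A trained model would replace `smoothed_ctr` with a prediction over richer features, but the rerank step itself keeps the same shape: score each candidate, then sort.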
If you'd like free hosting, feel free to reach out. I'm one of the founders at postgresml.org.
Sweet. I'll follow up off HN. Thank you!