Comment by jnnnthnn
1 year ago
I suspect this is because of a bias in the indexed dataset: at present, the indexed stories skew toward high-scoring ones, and at a glance I don't see many stories about Postgres clustering in that distribution.
yeah, there are only three stories coming up from the site search, and none of them picks up things like Citus etc.
https://hn.algolia.com/?q=postgres+clustering
only one is semantically correct; the others pick up the wrong sense of clustering (i.e. k-means instead of multi-master writes)
but yeah if one doesn't test the hard cases, how does one know it preserves semantics :D
In fairness, it's probably impossible to unambiguously determine what the intended/desired interpretation is (though intuitively it seems like k-means should be lower likelihood)!
I've tried HyDE and it seems to work better. Had to do it client side though. I asked ChatGPT: "write one sentence explanation about this topic: solutions for postgres clustering" which returned "Solutions for PostgreSQL clustering involve implementing methods such as streaming replication or third-party tools like Patroni to manage and distribute database workloads across multiple servers for enhanced performance and fault tolerance." Then I searched for that:
https://hackersearch.net/search?q=Solutions+for+PostgreSQL+c...
and results are much better:
1. An overview of distributed Postgres architectures
2. A Technical Dive into PostgreSQL's replication mechanisms
3. Ways to capture changes in Postgres
The HyDE paper is here: https://arxiv.org/abs/2212.10496
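For anyone who wants to try the same trick in code, here is a rough sketch of the HyDE flow described above: generate a hypothetical answer, embed that instead of the raw query, then rank documents by similarity. Everything here is an assumption for illustration: `toy_embed` is a bag-of-words stand-in and not a real embedding model, `fake_generate` returns the ChatGPT sentence quoted above instead of calling an LLM, and the two documents are made up.

```python
import math
import re
from collections import Counter
from typing import Callable, Dict, List

Vector = Dict[str, float]


def toy_embed(text: str) -> Vector:
    """Stand-in embedding: lowercase bag-of-words counts (NOT a real model)."""
    return dict(Counter(re.findall(r"[a-z0-9]+", text.lower())))


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two sparse vectors."""
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def hyde_search(
    query: str,
    docs: List[str],
    generate: Callable[[str], str],
    embed: Callable[[str], Vector] = toy_embed,
    top_k: int = 3,
) -> List[str]:
    """Rank docs against a generated hypothetical answer, not the raw query."""
    hypothetical = generate(query)   # step 1: "hallucinate" a plausible answer
    qvec = embed(hypothetical)       # step 2: embed the hypothetical text
    ranked = sorted(docs, key=lambda d: cosine(qvec, embed(d)), reverse=True)
    return ranked[:top_k]            # step 3: nearest documents win


def fake_generate(query: str) -> str:
    """Hypothetical stand-in for an LLM call; returns the sentence quoted above."""
    return (
        "Solutions for PostgreSQL clustering involve implementing methods such as "
        "streaming replication or third-party tools like Patroni to manage and "
        "distribute database workloads across multiple servers for enhanced "
        "performance and fault tolerance."
    )


# Two made-up documents: one about k-means, one about replication.
DOCS = [
    "K-means clustering of rows stored in Postgres",
    "Streaming replication and Patroni for fault tolerance across multiple servers",
]

if __name__ == "__main__":
    print(hyde_search("postgres clustering", DOCS, fake_generate, top_k=1))
```

Passing `generate=lambda q: q` gives you the plain (non-HyDE) baseline, which makes it easy to compare the two rankings side by side. With a real setup you would swap in an actual LLM call and an actual embedding model for the two stand-ins.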
It's possible that OpenAI's embeddings are symmetric. If that's the case, you need to hallucinate some content and use that as the base for the embedding distance calculation. Or you can move to an asymmetric embedding, or you can try prompting their embedding.
edit: prompting the embedding seems to work. I tried searching for “write an article about: solutions for postgres clustering” and the results are much better: https://hackersearch.net/search?q=write+an+article+about%3A+...
you can try prepending "write an article about: " to all user searches :D
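A minimal sketch of that prepending idea, in case it helps. The function names are hypothetical, and the search URL shape (base path plus a `q` parameter) is only assumed from the links earlier in this thread, not from any documented API.

```python
from urllib.parse import urlencode

PREFIX = "write an article about: "
# Assumption: base URL and `q` parameter inferred from the links in this thread.
SEARCH_URL = "https://hackersearch.net/search"


def with_prompt(query: str, prefix: str = PREFIX) -> str:
    """Prepend the prompt prefix unless the query already starts with it."""
    query = query.strip()
    if query.lower().startswith(prefix.strip().lower()):
        return query  # avoid stacking the prefix twice
    return prefix + query


def search_url(query: str) -> str:
    """Build a search URL for the prompted query (URL shape is an assumption)."""
    return SEARCH_URL + "?" + urlencode({"q": with_prompt(query)})


if __name__ == "__main__":
    print(search_url("solutions for postgres clustering"))
```

Doing this server side would mean users never see the prefix; the check in `with_prompt` just keeps it from being applied twice if someone types it themselves.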