Comment by lmeyerov
1 year ago
We generally stick with neo4j/neptune/etc. for more operational OLTP graph queries, basically large-scale managed storage for small neighborhood lookups. As soon as the task becomes a compute-tier AI workload, like LLM summary indexing of 1M tweets or 10K documents, we prefer GPU-based compute stacks and external APIs with more fidelity: think pipelines combining bulk embeddings, rich enrichment and wrangling, GNNs, community detection, etc. We only dump into DBs at the end. Speedups are generally in the 2-100X territory even with cheap GPUs, so this ends up being a big deal for both development and production. Likewise, continuous update flows end up being awkward in the DB environments vs. full compute-tier ones, even ignoring the GPU aspect.
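A minimal CPU sketch of the compute-tier pattern described above: bulk-embed documents, build a similarity graph, run community detection, and only at the end flatten results into DB-load-ready rows. The libraries here (numpy, networkx) and all function names are illustrative stand-ins, not the author's actual stack; a production version would swap in GPU equivalents (e.g. cuDF/cuGraph) and a real embedding API.

```python
import numpy as np
from networkx import Graph
from networkx.algorithms.community import greedy_modularity_communities

def embed(docs, dim=16, seed=0):
    # Stand-in for a bulk embedding call: deterministic random vectors.
    rng = np.random.default_rng(seed)
    return rng.normal(size=(len(docs), dim))

def build_similarity_graph(vecs, threshold=0.0):
    # Connect document pairs whose cosine similarity exceeds a threshold.
    norms = vecs / np.linalg.norm(vecs, axis=1, keepdims=True)
    sims = norms @ norms.T
    g = Graph()
    g.add_nodes_from(range(len(vecs)))
    for i in range(len(vecs)):
        for j in range(i + 1, len(vecs)):
            if sims[i, j] > threshold:
                g.add_edge(i, j, weight=float(sims[i, j]))
    return g

def pipeline(docs):
    vecs = embed(docs)                         # bulk embeddings
    g = build_similarity_graph(vecs)           # wrangling / enrichment
    communities = greedy_modularity_communities(g)  # community detection
    # Only now "dump into the DB": emit load-ready rows.
    rows = []
    for cid, members in enumerate(communities):
        for idx in sorted(members):
            rows.append({"doc": docs[idx], "community": cid})
    return rows
```

The point of the shape, not the specific algorithms: everything upstream of the final `rows` stays in the compute tier, so re-runs and continuous updates never round-trip through the operational DB.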
Separately, we're still unsure about vector search inside vs. outside the graph DB during retrieval, both in the graph RAG scenario and in more general intelligence work domains. I'm more optimistic there about keeping it in the graph DB, especially for the small cases (< 10M nodes+edges) we do in notebooks.
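For the small, notebook-scale cases mentioned above, a sketch of why a separate vector store is optional: brute-force cosine top-k over node embeddings with numpy, assuming embeddings already sit alongside the graph. Names and shapes here are illustrative, not from the comment.

```python
import numpy as np

def top_k(query, matrix, k=3):
    # Cosine similarity of one query vector against all stored embeddings,
    # then the indices of the k best matches. At <10M rows this brute-force
    # scan is typically fast enough that no ANN index is needed.
    qn = query / np.linalg.norm(query)
    mn = matrix / np.linalg.norm(matrix, axis=1, keepdims=True)
    scores = mn @ qn
    idx = np.argsort(-scores)[:k]
    return list(idx), scores[idx]
```

The returned indices can be node IDs, so retrieval stays a single hop from vector hit to graph neighborhood without crossing a system boundary.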
And agreed, it's unfortunate that neo4j uses "graph RAG" to market a variety of mostly low-quality solutions and conflates it with graph DB storage, while the MSR researchers used the term for a more specific and (in AI circles) more notable technique that doesn't need a graph DB and, IMO, fundamentally not even a KG. It's especially confusing that both groups are 'winning' on the term... in different circles.