
Comment by mbowcut2

1 year ago

This is fun. It would be interesting to build a single graph of concepts that all users contribute to. Then you wouldn't have to run LLM inference on every request, only on novel ones, and you could publish the complete graph, which would be something like an embedding space.

A lot of combinations return instantly, so I assume that it is in fact caching a lot.

  • oh I just realized that 'isNew' in the response refers to a global set, not the user set. So, I guess it's doing exactly what I said lol.

    • I just went back and tried some new combinations with early elements and I'm still getting intermittent delays, even though all the early combinations must already be cached. So I assume part of this is just the server being a little overloaded, and even responses that are cached remotely but not locally can experience delays.
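
The caching scheme described above can be sketched in a few lines. This is a hypothetical reconstruction, not the site's actual code: `make_combiner`, the fake `llm` callable, and the dict/set internals are all assumptions made for illustration.

```python
# Hypothetical sketch (not the site's actual code) of the behavior
# described above: one global cache of combinations, LLM inference
# only on cache misses, and an isNew flag tied to the global result set.

def make_combiner(llm):
    cache = {}            # (a, b) -> result, shared by all users
    seen_results = set()  # every result ever produced, globally

    def combine(a, b):
        key = tuple(sorted((a, b)))      # order-insensitive pair
        if key not in cache:
            cache[key] = llm(*key)       # LLM runs only for novel pairs
        result = cache[key]
        is_new = result not in seen_results   # new to the *global* set,
        seen_results.add(result)              # not just this user's set
        return {"result": result, "isNew": is_new}

    return combine

# Toy usage with a fake "LLM" that just concatenates its inputs:
combine = make_combiner(lambda a, b: f"{a}+{b}")
first = combine("fire", "water")
again = combine("water", "fire")   # cache hit: same pair, reordered
```

Both calls return the same cached result, but `isNew` is true only the first time, which matches the observation that `isNew` refers to a global set rather than a per-user one.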