
Comment by mbowcut2

1 year ago

This is fun. It would be interesting to build a single graph of concepts that all users contribute to. Then you wouldn't have to run LLM inference on every request, only on novel ones, and you could publish the complete graph, which would be something like an embedding space.

A lot of combinations return instantly, so I assume that it is in fact caching a lot.

  • oh I just realized that 'isNew' in the response refers to a global set, not the user set. So, I guess it's doing exactly what I said lol.

    • I just went back and tried some new combinations with early elements and I'm still getting intermittent delays, even though all the early combinations must already be cached. So I assume part of this is just the server being a little overloaded, and even responses that are cached remotely but not locally can experience delays.
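
The caching scheme described above can be sketched in a few lines. This is a hypothetical reconstruction, not the site's actual code: `make_combiner`, the fake `llm` callable, and the dict/set internals are all assumptions made for illustration.

```python
# Hypothetical sketch (not the site's actual code) of the behavior
# described above: one global cache of combinations, LLM inference
# only on cache misses, and an isNew flag tied to the global result set.

def make_combiner(llm):
    cache = {}            # (a, b) -> result, shared by all users
    seen_results = set()  # every result ever produced, globally

    def combine(a, b):
        key = tuple(sorted((a, b)))      # order-insensitive pair
        if key not in cache:
            cache[key] = llm(*key)       # LLM runs only for novel pairs
        result = cache[key]
        is_new = result not in seen_results   # new to the *global* set,
        seen_results.add(result)              # not just this user's set
        return {"result": result, "isNew": is_new}

    return combine

# Toy usage with a fake "LLM" that just concatenates its inputs:
combine = make_combiner(lambda a, b: f"{a}+{b}")
first = combine("fire", "water")
again = combine("water", "fire")   # cache hit: same pair, reordered
```

Both calls return the same cached result, but `isNew` is true only the first time, which matches the observation that `isNew` refers to a global set rather than a per-user one.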