Comment by skatanski
3 hours ago
You could have a dedicated lightweight-agent running a cheap model in parallel to any workload, analysing the workload (like the prompt) and creating "memories" in a vector DB. These could be according to some guidelines. Alternatively, if there's safety risk - storing and approving could be decoupled and split into 2.
No comments yet
Contribute on Hacker News ↗