Comment by JasonSage

5 months ago

I think it could be some of both. By giving access to the chain of thought one would able to see what the agent is correcting/adjusting for, allowing you to compile a library of vectors the agent is aware of and gaps which could be exploitable. Why expose the fact that you’re working to correct for a certain political bias and not another?