Comment by jcheng

1 year ago

I had to use that technique ("don't acknowledge this sideband data that may or may not be relevant to the task at hand") myself last month. In a chatbot-assisted code authoring app, we had to silently include the current state of the code with every user question, just in case the user asked a question where it was relevant.

Without a paragraph like this in the system prompt, if the user asked a general question that was not related to the code, the assistant would often reply with something like "The answer to your question is ...whatever... . I also see that you've sent me some code. Let me know if you have specific questions about it!"

(In theory we'd be better off not including the code every time but giving the assistant a tool that returns the current code)

3 comments

jcheng

ssl-3 1 year ago

I understand what you're saying, but the lack of acknowledgement isn't the problem I'm complaining about.

The problem is the instructed lack of relevance for 99% of requests.

If your sideband data included an instruction that said "This sideband data is shown to you in every request -- this means that it is not relevant to 99% of requests," then: I'd like to suggest that the for vast majority of the time, your sideband data doesn't exist at all.

TeMPOraL 1 year ago
The "problem" is that LLMs are being asked to decide on whether, and which part of, the "sideband" data is relevant to request and act on the request in a single step. I put the "sideband" in scare quotes, because it's all in-band data. There is no way in architecture to "tag" what data is "context" and what is "request", so they do it the same way you do it with people: tell them.
- ssl-3 1 year ago
  
  Perhaps so.
  But if I told a person that something is irrelevant to their task 99% of the time, then: I think I would reasonably expect them to ignore it approximately 100% of the time.