← Back to context

Comment by minimaxir

3 months ago

It's worth noting that one of the fixes OpenAI employed to get ChatGPT to stop being sycophantic is to simply to edit the system prompt to include the phrase "avoid ungrounded or sycophantic flattery": https://simonwillison.net/2025/Apr/29/chatgpt-sycophancy-pro...

I personally never use the ChatGPT webapp or any other chatbot webapps — instead using the APIs directly — because being able to control the system prompt is very important, as random changes can be frustrating and unpredictable.

> I personally never use the ChatGPT webapp or any other chatbot webapps — instead using the APIs directly — because being able to control the system prompt is very important, as random changes can be frustrating and unpredictable.

This assumes that API requests don't have additional system prompts attached to them.

  • Actually you can't do "system" roles at all with OpenAI models now.

    You can use the "developer" role which is above the "user" role but below "platform" in the hierarchy.

    https://cdn.openai.com/spec/model-spec-2024-05-08.html#follo...

    • They just renamed "system" to "developer" for some reason. Their API doesn't care which one you use, it'll translate to the right one. From the page you linked:

      > "developer": from the application developer (possibly OpenAI), formerly "system"

      (That said, I guess what you said about "platform" being above "system"/"developer" still holds.)

You can bypass the system prompt by using the API? I thought part of the "safety" of LLMs was implemented with the system prompt. Does that mean it's easier to get unsafe answers by using the API instead of the GUI?

Side note, I've seen a lot of "jailbreaking" (i.e. AI social engineering) to coerce OpenAI to reveal the hidden system prompts but I'd be concerned about accuracy and hallucinations. I assume that these exploits have been run across multiple sessions and different user accounts to at least reduce this.

I'm a bit skeptical of fixing the visible part of the problem and leaving only the underlying invisible problem