Comment by HardCodedBias

7 months ago

I'm always amazed that such long system prompts don't degrade performance.

The OpenAI API already lets you cache the beginning of a prompt to save time and money, so it isn't parsing the same instructions repeatedly. Not very different here.
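A minimal sketch of what that implies for request structure, assuming the OpenAI Python SDK (`build_messages` is a hypothetical helper; the model name and prompt text are placeholders): prompt caching keys on an exact, stable prefix, so the long static system prompt should come first and per-request content last.

```python
def build_messages(system_prompt: str, user_query: str) -> list[dict]:
    """Order messages so the static prefix is cacheable across calls."""
    return [
        # Stable across every request -> eligible for the prefix cache.
        {"role": "system", "content": system_prompt},
        # Varies per request -> placed after the cached prefix.
        {"role": "user", "content": user_query},
    ]

# With the real SDK the call would look like (not executed here):
#   from openai import OpenAI
#   client = OpenAI()
#   resp = client.chat.completions.create(
#       model="gpt-4o-mini",
#       messages=build_messages(LONG_SYSTEM_PROMPT, "hello"),
#   )
#   # resp.usage.prompt_tokens_details.cached_tokens reports how much
#   # of the prefix was served from cache.

msgs = build_messages("You are a careful assistant.", "hello")
print(msgs[0]["role"], msgs[1]["role"])
```

Reordering alone is the whole trick: any variation early in the prompt invalidates the cached prefix for everything after it.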

  • There is "performance" as in "speed and cost", and "performance" as in "the model returning quality responses, without getting lost in the weeds". Caching only helps with the former.

    • "the model returning quality responses, without getting lost in the weeds"

      I should edit, but that would be disingenuous. This is exactly what I meant.

      thank you!