The OpenAI API already lets you cache the beginning of a prompt to save time/money, so the model isn't re-processing the same instructions repeatedly. Not very different here.
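A minimal sketch of what "cacheable beginning parts" means in practice: prefix caches match on identical leading tokens, so the stable instructions and examples go first and only the variable user input goes last. The function and names below are illustrative, not the OpenAI client itself (which caches automatically when prompts share a prefix).

```python
def build_messages(system_prompt, examples, user_input):
    """Order messages so the stable parts (system prompt, few-shot
    examples) come first and the variable user input comes last,
    keeping the leading tokens identical across requests."""
    messages = [{"role": "system", "content": system_prompt}]
    for question, answer in examples:
        messages.append({"role": "user", "content": question})
        messages.append({"role": "assistant", "content": answer})
    messages.append({"role": "user", "content": user_input})
    return messages

EXAMPLES = [("2+2?", "4")]
m1 = build_messages("You are a calculator.", EXAMPLES, "3+5?")
m2 = build_messages("You are a calculator.", EXAMPLES, "7*6?")

# Count how many leading messages two requests share; everything
# before the final user message is identical, so a prefix cache
# can reuse that work across both calls.
shared = 0
for a, b in zip(m1, m2):
    if a != b:
        break
    shared += 1
```

If the user input were placed first instead, the prefixes would differ on every request and nothing would be cacheable.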
There is "performance" as in speed and cost, and "performance" as in the model returning quality responses without getting lost in the weeds. Caching only helps with the former.
If the context window is small enough, then only the tail of the prompt matters anyway.
"the model returning quality responses, without getting lost in the weeds"
I should edit, but that would be disingenuous. This is exactly what I meant.
thank you!