Comment by garfij

7 days ago

Probably less sensible than you think. How many terms would they need to do this over? How many terms would they need to do it for _at once_? How many tokens would that add to every prompt that comes in?

Never mind that dynamically modifying the base system prompt would likely break their entire caching mechanism, since prompt caching works by matching the longest shared prefix, and I can't imagine the model's system prompt is somehow excluded from that prefix.
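To make the prefix-cache point concrete, here's a toy sketch of longest-prefix caching (purely illustrative names and logic, not any provider's actual implementation): any edit near the front of the prompt invalidates everything after it.

```python
# Toy model of longest-prefix prompt caching. Everything here is
# hypothetical/illustrative -- real systems cache KV states per token
# block, but the invalidation behavior is the same in spirit.

def longest_cached_prefix(cache: set[tuple[str, ...]], tokens: list[str]) -> int:
    """Return the length of the longest cached prefix of `tokens`."""
    for i in range(len(tokens), 0, -1):
        if tuple(tokens[:i]) in cache:
            return i
    return 0

cache: set[tuple[str, ...]] = set()

def serve(system_prompt: str, user_msg: str) -> int:
    """Return how many tokens had to be recomputed for this request."""
    tokens = system_prompt.split() + user_msg.split()
    hit = longest_cached_prefix(cache, tokens)
    # Cache every prefix so later requests can reuse the shared part.
    for i in range(1, len(tokens) + 1):
        cache.add(tuple(tokens[:i]))
    return len(tokens) - hit

# A stable system prompt: the second request reuses the whole shared prefix.
serve("you are a helpful assistant", "hello")
recompute_stable = serve("you are a helpful assistant", "goodbye")

# A dynamically modified system prompt: the changed token breaks the
# prefix match, so everything from that point on is recomputed.
recompute_dynamic = serve("you are a VERY helpful assistant", "goodbye")
```

Here `recompute_stable` is 1 (only the new user message), while `recompute_dynamic` jumps to 4 because the edited system prompt diverges from the cached prefix early on.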