Comment by noemit

8 days ago

I ran into it while building. I should have tested different temperatures too, but I was just trying to get CLI-style tool calls to be more reliable.

Yeah, temperature is probably worth a run. We noticed that even small adjustments changed how the model interpreted formatting expectations quite a bit.