Comment by Areena_28
8 days ago
We hit the same thing building internal security tooling. Our model kept formatting output like documentation, not like something a person would actually want to read in a terminal at 2am during an incident.
I'm a bit curious: did you find this behavior consistent across models, or is it more pronounced with certain ones?
I ran into it while building - I should have tested different temperatures too - I was just trying to get CLI-style tool calls to be more reliable.
yeah, temperature is probably worth a sweep - we noticed even small adjustments changed how the model interpreted formatting expectations quite a bit.
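If anyone wants to try this, here's a minimal sketch of the kind of sweep being described. Everything here is hypothetical: `call_model` is a placeholder for whatever client you're using, and `doc_likeness` is just a crude heuristic that counts markdown-ish lines versus terse terminal-style ones.

```python
# Hypothetical temperature sweep: score how "documentation-like" a
# completion looks at each temperature. call_model is a stand-in for
# your actual model client (prompt, temperature) -> str.

def doc_likeness(text: str) -> float:
    """Rough score: fraction of non-empty lines that look like rendered
    docs (markdown headers, bullets, tables) rather than raw CLI output."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if not lines:
        return 0.0
    doc_markers = ("#", "-", "*", "|")
    hits = sum(1 for ln in lines if ln.lstrip().startswith(doc_markers))
    return hits / len(lines)

def sweep(prompt, call_model, temps=(0.0, 0.3, 0.7, 1.0)):
    # Returns {temperature: doc_likeness score} for one prompt.
    return {t: doc_likeness(call_model(prompt, temperature=t)) for t in temps}
```

Obviously the heuristic is throwaway; the point is just to make "formatting drift vs temperature" something you can eyeball in a table instead of vibes.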
Literate programming is about to become mainstream in the funniest way possible.
oh yesss, except literate programming was still the human explaining intent to other humans. this is more like the human explaining intent to a machine that then explains it back to other humans. hahaha this is actually funny.