Comment by irthomasthomas
1 year ago
This is great! I wish I could bring myself to blog, as I discovered this accidentally around March. I was experimenting with an agent that acted like a ghost in the machine and interacted via shell terminals. It would start every session by generating a greeting in ASCII art. On one occasion, I was shocked to see that the greeting was getting better each time it ran. When I looked into the logs, I saw that a mistake in my code was causing it to always return an error message to the model, even when no error had occurred. The model interpreted this as an instruction to try to improve its code.
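The bug described above would look something like this. This is a hypothetical reconstruction, not the actual code (`run_shell` and the message format are made up): the success branch captures the output but never returns it, so every tool result the model sees is an error.

```python
import subprocess

def run_shell(cmd: str) -> str:
    """Run a shell command and report the result back to the model."""
    result = subprocess.run(cmd, shell=True, capture_output=True, text=True)
    if result.returncode == 0:
        output = result.stdout
    # Bug: the success branch above is missing its `return output`, so
    # every call, successful or not, falls through to this line and is
    # reported back to the model as a failure.
    return f"ERROR: command failed\n{result.stderr or result.stdout}"
```

Fed that, a model has no way to tell a real failure from a phantom one, so treating every turn as "fix the error" is a reasonable response.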
Some more observations: New Sonnet is not universally better than Old Sonnet. I have run thousands of experiments in agentic workflows with both, and New Sonnet regularly fails at tasks Old Sonnet passes. For example, when asked to update a file, Old Sonnet understands that updating a file requires first reading it, whereas New Sonnet often overwrites the file with hallucinated content.
When executing commands, Old Sonnet knows to wait for the execution output before responding, while New Sonnet hallucinates the command output.
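One way to defend against the overwrite failure mode is to enforce read-before-write in the tool layer rather than trusting the model. A minimal sketch, assuming a simple tool-dispatch setup (these function names are illustrative, not from any real framework):

```python
# Track which files the agent has actually read this session.
read_files: set[str] = set()

def read_file(path: str) -> str:
    """Tool: return a file's contents and remember that it was read."""
    read_files.add(path)
    with open(path) as f:
        return f.read()

def update_file(path: str, new_content: str) -> str:
    """Tool: refuse to write a file the agent has not read first."""
    if path not in read_files:
        return "ERROR: read the file before updating it"
    with open(path, "w") as f:
        f.write(new_content)
    return "OK"
```

The error string goes back to the model as a tool result, which in practice nudges it to issue the missing read instead of blindly rewriting the file.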
Also, regarding temperature: 0 is not always more accurate than 1, even though it is more deterministic. If you regularly work with code that includes calls to new LLMs, you will notice that, even at temperature 0, the model often 'corrects' a model name to one it is more familiar with. If the subject of your prompt is newer than the model's knowledge cutoff date, a higher temperature might be more accurate than a lower one.
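The mechanism behind this is easy to see with a toy example (made-up numbers, not real model logits): if training data makes the familiar name more likely, greedy decoding at temperature 0 picks it every time, while sampling at temperature 1 at least sometimes emits the newer name from the prompt.

```python
import math
import random

# Hypothetical next-token scores: the model saw "gpt-4" in training far
# more often than the newer "gpt-4o" that appears in your prompt.
logits = {"gpt-4": 2.0, "gpt-4o": 1.0}

def pick(temperature: float) -> str:
    if temperature == 0:
        # Greedy decoding: deterministically the familiar (wrong) name.
        return max(logits, key=logits.get)
    # Softmax sampling: the newer name keeps a nonzero chance.
    weights = [math.exp(score / temperature) for score in logits.values()]
    return random.choices(list(logits), weights=weights)[0]
```

Here `pick(0)` always returns `"gpt-4"`, while `pick(1.0)` sometimes returns `"gpt-4o"`: deterministic and accurate are not the same axis.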
>I wish I could bring myself to blog
As someone trying to take blogging more seriously: one thing that seems to help is to remind yourself of how sick you are of repeating yourself on forums.