Comment by arendtio
6 days ago
One way to mitigate the issue is to write tests or specifications and let the AI find a solution that satisfies them.
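As a minimal sketch of what such a spec can look like, a failing test is often enough; `Slugify` here is a hypothetical function name that the AI would be asked to implement:

```go
// slug_test.go -- a test-as-spec sketch. Slugify does not exist yet;
// the AI's job is to write an implementation that makes this pass.
package slug

import "testing"

func TestSlugify(t *testing.T) {
	cases := map[string]string{
		"Hello, World!": "hello-world",
		"  Go  rocks  ": "go-rocks",
		"already-fine":  "already-fine",
	}
	for input, want := range cases {
		if got := Slugify(input); got != want {
			t.Errorf("Slugify(%q) = %q, want %q", input, got, want)
		}
	}
}
```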
A few months ago, solving such a spec riddle could take a while, and the solutions that came out of long runs were usually worse than the ones found quickly. Recently, however, the models have become significantly better at solving these riddles, which makes it fun (depending on how well your use case can be expressed as a spec).
In my experience, Sonnet 3.7 was a significant step up from Sonnet 3.5 in this discipline, and Gemini 2.5 Pro was even more impressive. Sonnet 4 makes even fewer mistakes, but you still have to guide the AI through sound software engineering practices (gathering requirements, exploring technical solutions, designing the architecture, writing user stories and specifications, and writing code) to get good results.
Edit: And there is another trick: provide good examples to the AI. Recently, I wanted to build an app with the OpenAI Realtime API, and at first it failed miserably. But then I added the two most important pages of the documentation and one of the demo projects to my workspace, and just like that it worked (even though for my use case the API calls had to be used quite differently).
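If you happen to use aider for this, pulling reference material into the workspace is just a couple of commands; the file names below are made up for illustration, and `/read-only` adds files as context without letting the AI edit them:

```
/read-only docs/realtime-api-overview.md
/read-only docs/realtime-api-events.md
/add examples/realtime-demo/main.go
```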
That's one thing I love about Golang. I just tell Aider to `/run go doc github.com/some/package`, and it includes the full signatures in the chat history.
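For example, with `github.com/google/uuid` as a stand-in package, the output dropped into the chat looks roughly like this (abbreviated; the exact text depends on the package):

```
> /run go doc github.com/google/uuid

package uuid // import "github.com/google/uuid"

Package uuid generates and inspects UUIDs based on RFC 4122 ...

func New() UUID
func NewString() string
func Parse(s string) (UUID, error)
type UUID [16]byte
...
```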
It's true: the AI often struggles to use libraries and doesn't remember their APIs correctly. Simply adding the go doc output has often fixed that for me.