Comment by lelandfe

2 months ago

I'm a fairly senior frontend engineer working in React, and even I see Sonnet 4.5 struggle with basic things. Today it wrote my Zod validation incorrectly, mixing up versions, then just decided it wasn't working and tried to replace the entire thing with a different library.

baq 2 months ago

There’s little reason to use Sonnet anymore. Haiku for summaries, Opus for anything else. Sonnet isn’t a good model by today’s standards.

lelandfe 2 months ago

I have been chastened in the opposite direction by others. I've also subjectively really disliked Opus's speed and I've seen Opus do really silly things too. But I'll try out using it as a daily driver and see if I like it more.

subomi 2 months ago

Why do we all of a sudden hold these agents to some unrealistically high bar? Engineers write bugs and incorrect validations all the time. But we iterate. We read the stack trace in Sentry, wonder what the hell we were thinking when we wrote that, and fix things. If you want to benefit from these agents, you need to be a bit more patient and point them at the right parts of your codebase.

My rule of thumb is that if you can clearly describe exactly what you want to another engineer, then you can instruct the agent to do it too.

puttycat 2 months ago

> Engineers write bugs all the time

Why do we hold calculators to such high bars? Humans make calculation mistakes all the time.

Why do we hold banking software to such high bars? People forget where they put their change all the time.

Etc etc.

- Der_Einzige 2 months ago
  
  I don't hold calculators to high bars. They think 0.1 + 0.2 = 0.30000000000000004:
  https://qntm.org/notpointthree
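
For anyone who wants to check the 0.1 + 0.2 result themselves, here is a minimal sketch. Python is my choice for illustration, not something from the thread, and it assumes the usual IEEE 754 double-precision floats; the variable names are made up for the example.

```python
import math
from decimal import Decimal

# Binary floating point cannot represent 0.1 or 0.2 exactly,
# so the sum lands slightly above 0.3.
total = 0.1 + 0.2
print(total)                      # 0.30000000000000004
print(total == 0.3)               # False

# Comparing with a tolerance is the usual workaround.
print(math.isclose(total, 0.3))   # True

# Decimal arithmetic built from strings keeps the values exact.
exact = Decimal("0.1") + Decimal("0.2")
print(exact == Decimal("0.3"))    # True
```

So the "mistake" is deterministic and explainable, which is arguably a different kind of error from the ones being discussed upthread.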
  
lelandfe 2 months ago

my unrealistic bar lies somewhere above "pick a new library" bug resolution