Comment by Madmallard

5 months ago

Is it actually good at solving complex code or is it just garbage and people are lying about it as usual?

In my experience EXTENSIVELY using Claude 3.5 Sonnet, you basically have to do everything complex yourself, or you're just introducing massive amounts of slop code into your code base that, while functional, is nowhere near good. And for anything actually complex, like something that requires a lot of context to make a decision and has to be useful to multiple different parts of the code, it's just hopelessly bad.

I've played with it the whole day (so take it with a grain of salt). My gut feeling is that it can produce a bigger ... "thing". I am calling it a "thing" because it looks very much like what you want, but the bigger it is, the higher the chance of it being subtly (or not so subtly) wrong.

I usually ask the models to extend a small parser/tree-walking interpreter with a compiler/VM.

Up until Claude 3.7, the models would propose something lazy and obviously incomplete. 3.7 generated something that looks almost right and mostly works, but is so overcomplicated and broken in such a way that I'd rather delete it and write it from scratch. Trying to get the model to fix it resulted in going in circles, with it spitting out pieces of code that didn't fit the existing ones, etc.

Not sure if I prefer the former or the latter tbh.