Comment by h4ny
6 days ago
GGP's sentiment resonates with me. I invest a fair bit of time into LLMs to keep up on how things are evolving, and I throw both small and large tasks at them. I'm seeing great results with some small tasks, but with anything remotely close to actual engineering I just can't get satisfactory results.
My largest project is a year old, it's full-stack JavaScript, and I consciously use patterns and structures and diligently add documentation right from the beginning so the codebase is as LLM-friendly as possible.
I see great results on refactoring with limited scope, scaffolding test cases (I still choose to write my own tests, but LLMs can also generate very good tests if I explicitly point to existing tests of highly related code, such as some repository methods), documenting functions, etc., but I'm just not seeing the kind of quality on complex tasks that people claim LLMs deliver for them.
I want to believe that LLMs are actually capable of doing what at least a good junior engineer can do, but I'm not seeing that in my own experience. Whenever we point out the issues we are encountering, we basically just get the "git gud" response with no practical details on what we can actually do to get the results that people claim to be getting. Then, when we complain about the "git gud" response being too vague, people start blaming our lack of structure, our patterns, problems with our prompts, the language, our stack, etc. Nobody claiming to see great results seems to want to do a comprehensive write-up or, better still, a stream of their entire workflow to teach others how to do actual, good engineering with LLMs on real-world problems either -- they all just want to give high-level details and assert success.
On top of that, the fact that none of the engineers I know, working at both large organizations and respectable startups that are pushing AI, are seeing those kinds of results naturally makes me even more skeptical of claims of success. What I'm often hearing from them is mediocre engineers thinking they are being productive but actually just offloading the work to their colleagues through review, and nobody seems to be seeing tangible returns from using AI in their workflow, yet people in C-suites are pushing AI anyway.
If just about anything can be "your fault", how can anyone claiming that LLMs are great for real engineering, without showing evidence, be so confident that what they're claiming but not showing is actually the case?
I feel like every time I comment on anything related to your blog posts I probably come across as belligerent and get downvoted, but I really don't intend to.
Which model and tools are you using in that repo?