Comment by n_ary

2 days ago

To vibe code something useful, I find that I need to put my faith in the tool, forget all about SWE principles and best practices, and treat it like a child who makes mistakes and corrects them: the adult must not intervene or admonish too much, but rather nudge it in the right direction.

Vibe coding also has a parallel aspect: while the code is being generated, you are reviewing it live and steering it in the right direction. So, depending on your experience, the end product can be a bad mess or a wonderful creation and a maintenance dream.

The issue with seasoned SWEs is that the moment a mistake (or bad pattern) is made, the baby is thrown out with the bathwater.

For a tiered app like the one presented, 35k LOC is not really that impressive if you think about it. A generic React-based front end will easily need a large number of LOC due to the modular nature of components, the various hooks, and tests (which make up nearly 25-40% of LOC). A business layer will also have many layers of abstraction and numerous implementations to move data between layers.

Vibe coding shines when you let it build one block at a time, limit the scope well, and stay focused. Also, 2-3 weeks is a lot of time to write 35k LOC. At the start of any new project, the LOC generation rate is very high, but in the maintenance phase it falls significantly as smaller changes become more common.
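
To put rough numbers on the figures above, here is a back-of-the-envelope sketch. The 5 working days per week is my own assumption (it is not stated anywhere in the post), and the function names are just illustrative.

```python
# Back-of-the-envelope check of the 35k LOC / 2-3 weeks / 25-40% tests figures.
TOTAL_LOC = 35_000
WORKING_DAYS_PER_WEEK = 5  # assumption, not stated in the thread


def loc_per_day(total_loc: int, weeks: int,
                days_per_week: int = WORKING_DAYS_PER_WEEK) -> int:
    """Average LOC per working day, rounded to the nearest whole line."""
    return round(total_loc / (weeks * days_per_week))


def loc_in_tests(total_loc: int, low_percent: int = 25,
                 high_percent: int = 40) -> tuple[int, int]:
    """Range of LOC that would be tests, given a percentage share."""
    return (total_loc * low_percent // 100, total_loc * high_percent // 100)


if __name__ == "__main__":
    print("LOC/day over 2 weeks:", loc_per_day(TOTAL_LOC, 2))
    print("LOC/day over 3 weeks:", loc_per_day(TOTAL_LOC, 3))
    print("LOC in tests (low, high):", loc_in_tests(TOTAL_LOC))
```

Under those assumptions, that is roughly 2.3k-3.5k LOC per working day, with something like 8.75k-14k of the total being tests.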

I've had a bit more success on the front end, as it is possible to see the results of a change very quickly. In fact, I would prefer that it just auto-apply the changes so I could inspect them visually. It isn't bad, but the workflow is pretty slow. The resulting code is also very verbose, likely 3X more code than an experienced engineer would write (this is one part that I am certain will improve dramatically in time). While I do use this workflow, sometimes I feel like I am just being lazy as opposed to productive.

I'm just being honest. For my use case, I would be much better off if LLMs could just do everything.

My experience matches yours.

Lots of apps are quite repetitive: when building APIs, for example, you generate one controller and then ask the app to generate more using the first one as a pattern. For the frontend, you do the same for forms or lists.

Tests are often quite good, but I think they were already great even back in the first ChatGPT release.

With this strategy, and given that some patterns are quite verbose (albeit understandable for an AI or a reader), it is quite easy to reach a large LoC count while still maintaining consistency.