
Comment by Scene_Cast2

21 days ago

I tried doing some vibe coding on a greenfield project (using gemini 2.5 pro + cline). On one hand - super impressive, a major productivity booster (even compared to using a non-integrated LLM chat interface).

I noticed that LLMs need a very heavy hand in guiding the architecture, otherwise they'll add architectural tech debt. One easy example is that I noticed them breaking abstractions (putting things where they don't belong). Unfortunately, there's not that much self-retrospection on these aspects if you ask about the quality of the code or if there are any better ways of doing it. Of course, if you pick up that something is in the wrong spot and prompt better, they'll pick up on it immediately.

I also ended up blowing through $15 of LLM tokens in a single evening. (Previously, as a heavy LLM user including coding tasks, I was averaging maybe $20 a month.)

> I also ended up blowing through $15 of LLM tokens in a single evening.

This is a feature, not a bug. LLMs are going to be the next "OMG my AWS bill" phenomenon.

  • Cline very visibly displays the ongoing cost of the task. Light edits are about 10 cents, and heavy stuff can run a couple of bucks. It's just that the tab accumulates faster than I expect.

    • > Light edits are about 10 cents

      Some well-paid developers will excuse this with, "Well, if it saved me 5 minutes, it's worth an order of magnitude more than 10 cents".

      Which is true, however there's a big caveat: Time saved isn't time gained.

      You can "save" 1,000 hours every night, but you don't actually get those 1,000 hours back.


    • > Cline very visibly displays the ongoing cost of the task

      LLMs are now being positioned as "let them work autonomously in the background" which means no one will be watching the cost in real time.

      Perhaps I can set limits on how much money each task is worth, but very few would estimate that properly.


  • Especially at companies (hence this github one), where the employees don't care about cost because it's the boss' credit card.

  • I think that models are gonna commoditize, if they haven't already. The cost of switching over is rather small, especially when you have good evals on what you want done.

    Also there's no way you can build a business without providing value in this space. Buyers are not that dumb.

    • They are already quite commoditized. Commoditization doesn't mean "cheap", and it doesn't mean you won't spend $15 a night like the GP did.

> I also ended up blowing through $15 of LLM tokens in a single evening.

Consider using Aider, and aggressively managing the context (via /add, /drop and /clear).

https://aider.chat/

  • I, too, recommend aider whenever these discussions crop up; it converted me from the "AI tools suck" side of this discussion to the "you're using the wrong tool" side.

    I'd also recommend creating little `README`s in your codebase, written mainly with aider as the intended audience. In them, I explain the architecture, what code makes (non-)sense to live in each directory, and so on. This has the side effect of being helpful for humans, too.

    Nowadays when I'm editing with aider, I'll include the project README (which contains a project overview + pointers to other README's), and whatever README is most relevant to the scope of my session. It's super productive.

    I've yet to find a model that beats the cost-effectiveness of Sonnet 3.7. I've tried the latest DeepSeek models, and while I love the price (nearly 50x cheaper?), they're just far too error-prone compared to Sonnet 3.7. They generate solid plans / architecture discussions, but, unlike Sonnet, the code they generate is often confidently off the mark.

I loathe using AI in a greenfield project. There are simply too many possible paths, so it seems to randomly switch between approaches.

In a brownfield code base, I can often provide it reference files to pattern match against. So much easier to get great results when it can anchor itself in the rest of your code base.

While it's being touted for greenfield projects, I've noticed a lot of failures when it comes to bootstrapping a stack.

For example, it (Gemini 2.5) really struggles with newer ecosystems like FastAPI when wiring libraries like SQLAlchemy, Pytest, Python-playwright, etc., together.

I find more value in bootstrapping myself, and then using it to help with boilerplate once an effective safety harness is in place.
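The "safety harness first" workflow can be sketched with stdlib-only Python (FastAPI/SQLAlchemy omitted to keep it self-contained; the table and function names are made up for the example):

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Hand-written bootstrap: a schema I control, not LLM-generated."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE visitors (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
    )
    return conn


def add_visitor(conn: sqlite3.Connection, name: str) -> int:
    """Boilerplate I might delegate to an LLM -- but only once the test below exists."""
    cur = conn.execute("INSERT INTO visitors (name) VALUES (?)", (name,))
    conn.commit()
    return cur.lastrowid


# The harness: a check the LLM's output must keep green before I accept edits.
conn = make_db()
rid = add_visitor(conn, "Ada")
row = conn.execute("SELECT name FROM visitors WHERE id = ?", (rid,)).fetchone()
assert row[0] == "Ada"
```

The point isn't the code itself but the ordering: the schema and the assertion exist before any generated boilerplate touches them.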

I've vibe coded a small project as well, using Claude Code. It's a visitor-registration app for the company. Simple project: one form, a couple of checkboxes, everything stored in SQLite, plus an endpoint for getting an .xlsx export.

The initial cost was around $20 USD, which later grew to $40 (mostly polishing) with some manual work.

I intentionally picked a simple stack: HTML + JS + PHP.

A couple of things:

* I'd say I'm happy with the result from a product perspective
* The codebase could be better, but I couldn't care less in this case
* By default, the AI does not care about security unless I specifically tell it to
* Claude insisted on using old libs. When I specifically told it to use the latest and greatest, it upgraded them but left code that only works with the old versions. It also mixed the latest DaisyUI with some old version of tailwindcss :)

On one hand it was super easy and fun to do, on the other hand if I was a junior engineer, I bet it would have cost more.
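The security point above is concrete: unprompted, models often emit string-concatenated SQL. A minimal sketch of the difference, in Python rather than the commenter's PHP for brevity (table and data are hypothetical):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE visitors (name TEXT)")
conn.execute("INSERT INTO visitors VALUES ('Alice')")


def lookup_unsafe(name: str):
    # The kind of code an unprompted model may produce: injectable.
    return conn.execute(
        f"SELECT name FROM visitors WHERE name = '{name}'"
    ).fetchall()


def lookup_safe(name: str):
    # What you get once you explicitly ask for security: parameter binding.
    return conn.execute(
        "SELECT name FROM visitors WHERE name = ?", (name,)
    ).fetchall()


payload = "' OR '1'='1"
assert lookup_unsafe(payload) == [("Alice",)]  # injection dumps the whole table
assert lookup_safe(payload) == []              # parameter binding defeats it
```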

If you want to use Cline and are at all price sensitive (in these ranges) you have to do manual context management just for that reason. I find that too cumbersome and use Windsurf (currently with Gemini 2.5 pro) for that reason.

> LLMs need a very heavy hand in guiding the architecture, otherwise they'll add architectural tech debt

I wonder if the next phase would be the rise of (AI-driven?) "linters" that check that the implementation matches the architecture definition.
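A crude, non-AI version of such a linter already fits in a few lines: declare the allowed import edges between layers and check them with the stdlib `ast` module. The layer names and the rule table here are invented for the sketch:

```python
import ast

# Hypothetical architecture rule: the "domain" layer must not import from "web".
FORBIDDEN = {"domain": {"web"}}


def check_imports(layer: str, source: str) -> list[str]:
    """Return layering violations found in one module's source (sketch:
    only handles `from X import Y` statements)."""
    violations = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ImportFrom) and node.module:
            top = node.module.split(".")[0]
            if top in FORBIDDEN.get(layer, set()):
                violations.append(f"{layer} imports {node.module}")
    return violations


bad = "from web.handlers import render\n"
good = "from dataclasses import dataclass\n"
assert check_imports("domain", bad) == ["domain imports web.handlers"]
assert check_imports("domain", good) == []
```

An AI-driven version would presumably replace the hand-written rule table with checks inferred from an architecture description.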

I think it's just that it's not end-to-end trained on architecture because the horizon is too short. It doesn't have the context length to learn the lessons that we do about good design.

> I noticed that LLMs need a very heavy hand in guiding the architecture, otherwise they'll add architectural tech debt. One easy example is that I noticed them breaking abstractions

That doesn’t matter anymore when you’re vibe coding it. No human is going to look at it anyway.

It can all be if/else on one line in one file. If it works, and if the LLMs can work with it, iterate, and implement new business requirements while keeping performance and security, then code structure, quality, and readability don't matter one bit.

Customers don’t care about code quality, and the only reason businesses used to care was to make it cheaper to build and ship new things, so they could make more money.

  • Wild take. Let’s just hand over the keys to LLMs I suppose; the fancy next-token predictor is the captain now.

    • Not that wild TBH.

      This is a common view, and I think it will be the norm in the near-to-mid term, especially for basic CRUD apps and websites. Context windows are still too small for anything even slightly complex (I think we need to be at about 20m before we start matching human levels), but we'll be there before you know it.

      Engineers will essentially become people who just guide the AIs and verify tests.


  • LLMs need a very heavy hand in guiding the architecture because otherwise they'll code it in a way that even they can't maintain or expand.

    • Hook up something like Taskmaster or Shrimp, so they can document as they go and retrieve the relevant context back when they overflow their context window, to avoid this issue.

      Then, as context windows increase, it becomes less and less of an issue.

I don’t get it. Isn’t it just a fixed monthly subscription?

  • For now. Who's to say that in 5 years, once everyone makes this THE default workflow, things won't go up in price?

  • Nope - I use a-la-carte pricing (through openrouter). I much prefer it over a subscription, as there are zero limits, I pay only for what I use, and there is much less of a walled garden (I can easily switch between Anthropic, Google, etc).
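Whether a-la-carte beats a subscription is just arithmetic. A back-of-envelope comparison with hypothetical numbers (a $20/month flat plan vs. the per-task figures reported upthread):

```python
# Hypothetical figures for illustration only.
subscription = 20.00       # flat monthly price
heavy_evening = 15.00      # one heavy vibe-coding session, a la carte
light_month = 0.10 * 120   # ~120 light edits at ~10 cents each

# A la carte wins for light use, loses fast with nightly heavy sessions.
assert light_month < subscription
assert 2 * heavy_evening > subscription
print(f"light month: ${light_month:.2f}, "
      f"two heavy evenings: ${2 * heavy_evening:.2f}")
```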