Comment by ernst_klim

15 days ago

> 500k lines of code

Isn't it a simple REPL with some tools and integrations, written in a very high level language? How the hell is it so big? Is it because it's vibecoded and LLMs strive for bloat, or is it meaningful complexity?

73 comments

samusiam 15 days ago

I just checked competitors' codebases:

- Opencode (anomalyco/opencode) is about 670k LOC

- Codex (openai/codex) is about 720k LOC

- Gemini (google-gemini/gemini-cli) is about 570k LOC

Claude Code's 500k LOC doesn't seem out of the ordinary.

lelanthran 15 days ago
> Claude Code's 500k LOC doesn't seem out of the ordinary.
Aren't all the other products also vibe-coded? "All vibe-coded products look like this" doesn't really seem to answer the question "Why is it so damn large?"
It's a REPL that calls out to a blackbox/endpoint for data, and does basic parsing and matching of state with specific actions.
I feel the bulk of those lines should be actions that are performed. Either this is correct or this is not:
1. If the bulk of those lines implement specific and simple actions, why is it so large compared to other software that implements single actions (coreutils, etc.)?
2. If the actions constitute only a small part of the codebase, wtf is the rest of it doing?
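For scale, the shape described above (read input, call an opaque endpoint, parse the reply, dispatch to an action) can be sketched in a few dozen lines. This is a toy under stated assumptions, not how any of the tools discussed are built: `parse_reply`, `run_turn`, `repl`, the `action: argument` reply format, and the stubbed `call_model` callable are all hypothetical names invented for illustration.

```python
from typing import Callable, Dict, Tuple

# Hypothetical types: an action takes one string argument and returns text.
Action = Callable[[str], str]
Model = Callable[[str], str]


def parse_reply(reply: str) -> Tuple[str, str]:
    """Split an assumed 'action: argument' reply; plain text falls back to 'say'."""
    name, sep, arg = reply.partition(":")
    if not sep:
        return "say", reply.strip()
    return name.strip(), arg.strip()


def run_turn(user_input: str, call_model: Model, actions: Dict[str, Action]) -> str:
    """One loop iteration: ask the black box, parse its reply, dispatch the action."""
    name, arg = parse_reply(call_model(user_input))
    handler = actions.get(name)
    if handler is None:
        return f"unknown action: {name}"
    return handler(arg)


def repl(call_model: Model, actions: Dict[str, Action]) -> None:
    """Read lines until EOF or 'exit', printing the result of each turn."""
    while True:
        try:
            line = input("> ")
        except EOFError:
            break
        if line.strip() == "exit":
            break
        print(run_turn(line, call_model, actions))
```

Everything a real agent adds on top of this (sandboxing, sessions, git support, UI) is where the remaining lines would have to come from, which is the question being asked.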
- samusiam 15 days ago
  
  You're complaining about vibe coding while also complaining about how you "feel" about the code. Do you see the irony in that?
  
johnisgood 15 days ago
All of them are really, REALLY bad.
- surajrmal 15 days ago
  
  Bad by whose definition? They work really well in my experience. They aren't perfect but the amount of hand holding has gone down dramatically and you can fix any glaring problems with a code review at the end. I work on a multimillion line code base which does not use any popular frameworks and it does a great job. I may be benefiting from the fact that the codebase is open source and all models have obviously been trained on it.
  

spiderfarmer 15 days ago

I don't know if you're mindlessly repeating the HN trope that JS/TypeScript/Electron is bad and that all bloat can easily be prevented, but if you're truly interested in answers to your questions: RTFA.

carterschonwald 15 days ago

yeah it's honestly full of vibe fixes to vibe hacks with no overarching design. some great little empirical observations though! i think the only clever bit relative to my own designs is just tracking time since last cache hit to check ttl. idk why i hadn't thought of that, but makes perfect sense
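The time-since-last-cache-hit idea mentioned above could look roughly like the following. This is a minimal sketch under assumptions, not code from any of the projects discussed: the `CacheTtlTracker` name, its methods, and the injectable clock are hypothetical, and the real TTL value is not given in the thread.

```python
import time
from typing import Callable, Optional


class CacheTtlTracker:
    """Remembers when the last cache hit happened so the caller can guess
    whether a remote cache entry has likely expired. Hypothetical helper."""

    def __init__(
        self,
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.ttl_seconds = ttl_seconds
        self._clock = clock  # injectable so the logic can be tested without sleeping
        self._last_hit: Optional[float] = None

    def record_hit(self) -> None:
        """Call this whenever a response reports a cache hit."""
        self._last_hit = self._clock()

    def seconds_since_hit(self) -> Optional[float]:
        """Elapsed time since the last recorded hit, or None if there was none."""
        if self._last_hit is None:
            return None
        return self._clock() - self._last_hit

    def is_probably_warm(self) -> bool:
        """True if a hit was recorded less than ttl_seconds ago."""
        elapsed = self.seconds_since_hit()
        return elapsed is not None and elapsed < self.ttl_seconds
```

A caller would invoke `record_hit()` after each response that reports a cache hit and check `is_probably_warm()` before deciding whether the next request can still expect one.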

ale 15 days ago

There’s probably a subconscious incentive to make a tool that’s “complex” because the underlying LLM also is complex.

fragmede 15 days ago

How many LoC should it be, for that kind of program?

forgotpwd16 15 days ago

Other notable agents' LOC: Codex (Rust) ~519K, Gemini (TS) ~445K, OpenCode (TS) ~254K, Pi (TS) ~113K LOC. Pi's modular structure makes it simple to see where most of the code is: the core, unified API, coding agent CLI, and TUI have ~3K, ~35K, ~60K, and ~15K LOC respectively. Interestingly, the just-uploaded claw-code's Rust version is currently at only 28K.
edit: Claude (TS) is actually 395K. So Gemini is more bloated. Codex is arguable since it is written in a lower-level language.
ernst_klim 15 days ago

Well FFmpeg is roughly 1500k, but it's C+Asm and it's dozens of codecs and pretty complex features. SBCL is around 500k I guess.
I'm not saying that this is necessarily too much, I'm genuinely asking if this is bloat or if it's justified.
troupo 15 days ago
It's a TUI API wrapper with a few commands bolted on.
I doubt it needs to be more than 20-50kloc.
You can create a full 3D game with a custom 3D engine in 500k lines. What the hell is Claude Code doing?
- neurostimulant 15 days ago
  
  Just check the leaked code yourself. The two biggest areas seem to be the `utils` module, which is a kitchen sink that covers a lot of functionality (sandboxing, git support, sessions, etc.), and the `components` module, which contains the React UI. You could certainly build a CLI agent with a much smaller codebase and leaner UI code without React, but probably not with this truckload of functionality.
  
- hombre_fatal 15 days ago
  
  Software doesn’t end at the 20k loc proof of concept though.
  What every developer learns during their “psh i could build that” weekendware attempt is that there is infinite polish to be had, and that their 20k loc PoC was <1% of the work.
  That said, doesn't TFA show you what they use their loc for?
  
- spiderfarmer 15 days ago
  
  Comments like these remind me of the football spectators that shout "Even I could have scored that one" when they see a failed attempt.
  Sure. You could have. But you're not the one playing football in the Champions League.
  There were many roads that could have gotten you to the Champions League. But now you're in no position to judge the people who got there in the end and how they did it.
  Or you can, but whatever.
  
- criley2 15 days ago
  
  Honest question: Why does it matter? They got the product shipped and got millions of paying customers and totally revolutionized their business and our industry.
  Engineers using LOC as a measure of quality is the inverse of managers using LOC as a measure of productivity.
  