Comment by jihadjihad
2 days ago
> Garman is also not keen on another idea about AI – measuring its value by what percentage of code it contributes at an organization.
You really want to believe, maybe even need to believe, that anyone who comes up with this idea in their head has never written a single line of code in their life.
It is on its face absurd. And yet I don't doubt for a second that Garman et al. have to fend off legions of hacks who froth at the mouth over this kind of thing.
Time to apply the best analogy I've ever heard.
> "Measuring software productivity by lines of code is like measuring progress on an airplane by how much it weighs." -- Bill Gates
Do we reward the employee who has added the most weight? Do we celebrate when the AI has added a lot of weight?
At first, it seems like, no, we shouldn't, but actually, it depends. If a person or AI is adding a lot of weight, but it is really important weight, like the engines or the main structure of the plane, then yeah, even though it adds a lot of weight, it's still doing genuinely impressive work. A heavy airplane is more impressive than a lightweight one (usually).
I just can’t help myself when airplanes come up in discussion.
I completely understand your analogy, and you are right. However, just to nitpick, it is actually super important to have the weight on the airplane in the right place. You have to make sure that your aeroplane does not become tail heavy, or it will not be recoverable from a stall. Also a heavier aeroplane, within its gross weight, is actually safer as the safe manoeuverable speed increases with weight.
I think this makes the analogy even more apt.
If someone adds more code to the wrong places for the sake of adding more code, the software may not be recoverable for future changes or from bugs. You also often need to add code in the right places for robustness.
> a heavier aeroplane … is actually safer
Just to nitpick your nitpick, that’s only true up to a point, and the range of safe weights isn’t all that big really: max payload on most planes is a fraction of the empty weight. And planes can be overweight; reducing weight is a good thing and is perhaps needed far more often than adding weight. The point of the analogy was that over a certain weight, the plane doesn’t fly at all. If progress on a plane is safety, stability, or speed, we can measure those things directly. If weight distribution is important to those, that’s great: we can measure weight and distribution in service of stability, but weight isn’t the primary thing we use.
Like with airplane weight, you absolutely need some code to get something done, and sometimes more is better. But is more better as a rule? Absolutely not.
Right, that's why it's a great analogy: you also need to have at least some code in a successful piece of software. But simply measuring by the amount of code leads to weird and perverse incentives. Code added without thought is not good, and too much code can itself be a problem. Of course, the literal balancing aspect isn't as important.
This is a pretty narrow take on aviation safety. A heavier airplane has a higher stall speed, more energy for the brakes to dissipate, longer takeoff/landing distances, a worse climb rate… I’ll happily sacrifice maneuvering speed for better takeoff/landing/climb performance.
> the safe manoeuverable speed increases with weight
This is true because at a higher weight, you'll stall at max deflection before you can put enough stress on the airframe to be a problem. That is to say, at a given speed a heavier airplane will fall out of the air [hyperbole, it will merely stall - significantly reduced lift] before it can rip the wings/elevator off [hyperbole - damage the airframe]. That makes it questionable whether heavier is safer; it just changes the failure mode.
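For anyone who wants the arithmetic behind that: a rough sketch, assuming the usual textbook approximation that stall speed scales with the square root of weight, and that maneuvering speed is stall speed times the square root of the limit load factor. Under those assumptions maneuvering speed also scales with the square root of weight. The function name and the numbers below are made up for illustration, not taken from any real aircraft.

```python
import math


def maneuvering_speed(va_at_max_weight, max_weight, actual_weight):
    """Estimate maneuvering speed at a weight below max gross weight.

    Assumes the square-root-of-weight scaling described above; real
    aircraft should use the figures published in their own handbook.
    """
    if actual_weight <= 0 or actual_weight > max_weight:
        raise ValueError("actual_weight must be in (0, max_weight]")
    return va_at_max_weight * math.sqrt(actual_weight / max_weight)


# Hypothetical light aircraft: 100 knots maneuvering speed at 2500 lb.
va_light = maneuvering_speed(100.0, 2500.0, 1600.0)
print(f"Estimated maneuvering speed at 1600 lb: {va_light:.1f} knots")
```

So the lighter airplane has the lower maneuvering speed, which is the effect being argued about; whether that counts as "safer" is the separate question raised above.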
Progress on airplanes is often tracked by # of engineering drawings released, which means that 1000s of little clips, brackets, fittings, etc. can sometimes misrepresent the amount of engineering work that has taken place compared to preparing a giant monolithic bulkhead or spar for release. I have actually proposed measuring progress by part weight instead of count to my PMs for this reason.
> the best analogy I've ever heard.
It’s an analogy that gets the job done and is targeted at non-tech managers.
It’s not perfect. Dead code has no “weight” unless you’re in a heavily storage-constrained environment. But 10,000 unnecessary rivets have an effect on the airplane everywhere, all the time.
> Dead code has no “weight”
Assuming it is truly dead and not executable (which someone would have to verify is & remains the case), dead code exerts a pressure on every human engineer who has to read (around) it, determine that it is still dead, etc. It also creates risk that it will be inadvertently activated and create e.g. security exposure.
In this analogy, I'd say dead code corresponds to airplane parts that aren't actually installed on the aircraft. When people talk about the folly of measuring productivity in lines of code, they aren't referring to the uselessness of dead code, they're referring to the harms that come from live code that's way bigger than it needs to be.
When you are thinking of development and refactoring, dead code absolutely has weight.
This reminds me of a piece on folklore.org by Andy Hertzfeld[0], regarding Bill Atkinson. A "KPI" was introduced at Apple in which engineers were required to report how many lines of code they had written over the week. Bill (allegedly) claimed "-2000" (a completely, astonishingly negative report), and supposedly the managers reconsidered the validity of the "KPI" and stopped using it.
I don't know how true this is in fact, but I do know how true this is in my work - you cannot apply some arbitrary "make the number bigger" goal to everything and expect it to improve anything. It feels a bit weird seeing "write more lines of code" becoming a key metric again. It never worked, and is damn-near provably never going to work. The value of source code is not in any way tied to its quantity, but value still proves hard to quantify, 40 years later.
0. https://www.folklore.org/Negative_2000_Lines_Of_Code.html
Goodhart's law: when a measure becomes a target, it ceases to be a good measure.
Given the way that a lot of AI coding actually works, it’s like asking what percent of code was written by hitting tab to autocomplete (intellisense) or what percent of a document benefited from spellcheck.
While most of us know that next-word guessing is how it works in reality…
That sentiment ignores the magic of how well this works. There are mind-blowing moments using AI coding; to pretend that it’s “just auto correct and tab complete” is just as deceiving as “you can vibe code complete programs”.
All that said, I'm very keen on companies telling me how much of their codebase was written by AI.
I just won't use that information in quite the excitable, optimistic way they offer it.
I want to have the model re-write patent applications, and if any portion of your patent filing was replicated by it your patent is denied as obvious and derivative.
"...just raised a $20M Series B and are looking to expand the team and products offered. We are fully bought-in to generative AI — over 40% of our codebase is built and maintained by AI, and we expect this number to continue to grow as the tech evolves and the space matures."
"What does your availability over the next couple of weeks look like to chat about this opportunity?"
"Yeah, quite busy over the next couple of weeks actually… the next couple of decades, really - awful how quickly time fills by itself these days, right? I'd have contributed towards lowering that 40% number which seems contrary to your goals anyway. But here's my card, should you need help with debugging something tricky some time in the near future and nobody manages to figure it out internally. I may be able to make room for you if you can afford it. I might be VERY busy though."
Something I wonder about the percent of code: I remember like 5-10 years ago there was a series of articles about Google generating a lot of their code programmatically. I wonder if they just adapted their code gen to AI.
I bet Google has a lot of tools to, say, convert a library from one language to another or generate a library based on an API spec. The 30% of code these LLMs are supposedly writing is probably in this camp, not net-new features.
Is that why Gmail loads so slowly these days?
When I see these stats, I think of all the ways "percentage of code" could be defined.
I ask an AI 4 times to write a method for me. After it keeps failing, I just write it myself. AI wrote 80% of the code!
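To make the joke concrete, here is a toy sketch of how the same work log gives wildly different "AI share" numbers depending on what you count. Everything in it is hypothetical: the 20-line size, the record layout, and the `ai_share` helper are invented for illustration and do not describe how any company actually computes this stat.

```python
# Hypothetical log of one task: four AI attempts that were thrown away,
# then one human-written method that was actually merged.
attempts = [
    {"author": "ai", "lines": 20, "merged": False},
    {"author": "ai", "lines": 20, "merged": False},
    {"author": "ai", "lines": 20, "merged": False},
    {"author": "ai", "lines": 20, "merged": False},
    {"author": "human", "lines": 20, "merged": True},
]


def ai_share(records, merged_only):
    """Fraction of counted lines written by the AI.

    merged_only=False counts every line anyone generated;
    merged_only=True counts only lines that ended up in the codebase.
    """
    counted = [r for r in records if r["merged"] or not merged_only]
    total = sum(r["lines"] for r in counted)
    ai_lines = sum(r["lines"] for r in counted if r["author"] == "ai")
    return ai_lines / total if total else 0.0


share_generated = ai_share(attempts, merged_only=False)
share_merged = ai_share(attempts, merged_only=True)
print(f"AI share of generated lines: {share_generated:.0%}")
print(f"AI share of merged lines:    {share_merged:.0%}")
```

Counting everything generated gives 80%; counting only what shipped gives 0%. Same week of work, two headline numbers.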
It is a really attractive idea for lazy people who don’t want to learn things.
It is like measuring company output based on stuff done through codegen...