Well, we prioritize amongst the tech debt on that board and then move it onto the main board for sprint, it's not like it's a completely separate process. Things do go there to die sometimes though.
This is a good example[1] at 64k LOC removal. We removed built-in support for C# + WinRT interop on Windows and instead required users to use a source-generation tool (which is still the case today). This was a breaking change. We realized we had one chance to do this and took it.
Microsoft, the number being 30%; whether that's accurate is another matter. Twenty years ago people already used IDEs to generate boilerplate code (remember Java's getters/setters/hashCode/toString?) because some guy in a book said you had to.
About 1.5 years ago I inherited a project with ~ 250,000 lines of code - in just the web UI (not counting back end).
The developer who wrote it was a smart guy, but he had never worked on any other JS project. All state was stored in the DOM in custom attributes, .addEventListeners EVERYWHERE... I joke that it was as if you took a monk, gave him a book about javascript, and then locked him in a cell for 10 years.
I started refactoring pieces into web components, and after about 6 months had removed 50k lines of code. Now knowing enough about the app, I started a complete rewrite. The rewrite is about 80% feature parity, and is around 17k lines of code (not counting libraries like Vue/pinia/etc).
So, soon, I shall have removed over 200,000 loc from the project. I feel like then I should retire as I will never top that.
> The rewrite is about 80% feature parity, and is around 17k lines of code (not counting libraries like Vue/pinia/etc).
This is exactly where these comparisons break down. Obviously you don't need as much code to get passable implementations of a fraction of all the features.
It's definitely a good argument for not reinventing the wheel though.
I'd rather have 250,000 lines of code where 230,000 of those are in battle-tested libraries, and only 20,000 lines are ones we ever need to read or write.
I mean, you can get basic implementations of Vue and state management libs in a few hundred (maybe thousand?) LOCs (lots of examples on the interweb) that are probably less "toyish" than whatever this person had handrolled
> I joke that it was as if you took a monk, gave him a book about javascript, and then locked him in a cell for 10 years.
I've had a similar experience (see other comment): the original author wrote code like a junior developer at best, but was unfortunately a middle-aged, experienced developer, one of the founders of the company, and very productive. But obviously not someone who had ever worked in a team or had someone else work on their codebase.
Think functions thousands of lines long, nested switch/case/if/else/ternary things ten levels deep, concatenated SQL queries (it was PHP because of course), concatenated JS/HTML/HTML-with-JS (it was Dojo front-end), no automated tests of any sort, etc.
A long time ago I was working in a big project where the PLs came up with the most horrible metric I've ever seen. They made a big handwritten list, visible for the whole team, where they marked for each individual developer how many bugs they had fixed and how many bugs they had caused.
I couldn't believe my eyes. I was working in my own project beside this team with the list, so thankfully I was left out of the whole disaster.
A guy I knew wasn't that lucky. I saw how he suffered from this harmful list. Then I told him a story I had recently heard about the Danish film director Lars von Trier. von Trier was going to be chosen to appear in a "canon" list of important Danish artists that the government was responsible for. He then made a short film where he took the Danish flag (red with a white cross), cut out the white lines and stitched it together again, forming a red communist flag. von Trier was immediately made persona non grata and removed from the "canon".
Later that day my friend approached the bugs caused/fixed list, cut out his own line, taped it together and put it on the wall again. I'll never forget how a PL came into the room later, and stood and gazed at the list for a long time before he realized what had happened. "Did you do this?" he asked my friend. "Yes", he answered. "Why?", said the PL. "I don't want to be part of that list", he answered. The next day the list was gone.
The danish flag is a white cross on a red background. If you cut out the white cross, you will be left with four rectangles of red, which can be pushed together and sewn up again, forming a solid red flag
In the days when perl was the language of choice for the web I got a 97% reduction in code size. I was asked to join a late project to speed it up. (Yes, I know that has a low success rate.)
The lead dev was a hard core c programmer and had no perl experience before this job. He handed me a 200 line uncommented function that he wrote and was not working. It was a pattern matcher. I replaced it with 6 lines of commented perl with regex that was very readable (for a regex).
Since he had no idiomatic understanding of perl he did not accept it and complained to management. We had to bring in the local perl demigod to arbitrate (at 21, he was half my age at the time, but smart as a whip). He ruled in my favor and the lead was pissed.
Doing regex in C back in the day was not very common and far from idiomatic, unlike perl where its basically expected that you cram regexes in anywhere you can.
Before a recent annual performance review, I looked over my stats in the company monolith repo and found out I was net-negative on lines of code. That comes mostly from removing auto-generated API code and types (the company had moved to a new API and one of my projects was removing v1) but it was quite funny to think about going to work every day to erase code.
In an old ops role we had a metric called ticket touches. One of my workmates had almost double that of everyone else, but only for that metric. We had a look and it was due to how he wrote notes in tickets: instead of putting all his findings in one comment, he would add them incrementally as he went along. Neither of these ways was wrong; it just inflated that stat for him.
I think I have mentioned this before in HN too. I am not from CS background and just learnt the trade as I was doing the job, I mean even the normal stuff.
We have a project that tries to reify live objects into human-readable form. The final representation is quite complicated, with a lot of types, while the initial representation is less complicated.
In order to make it readable, if there are any common or similar data nodes, we have to compare them and try to combine them, i.e. find places that can be turned into methods and find the relevant arguments for all the calls (kind of).
The initial implementation did the transformation into the final form first, and then started the comparison. So the comparison had to deal with all the different combinations of the types we have in the final representation, which made the whole thing so complex - and it had been maintained by generations of engineers - that nobody had a clear idea how it was working.
Then I read about how hashmaps are implemented (yep, I am that dumb) and it was a revelation. So we did the following things:
1. We created a hash for the skeleton that has to remain the same through the whole set of comparisons and transformations of the "common nodes" (they can be considered something similar to methods or arguments), and only did the comparison for nodes with matching skeletal hashes, and
2. created a separate layer that does the comparison and creates the common nodes on the initial, primitive form, and then does the transformation as a second layer (so you don't have to deal with all the types in the final representation), and
3. Don't type. Yes. Data is the simplest abstraction, and if your logic can be made into data or some properties, please do yourself a favor and make it so. We found a lot of places where weird class hierarchies could be converted into data properties.
Basically, it is a dumb multi-pass decompiler.
That did not just speed up the process; it resulted in much more readable and understandable abstractions and code. I do not know if this is widely useful, but it helped in one project. There is no silver bullet, but types were the actual problem for us, and so we solved it this way. (A rough sketch of the skeleton-hash idea is below.)
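A minimal sketch of that skeleton-hash idea, with a hypothetical Node type standing in for the project's real representation (the names here are illustrative, not the actual code):

```python
from collections import defaultdict
from dataclasses import dataclass, field

@dataclass
class Node:                      # hypothetical node in the initial, primitive form
    kind: str                    # structural type, e.g. "call", "record", "list"
    value: object = None         # leaf payload; deliberately ignored by the skeleton hash
    children: list = field(default_factory=list)

def skeleton_hash(node: Node) -> int:
    """Hash only the structural skeleton (kind plus child shapes), not leaf values."""
    return hash((node.kind, tuple(skeleton_hash(c) for c in node.children)))

def candidate_groups(nodes):
    """Bucket nodes by skeletal hash; only nodes sharing a bucket need the
    expensive field-by-field comparison that extracts the 'common nodes'."""
    buckets = defaultdict(list)
    for n in nodes:
        buckets[skeleton_hash(n)].append(n)
    return [group for group in buckets.values() if len(group) > 1]
```

The point is the same as a hashmap lookup: the cheap hash prunes the quadratic all-pairs comparison down to comparisons within small buckets.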
Great collection of stories! Thanks for sharing. I got carried away across the pages and relished the quotes page.
The ideals probably worked for that time and that place. Many places in other parts of the world and at other times, would have different ideals, to deal with different priorities at that time and place. America in the 80's had no survival struggle, wars, cultural stigmas, pandemics or famines. Literacy and business were blooming. Great minds and workers were lured with great promises. A natural result is accelerated innovation. Plenty of food and materials. Individualism, fun and luxury was the goal for most. The businesses delivered all of it. Personal computing was an exact fit for that business.
I am currently working on a piece of code which I am actively trying to simplify and make smaller. It's not a piece of code which has any business ever getting larger or having more features. The design is such that it is feature complete until a whole system redesign is required at which point the code would be itself wholesale replaced. So I am sitting here trying to codegolf the code down in complexity and size. It's not just important to keep it simple, it's also important to keep it small as this bit of code is, as part of this solution, going to be executed using python -c. All the while not taking the piss and making it unreadable (I can use a minifier for that).
It being 1982 and a story about the lead developer of LisaGraf, working on that very thing, it is certainly unlikely to be a GUI form.
But block mode terminals that did forms had been a thing for over a decade at that point. Not that this was likely at Apple. But there are definitely contemporary ways in which one could have been entering this stuff via a computer.
Indeed, an IBM 3270 could be told that a field was numeric. This wouldn't have the terminal prevent negative numbers. The host would have to have done that upon ENTER. But the idea of unsigned numbers in form data had been around in (say) COBOL PIC strings since the 1960s.
> Bill Atkinson, the author of Quickdraw and the main user interface designer, who was by far the most important Lisa implementer...
> I'm not sure how the managers reacted to that, but I do know that after a couple more weeks, they stopped asking Bill to fill out the form, and he gladly complied.
Notice that it doesn't say "they stopped using the form" but "they stopped asking Bill to fill out the form". The rules are different at the top, they probably still used it to mis-manage junior employees who didn't have as much influence.
I think lines of code could be an interesting and valuable metric.
The lower the score (even negative), the better, given a fixed set of features.
“Perfection is achieved, not when there is nothing more to add, but when there is nothing left to take away.”
― Antoine de Saint-Exupéry, Airman's Odyssey
That just encourages bad behaviour in the other direction though. A massive multi-level nested ternary on one line is usually going to be worse than a longer but clearer set of conditions. Trying to make code brief can be good, but it can often result in an unmaintainable and hard to read mess.
Halstead complexity ([0]) might interest you. It's a semi-secret sauce based on the count of operators and operands that tells you if your function is "shitty".
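For anyone curious, the Halstead numbers are straightforward to compute once a function has been tokenized into operators and operands; here is a rough sketch (the tokenization itself, the language-specific part, is assumed to have happened already):

```python
import math

def halstead(operators, operands):
    """Classic Halstead metrics from two token streams:
    operators (e.g. '=', '+', 'if') and operands (names and literals)."""
    n1, n2 = len(set(operators)), len(set(operands))   # distinct operators / operands
    N1, N2 = len(operators), len(operands)              # total operators / operands
    vocabulary, length = n1 + n2, N1 + N2
    volume = length * math.log2(vocabulary) if vocabulary else 0.0
    difficulty = (n1 / 2) * (N2 / n2) if n2 else 0.0
    effort = difficulty * volume
    return {"volume": volume, "difficulty": difficulty, "effort": effort}

# Tokens for something like: x = a + b * a
print(halstead(operators=["=", "+", "*"], operands=["x", "a", "b", "a"]))
```

Whether a given volume or effort counts as "shitty" is, of course, a threshold you pick yourself.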
You'll never get it right though. Focusing on "features per LOC" means people will write shorthand, convoluted code, pick obscure, compact languages, etc. To use an adage, if a metric becomes a target, it stops being a useful metric.
I often have a mental picture of the thing I need, I start writing it, get a bit "stuck" on architecture and think I could be using a ready-made library for this. I find one or a few of them, look at the code (which is obviously more generic) and realize it's many times as large as I thought the entire project should be. With few exceptions the train of thought doesn't even reach the "Do I want to carry this around and babysit it?" stage. Somehow this continues to surprise me every time.
We had a VP recommend "lines of code" as part of performance system. I showed some CVS stats of some of our key developers; one of our best performers was at -20K lines of code over 5 years.
Would be really interesting if microservices were limited to a manageable amount of code. I wonder if technical leadership has ever guided their org devs that services shouldn't be more than ___ lines of code (excluding tests). Obviously, more services would mean more latency, but I'm pretty sure this one I'm working on now that is over 11k LOC (ignoring comments) could be replicated with a few bash commands.
One of the criticisms of microservices is that factoring a system correctly is already a hard problem, and introducing a network call between them makes it even harder.
Enforcing service LoC limits is equivalent to forcing further factoring of a system, which might not be necessary, especially not into a microservice arch.
Sometimes code is tightly coupled because it needs to be tightly coupled.
How long would a quicksort (say, of integers) be in 68000 assembly? Maybe 20 lines? My 68000 isn't very good. The real advantage of writing it in Haskell is that it's automatically applicable to anything that's Ord, that is, ordered.
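The Haskell version being alluded to is the famous two-liner; for comparison, a similarly terse (non-in-place, allocation-heavy) sketch in Python also works on anything orderable, just without the type system spelling that out:

```python
def quicksort(xs):
    # Naive quicksort: fine for illustrating brevity, not a replacement for sorted().
    if len(xs) <= 1:
        return xs
    pivot, rest = xs[0], xs[1:]
    return (quicksort([x for x in rest if x < pivot])
            + [pivot]
            + quicksort([x for x in rest if x >= pivot]))

print(quicksort([3, 1, 4, 1, 5, 9, 2, 6]))   # works for ints, strings, tuples, ...
```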
When using Claude for production code in my core code base, I'm much more picky about _how_ it's written (as opposed to one-off or peripheral code that just needs to work). I find much of my work involves deleting Claude's code and simplifying its logic. (I'm still not sure I'm saving any time in the end vs writing such code from scratch.)
> I'm not sure how the managers reacted to that, but I do know that after a couple more weeks, they stopped asking Bill to fill out the form, and he gladly complied
It would be a much better story if they stopped asking everyone to fill out the form.
I did this the other day in an old component that had been refactored several times. It's incredibly satisfying. IMO removing code is often more important than adding it because it helps so much with maintainability and future development speed.
yesterday I deleted 2500 lines out of a project of 4300 lines lol.
I have mixed feelings because now it's so much simpler, but the frustration of having to write these lines in the first place, it's so annoying. that's what happens when specs aren't clear
Well, like everything it's probably a balance to strike. Otherwise you may end up with highly golfed IOCCC-style code in production, which I would definitely not recommend.
Yes, my thinking here isn't about the semantics of code, whether it is written in dense or terse form, but about the more meta level: code itself is something we write to achieve some goal; it shouldn't be the goal by itself. As an example, when a company is audited for its valuation, it is not the amount of code itself that is valuable. In many ways it can be seen as a burden, as you'll need to maintain the code over time.
Just because lines of code are being reported doesn't mean that bigger automatically means better. It does tell a story about how one is spending their time, though.
This is one of those stories that I am sure has happened, but when it comes to "and then they never asked him again le XD face" it's clearly just made up.
Bill Atkinson recently died and there’s a great HN discussion about him. He had a good relationship with Steve Jobs; it’s reasonable to assume it’s true that he got left alone, especially if Andy Hertzfeld is the person making the assertion.
1. The site is called folklore.org. You’re sort of saying the site is true to its name.
2. It’s a direct recollection from someone who was there, not an unnamed “my cousin’s best friend” or literal folklore that is passed down by oral tradition. Andy knew Bill and was there. There is no clear motivation to tell a fictional story when there were so many real ones.
3. The specifics line up very well with what we know about Bill Atkinson and some of his wizardry needed to make the Mac work.
Given this, it’s much easier to assume that your assertion is what is made up.
We had free soft drinks in the fridges at one place I worked. Cost-cutting measures were coming and I sent an email to all of engineering (including the VP) asking who wanted to join me in a shopping trip at 10AM to restock the fridge. In the email, I estimated that it would take between 60 and 90 minutes. Two carfuls of engineers left at 10AM sharp and returned a little before noon and restocked the fridges.
That was the first and last time we had to do it, as the soft drinks returned the following week.
It was Bill fucking Atkinson. Not a disposable random contractor you hire by the dozen when you need to build more CRUD APIs.
At that time at Apple, even as an IC, Bill had lines of communication to Steve and was extremely valued. There's absolutely no doubt he could get "middle manager shenanigans" gone simply by not complying or "maliciously complying". Hell, I've seen ICs far less valuable, or even close to negative value get away with stunts far worse than these, succeed and keep their jobs. Out of all the stories in Folklore.org, this is the one you have an issue with?!
Most people on this forum know who Bill Atkinson was. The story's premise, that he wrote negative code, isn't my gripe; I am sure it happened.
The outcome where all of a sudden leadership just shit its pants and doesn't communicate at all and never followed up... It's like writing "and then everyone clapped" for programmers.
I've been playing with Claude 4 Sonnet in VS Code and found it quite good. As part of the development plan, on its own it had included an optimization phase where it profiled my go code, identified some hot spots, and proposed ways to optimize it. For the most critical area, it suggested using a prefix tree which it wrote on the spot, adapted to my code, wrote benchmarks to compare the old and new version (5x improvement). Then it wrote specific tests to make sure both versions behaved the same. Then it made sure all my other tests passed and wrote a report on everything.
There were three performance optimizations in total, one of which I rejected because the gain was minimal for the typical use case, and there are still some memory allocation optimizations which I have deferred because I'm in the middle of a major refactor of the code. The LLM has already written down plans to restart this process later when I have more time.
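I have no idea what the generated Go code actually looked like, but the data structure it names is just a trie; a minimal Python sketch of the idea (insert strings once, then answer prefix queries without rescanning the whole set):

```python
class TrieNode:
    __slots__ = ("children", "terminal")
    def __init__(self):
        self.children = {}       # char -> TrieNode
        self.terminal = False    # True if a complete key ends at this node

class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word: str) -> None:
        node = self.root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.terminal = True

    def has_prefix(self, prefix: str) -> bool:
        node = self.root
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return False
        return True

t = Trie()
for w in ["negative", "net", "node"]:
    t.insert(w)
print(t.has_prefix("ne"), t.has_prefix("mac"))   # True False
```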
I've been watching my colleagues' adoption of Copilot with interest. From what I can tell, the people who are the most convinced that it improves their productivity have an understanding of developer productivity that is very much in line with that of the managers in this story.
Recently I refactored about 8,000 lines of vibe-coded bloat down into about 40 lines that ran ten times as fast, required 1/20 as much memory, and eliminated both the defect I was tasked with resolving and several others that I found along the way. (Tangentially, LLM-generated unit tests never cease to amaze me.) The PHBs didn't particularly appreciate my efforts, either. We've got a very expensive Copilot Enterprise license to continue justifying.
There will be vibe and amateur banged out hustle trash, which will be the cheap plastic cutlery of the software world.
There will be lovingly hand crafted by experts code (possibly using some AI but in the hands of someone who knows their shit) that will be like the fine stuff and will cost many times more.
A lot of stuff will get prototyped as crap and then if it gets traction reimplemented with quality.
I don’t believe your numbers unless your colleagues are exceptionally bad programmers.
I’m using AI a lot too. I don’t accept all the changes if they look bad. I also keep things concise. I’ve never seen it generate something so bad I could delete 99 percent of it.
Every now and then, in between reasonable and almost-reasonable suggestions, Copilot will suggest a pile of code, stylistically consistent with the function I’m editing, that extends clear off the bottom of the page. I haven’t been inspired to hit tab a couple of times and try to reverse engineer the resulting vomit of code, but I can easily imagine a new programmer accepting the code because AI! or, perhaps worse, hitting tab without even noticing.
One of my best commits was removing about 60K lines of code, a whole "server" (it was the early 2000s) that had to hold all of its state in memory, and replacing it with about 5K lines of logic that were lightweight enough to piggyback onto another service and had no in-memory state at all. That was a pure algorithmic win - figuring out that a specific guided subgraph isomorphism, where the target was a tree (a directed, acyclic graph with a single root), was possible in a single walk through the origin (general) directed bi-graph, while emitting vertices and edges to the output graph (the tree) and maintaining only a small in-process peek-able stack of the steps taken from the root that could affect the current generation step (not necessarily just the parent path).
I still remember the behemoth of a commit that was "-60,000 (or similar) lines of code". Best commit I ever pushed.
Those were fun times. I haven't done anything algorithmically impressive since.
I’m a hobby programmer and lucky enough to script a lot of things at work. I consider myself fairly adept at some parts of programming, but comments like these make it so clear to me that I have an absolutely massive universe of unknowns that I’m not sure I have enough of a lifetime left to learn about.
I want to believe a lot of these algorithms will "come to you" if you're ever in a similar situation; only later will you learn that they have a name, or that there are books written about them, etc.
But a lot is opportunity. Like, I had the opportunity to work on an old PHP backend, 500ms - 1 second response times (thanks in part to it writing everything to a giant XML string which was then parsed and converted to a JSON blob before being sent back over the line). Simply rewriting it in naive / best practices Go changed response times to 10 ms. In hindsight the project was far too big to rewrite on my own and I should have spent six months to a year trying to optimize and refactor it, but, hindsight.
Read some good books on data structures and algorithms, and you'll be catching up with this sort of comment in no time. And then realise there will always be a universe of unknowns to you. :-) Good luck, and keep going.
(More than?) half of the difficulty comes from the vocabulary. It’s very much a shibboleth—learn to talk the talk and people will assume you are a genius who walks the walk.
A lot of it is just technical jargon. Which doesn't mean it's bad; one has to have a way to talk about things. But the underlying logic, I've found, is usually graspable for most people.
It's the difference between hearing a lecture from a "bad" professor in Uni and watching a lecture video by Feynman, where he tries to get rid of scientific terms, when explaining things in simple terms to the public.
As long as you get a definition for your terms, things are manageable.
I've been coding for a living for 10 years and that comment threw me for a loop as well. Gotta get to studying some graph theory I guess?
it’s just graph theory nomenclature. if you study an intro to graph algorithms it would get you most of the way there.
You could've figured out this one with basic familiarity with how graphs are represented, constructed, and navigated, and just working through it.
One way to often arrive at it is to just draw some graphs, on paper/whiteboard, and manually step through examples, pointing with your finger/pen, drawing changes, and sometimes drawing a table. You'll get a better idea of what has to happen, and what the opportunities are.
This sounds "Then draw the rest of the owl," but it can work, once you get immersed.
Then code it up. And when you spot a clever opportunity, and find the right language to document your solution, it can sound like a brilliant insight that you just pulled out of the air because you are so knowledgeable and smart in general - when actually you had to work through that specific problem, to the point you understood it, like Feynman would want you to.
I think Feynman would tell us to work through problems. And that Feynman would really f-ing hate Leetcode performance art interviews (like he was dismayed when he found students who'd rote-memorize the things to say). Don't let Leetcode asshattery make you think you're "not good at" algorithms.
I guess you're the reason we get asked all those "Invert a binary tree" type questions these days!
Jokes aside, could I get a layman's explanation of the graph theory stuff here? Sounds pretty cool but the terminology escapes me
I deleted an entire micro service of task runners and replaced it with a library that uses setTimeout as the primitive driving tasks from our main server.
It’s because every task was doing a database call but they had a whole repo and aws lambdas for running it. Stupidest thing I’ve ever seen.
> I deleted an entire micro service of task runners and replaced it with a library that uses setTimeout as the primitive driving tasks from our main server.
Your example raises some serious red flags. Did it ever dawn on you that the reason these background tasks were offloaded to a dedicated service might have been to shed this load from your main server and protect it from sudden peaks in demand?
Am I mistaken? Is what you say even possible?
Given two graphs, one of which is a tree, you cannot determine whether the tree is a subgraph of the other in a single walk-through, can you?
It’s only possible if you’re given additional information? Like a starting node to search from? I’m genuinely confused?
Take a look at Carl Hewitt's Same-Fringe solution, which flattens structures concurrently and compares the final (aka leaf) nodes:
http://www.nsl.com/papers/samefringe.htm
If you flatten both of your trees/graphs and regard the output as strings of nodes, you reduce your task to a substring search.
Now if you want to verify that the structures, and not just the leaf nodes, are identical, you might be able to encode structure information into your strings.
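A toy sketch of that flatten-and-search idea for ordered trees, assuming node labels never contain the delimiter characters: a marker before each label plus brackets around each child list means a substring match can only line up with a whole subtree.

```python
def serialize(node):
    """node = (label, [children]); '|' marks label starts and brackets preserve shape."""
    label, children = node
    return f"|{label}[" + "".join(serialize(c) for c in children) + "]"

def contains_subtree(haystack_root, needle_root) -> bool:
    # Reduces "is this a subtree?" to an ordinary substring search on the flattened forms.
    return serialize(needle_root) in serialize(haystack_root)

tree   = ("a", [("b", [("d", [])]), ("c", [])])
needle = ("b", [("d", [])])
print(contains_subtree(tree, needle))   # True
```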
I oversimplified. See https://news.ycombinator.com/item?id=44390701
Hi I'm a mathematician with a background in graph theory and algorithms. I'm trying to find a job outside academia. Can you elaborate on the kind of work you were doing? Sounds like I could fruitfully apply my skills to something like that. Cheers!
Look into quantitative analyst roles at finance firms if you’re that smart.
There’s also a role called algorithms engineer in standard tech companies (typically for lower-level work like networking, embedded systems, or graphics), but the lack of an engineering background may hamstring you there. Engineers working in crypto also use a fair bit of algorithms knowledge.
I do low level work at a top company, and you only use algorithms knowledge on the job a couple of times a year at best.
You can try to get a job at an investment bank, if you're okay with heavy slogging in terms of hours, which I have heard is the case there, although that could be wrong.
I heard from someone who was in that field that the main qualification for such a job is analytical ability and mathematics knowledge, apart from programming skills, of course.
That was about 20 years ago. Not much translates to today's world. I was in the algorithms team working on a CMDB product. Great tech, terrible marketing.
These days it's very different, mostly large-ish distributed systems.
I would love a little more context on this, cause it sounds super interesting and I also have zero clue what you’re talking about. But translating a stateful program into a stateless one sounds like absolute magic that I would love to know about
He has two graphs. He wants to determine if one graph is a subgraph of the other.
The graph that is to be determined as a subset is a tree. From there he says it can be done in an algorithm that only traverses every node at most one time.
I’m assuming he’s also given a starting node in the original graph and the algorithm just traverses both graphs at the same time starting from the given start node in the original graph and the root in the tree to see if they match? Standard DFS or BFS works here.
I may be mistaken, because I don't see any other way to do it in one walk-through unless you are given a starting node in the original graph.
To your other point, the algorithm inherently has to also be stateful. All traversal algorithms for graphs have to have long-term state, simply because if you're at a node in a graph and it has, like, 40 paths to other places, you can literally only go down one path at a time, and you have to statefully remember that the node has another 39 paths that you have to come back to later.
The target being a tree is irrelevant right? It’s the “guided” part that makes a single walk through possible?
You are starting at a specific node in the graph and saying that if there’s an isomorphism the target tree root node must be equivalent to that specific starting node in the original graph.
You just walk through the original graph following the pattern of the target tree, and if something doesn't match it's false, otherwise true? Am I mistaken here? Again, the target being a tree is a bit irrelevant. This will work for any subgraph as long as you are also given starting-point nodes for both the target and the original graph?
I oversimplified. See https://news.ycombinator.com/item?id=44390701
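For what it's worth, here is a toy sketch of the simplified version being discussed in this subthread (not the original poster's actual algorithm, which he says was more involved): given a start node in the host graph and a pattern tree whose edges carry labels, you just follow the tree, and every edge the tree demands must exist out of the current graph node.

```python
def guided_match(graph, start, pattern):
    """graph: {node: {edge_label: neighbor}}; pattern: {edge_label: sub-pattern}.
    Walks the host graph guided by the pattern tree, visiting only the nodes
    the pattern asks about, and reports whether the whole tree can be traced."""
    for label, subpattern in pattern.items():
        neighbor = graph.get(start, {}).get(label)
        if neighbor is None:
            return False
        if not guided_match(graph, neighbor, subpattern):
            return False
    return True

# Toy host graph with labeled edges, and a pattern tree: root -a-> . -c-> . , root -b-> .
graph = {
    "r": {"a": "x", "b": "y"},
    "x": {"c": "y"},
    "y": {},
}
pattern = {"a": {"c": {}}, "b": {}}
print(guided_match(graph, "r", pattern))   # True
```

With multiple candidate edges per label you would need backtracking, which is presumably where the "peek-able stack of steps taken from the root" in the original comment comes in.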
Nice when you turn an entire server into a library/executable.
>Those were fun times. Hadn't done anything algorithmically impressive since.
the select-a-bunch-of-code-and-then-zap-it-with-the-Del-key is the best hardware algorithm.
Sounds interesting. Have you written about it in more detail somewhere?
See https://news.ycombinator.com/item?id=44390701
What did the software product do?
The product was a CMDB, with great tech and terrible marketing.
I'm sure that with the impending tide of slop-code, we'll have many more things to delete in our lifetimes.
[flagged]
I'm sick and tired of all these AI generated comments. Oh you got the AI to use lower case! Wow it still writes the exact same way.
On a medium-sized system that isn't young and fresh, deleting 60 KLOC is highly unlikely to reflect a "system rethink".
Is this, from elsewhere in the thread, a system rethink, https://github.com/dotnet/runtime/pull/36715/files ?
I've worked on a product that reinvented parts of the standard library in confusing and unexpected ways, meaning that a lot of the code could easily be compacted 10-50 times in many places, i.e. 20-50 lines could be turned into 1-5 or so. I argued for doing this and deleting a lot of the code base, which didn't take hold before me and every other dev except one left. Nine months after that they had deleted half the code base out of necessity, roughly 2 MLOC to 1 MLOC, because most of it wasn't actually used much by the customers and the lone developer just couldn't manage the mess on his own.
I wouldn't call that a system rethink.
In college I worked for a company whose goal was to prove that their management techniques could get a bunch of freshmen to write quality code.
They couldn't. I would go find the code that caused a bug, fix it, and discover that the bug was still there. Because previous students, rather than add a parameter to a function, would make a copy and slightly modify it.
I deleted about 3/4 of their code base (thousands of lines of Turbo Pascal) that fall.
Bonus: the customer was the Department of Energy, and the program managed nuclear material inventory. Sleep tight.
> make a copy and slightly modify it
In addition to not breaking existing code, it also has the added benefit of boosting personal contribution metrics in the eyes of management. Oh, and it's really easy to revert things - all I have to do is find the latest copy and delete it. It'll work great, promise.
I mean…when you have a pile of spaghetti, there is only so much you can do.
Immutable functions! I guess that’s one way of doing functional programming /s
I work with someone who has a habit of code duplication like this. Typically it’s an effort to turn around something quickly for someone who is demanding and loud. Refactoring the shared function to support the end edge case would take more time and testing, so he doesn’t do it. This is a symptom of the core problem.
I've been getting stricter about not letting that stuff into the codebase. They always say they'll clean it up later but they never do.
I have a habit of doing this for data processing code (python, polars).
For other code it's an absolute stink and i agree. But for data transforms... I've seen the alternative, a neatly abstracted in-house library of abstracted combinations of dataframe operations with different parameters and.. It's the most aesthetically pleasing unfathomable hell I've ever experienced.
So now, when munging dataframes, i will be much faster to reach for 'copy that function and modify it slightly' - maintenance headache, but at least the result is readable.
But it's a false premise; the claim is that just copy/pasting something is faster, but is it really?
The demanding / loud person can and should be ignored; as a developer, you are responsible for code quality and maintainability, not your / their manager.
> I work with someone who has a habit of code duplication like this.
Are you sure it's code duplication?
I mean, read your own description: the new function does not need to support edge cases. Having to handle edge cases is a huge code smell, and a clear sign of premature generalization.
And you even admit the guy was more productive and added fewer bugs?
There is a reason why the mistakes caused by naive approaches to Don't Repeat Yourself (DRY) are corrected with Write Everything Twice (WET).
This reminds me of my experience. I've worked for one company based in SEA that had almost identical portals in several countries in the region. Portals were developed by an Australian company and I was hired to maintain existing/develop new portals.
Source code for each portal was stored in a separate Git repository. I asked the original authors how I was supposed to fix bugs that affected all the portals, or develop new functionality for all of them. The answer was to backport all fixes manually to every copy of the source code.
Then I asked: isn't it possible to use a single source repository and use feature flags to customize the appearance and features of each portal? The original authors said that it was impossible.
In 2-3 months I had merged the code of 4-5 portals into one repository, added feature flags, and upgraded the framework version. The release went flawlessly, and it became possible to fix a bug simultaneously for all the portals or develop new functionality available across all the countries where the company operated. It was a huge relief for me, as copying bugfixes manually was a tedious and error-prone process.
I once had to deal with some contractors who habitually did this; when confronted about how this could lead to confusion, they said "that's what Ctrl+F is for."
Oh boy! This reminded me of one of my worst tech leads. He pushed secret tokens to GitHub. When I asked in the team meeting why we would do this instead of using a secrets manager, the response was: "These are private repos. Also, we signed an NDA before joining the company."
Was this in Blacksburg by any chance?
It was indeed! Back in the late 80s. You know of it?
It was so long ago it feels half mythical to me.
> Bonus: the customer was the Department of Energy, and the program managed nuclear material inventory. Sleep tight.
These are my favorite (in a sense) programmer stories--that there's these incomprehensible piles of rubbish that somehow, like, run The World and things, and yet somehow things manage to work (in an outwardly observable sense).
Although, I recall two somewhat recent stories where this wasn't the case: the unemployment benefits fiascos during the early Covid era, and some more recent air traffic control-related things (one of which affected me personally).
Related. Others?
Negative 2000 Lines of Code (1982) - https://news.ycombinator.com/newsfaq.html
In addition to it being fun to revisit perennials sometimes (though not too often), this is also a way for newer cohorts to encounter the classics for the first time—an important function of this site!
I am a simple man. I see -2k lines of code, I upvote.
I've told this story to every client who tried schemes to benchmark productivity by some single-axis metric. The fact that it was Atkinson demonstrates that real productivity is only benchmarkable by utility, and if you can get a truly accurate quantification for that then you're on the shortlist for a Nobel in economics.
Important enough to re-state whenever it arises - once you have 2 or more axes/dimensions, you no longer have a linear ordering. You need to map back to a number line to "compare". This is the motivation or driving force toward your "single axis". { That doesn't mean it's a goal any easier to realize, though. I am attempting to merely clarify/amplify rather than dispute here.. }
This story is particularly relevant now, as Bill passed away 3 weeks ago. There was a post about this on the front page at the time:
Bill Atkinson has died - https://news.ycombinator.com/item?id=44210606 - June 7, 2025 (277 comments)
I didn't see that post, but I'm glad we're able to remember Bill through humorous anecdotes and eternally relevant lessons like this.
I figure that articles like the folklore ones are like an amusing video clip (say, someone chopping the skin off a watermelon) that gets passed around reddit repeatedly.
An old Dilbert cartoon had the pointy haired boss declare monetary rewards for every fixed bug in their product. Wally went back to his desk murmuring "today I'm going to code me a minivan!"
https://i.imgur.com/tyXXh1d.png
My manager has it pinned on the breakroom wall.
The Perverse incentive: https://en.wikipedia.org/wiki/Perverse_incentive
Now I'm wondering if this story[0] I read long ago is just a written form of the comic, or if any company actually tried this.
[0]: https://thedailywtf.com/articles/The-Defect-Black-Market
Goodhart's law - When a measure becomes a target, it ceases to be a good measure
Sorry what's the minivan reference?
it's just a stand-in for "expensive but relatable purchase". He's saying "I'm about to write so many bugs that the sum reward will be in the tens of thousands"
I assume a cycle of write bug -> fix bug -> get paid until they can afford a new car!
It would have been a sports car but Wally’s not the type.
I've become something of the guy that's the main code remover at my current job. Part of it is because I've been here the longest on the team, so I've got both the knowledge and the confidence to say a feature is dead and we can get rid of it. But also part of it is just being the one to go in and clean up things like release flags after they've gone live in prod.
I'm trying to socialize my team to get more in the habit of this, but it's been hard. It's not so much that I get pushback, it's just that tasks like "clean up the feature flag" get thrown into the tech debt pile. From my perspective, that's feature work, it just happens to take place after the feature goes live instead of before. But it's work that we committed to when we decided to build the feature, so no, you don't get to put it on the tech debt board like it was some unexpected issue that came up during development.
Curious to hear other perspectives here, I do worry that I'm a bit too dogmatic about this sometimes. Part of it maybe comes from working in shared art / maker spaces a lot in the past, where "clean up your shit" was rule #1, and I kind of see developers leaving unused code throughout the codebase for features they owned through the same lens.
I probably spend 30% of my time on refactoring. Deduplicating common things different people have done, adding separating layers between old shitty code and the fancy new abstractions, adding friction to some areas to discourage crossing module boundaries, that sort of thing.
For some reason new devs keep telling me how easy it is to implement features.
Really wonder why that is. The managers keep telling me that refactoring is a nice-to-have thing and not necessary and maybe we have time next sprint.
You just have to do it without telling anyone; it improves velocity for everyone. It's architecture work on the small scale.
On days I write code, I try to do one "cleanup" PR a day just to get myself warmed up. Sometimes it is removing a feature flag, sometimes it is rewriting a file to use some new standards like a better logger library or test pattern. None of this is ticketed work, and if something takes longer than ten minutes or so I drop it and work on whatever I was going to work on originally. Make (trivial) cleanups a fun treat and a break from real work and it is easier to get other people excited about them.
Of course, lately anything trivial I ask codex to do - but there is still fun in figuring out what trivial thing I should have it take on next.
Cleanup doesn't get me a raise or promoted. In a world with constant threats of layoffs, cleanup may even be penalized depending on what's rewarded. "Clean up your shit" doesn't work when my job is on the line.
It needs to be rewarded properly to be prioritized.
> I do worry that I'm a bit too dogmatic about this sometimes
I haven't seen a lot of other good suggestions for how to accomplish this, so maybe you're being just the right amount of dogmatic.
Cleaning up feature flags was something that I excelled at failing to do. If you are the one cleaning them up, then you, sir, deserve a raise. Don't question it. It's a service.
> the tech debt board
Taking you to literally mean you have a separate board for tech debt, that's your problem right there.
Well, we prioritize amongst the tech debt on that board and then move it onto the main board for sprint, it's not like it's a completely separate process. Things do go there to die sometimes though.
This is a good example[1] of a 64k LOC removal. We removed built-in support for C# + WinRT interop on Windows and instead required users to use a source-generation tool (which is still the case today). This was a breaking change. We realized we had one chance to do this and took it.
[1] https://github.com/dotnet/runtime/pull/36715/files
I think of this story every time I see a statistic about how much LLMs have "increased the productivity" of a developer
Don't be too hard on AI, it can delete code too!
https://forum.cursor.com/t/cursor-yolo-deleted-everything-in...
I love the way the "Community Ambassador" steps in and offers solutions to this problem after it has happened.
Or the current industry favourite, “X% of our new code is now written by AI!”
Microsoft, the number being 30%; whether that's accurate is another matter. Twenty years ago people already used IDEs to generate boilerplate code (remember Java's getters/setters/hashCode/toString?) because some guy in a book said you had to.
I use AI to simplify code. My manifesto has always been code is debt. Works really well too.
In Google's case, outside of LLMs I've always wondered how much code was generated by the protocol buffer compiler.
LOL. There was a time when people were excoriated for committing generated object code into version control..
Once you include the cost of building and maintaining new nuclear power plants, the developers' "efficiency" figures become absurd.
About 1.5 years ago I inherited a project with ~ 250,000 lines of code - in just the web UI (not counting back end).
The developer who wrote it was a smart guy, but he had never worked on any other JS project. All state was stored in the DOM in custom attributes, .addEventListeners EVERYWHERE... I joke that it was as if you took a monk, gave him a book about javascript, and then locked him in a cell for 10 years.
I started refactoring pieces into web components, and after about 6 months had removed 50k lines of code. Now knowing enough about the app, I started a complete rewrite. The rewrite is about 80% feature parity, and is around 17k lines of code (not counting libraries like Vue/pinia/etc).
So, soon, I shall have removed over 200,000 LOC from the project. I feel like I should retire after that, as I will never top it.
> The rewrite is about 80% feature parity, and is around 17k lines of code (not counting libraries like Vue/pinia/etc).
This is exactly where these comparisons break down. Obviously you don't need as much code to get passable implementations of a fraction of all the features.
It's definitely a good argument for not reinventing the wheel though.
I'd rather have 250,000 lines of code where 230,000 of them are in battle-tested libraries and only 20,000 lines are what we ever need to read/write.
>> is about 80% feature parity, and is around 17k lines of code
You make a fair point that a basic framework can be expressed with much less code.
And that the remaining 20% probably contains more edge cases with proportionally more code.
But do you think the last 20% will eventually make up anywhere near 233k lines of code?
The real savings here come from rewriting: seeing all the common denominators and knowing what's ahead.
I mean, you can get basic implementations of Vue and state management libs in a few hundred (maybe thousand?) LOCs (lots of examples on the interweb) that are probably less "toyish" than whatever this person had handrolled
> I joke that it was as if you took a monk, gave him a book about javascript, and then locked him in a cell for 10 years.
I've had a similar experience (see my other comment). The original author was a junior developer at best in terms of skill, but unfortunately also a middle-aged, "experienced" developer, one of the founders of the company, and very productive. But obviously not someone who had ever worked in a team or had someone else work on their codebase.
Think functions thousands of lines long, nested switch/case/if/else/ternary things ten levels deep, concatenated SQL queries (it was PHP because of course), concatenated JS/HTML/HTML-with-JS (it was Dojo front-end), no automated tests of any sort, etc.
A long time ago I was working on a big project where the PLs came up with the most horrible metric I've ever seen. They made a big handwritten list, visible to the whole team, where they marked for each individual developer how many bugs they had fixed and how many bugs they had caused.
I couldn't believe my eyes. I was working on my own project beside this team with the list, so thankfully I was left out of the whole disaster.
A guy I knew wasn't that lucky. I saw how he suffered from this harmful list. Then I told him a story about the Danish film director Lars von Trier that I had recently heard. von Trier was going to be included in a "canon" of important Danish artists that the government was responsible for. He then made a short film in which he took the Danish flag (red with a white cross), cut out the white cross, and stitched the pieces together again, forming a red communist flag. von Trier was immediately made persona non grata and removed from the "canon".
Later that day my friend approached the bugs caused/fixed list, cut out his own line, taped the list back together and put it on the wall again. I will never forget how a PL came into the room later and stood gazing at the list for a long time before he realized what had happened. "Did you do this?" he asked my friend. "Yes," he answered. "Why?" asked the PL. "I don't want to be part of that list," he answered. The next day the list was gone.
A dear memory of successful subversion.
I'm having a lot of trouble visualizing both the flag and the list modifications.
The Danish flag is a white cross on a red background. If you cut out the white cross, you are left with four rectangles of red, which can be pushed together and sewn up again, forming a solid red flag.
Took me two reads but he cut his line out of the list, taped it back together and replaced the list on the wall, without his line.
> "I don't want to be part of that list"
Simple, to the point, love it. "I'm not playing your stupid management games".
In the days when Perl was the language of choice for the web, I got a 97% reduction in code size. I was asked to join a late project to speed it up. (Yes, I know that has a low success rate.)
The lead dev was a hard-core C programmer and had no Perl experience before this job. He handed me a 200-line, uncommented function that he had written and that was not working. It was a pattern matcher. I replaced it with 6 lines of commented Perl with a regex that was very readable (for a regex).
Since he had no idiomatic understanding of Perl he did not accept it and complained to management. We had to bring in the local Perl demigod to arbitrate (at 21 he was half my age at the time, but smart as a whip). He ruled in my favor and the lead was pissed.
Was he unaware of regex.h?
https://www.man7.org/linux/man-pages/man3/regcomp.3p.html
Doing regex in C back in the day was not very common and far from idiomatic, unlike Perl, where it's basically expected that you cram regexes in anywhere you can.
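The original six lines of Perl aren't quoted in the story, but purely as an illustration of the kind of collapse a single commented regex buys over a hand-rolled scanner, here is a hypothetical Python sketch (the input format and the names are made up):

    import re

    # Hypothetical stand-in for a hand-written, character-by-character
    # matcher: pull "key = value" pairs (value optionally quoted) out of
    # a config-ish line with one commented regex.
    PAIR = re.compile(r"""
        (?P<key>\w+)            # bare identifier key
        \s*=\s*                 # equals sign, optional whitespace
        (?P<value>"[^"]*"|\S+)  # quoted string or bare token
    """, re.VERBOSE)

    line = 'host = "db01.example.com" port = 5432 retries=3'
    print({m["key"]: m["value"].strip('"') for m in PAIR.finditer(line)})
    # {'host': 'db01.example.com', 'port': '5432', 'retries': '3'}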
Before a recent annual performance review, I looked over my stats in the company monolith repo and found out I was net-negative on lines of code. That comes mostly from removing auto-generated API code and types (the company had moved to a new API and one of my projects was removing v1) but it was quite funny to think about going to work every day to erase code.
In an old ops role we had a metric called ticket touches. One of my workmates had almost double that of everyone else, but only for that metric. We had a look, and it was due to how he wrote notes in tickets: instead of putting all his findings in one comment, he would add them incrementally as he went along. Neither of these ways was wrong; it just inflated that stat for him.
"Every line of code not written is a correct one".
One of the early Ruby Koans, IIRC, circulated on comp.lang.ruby around 2002
Should have said “Every line of Ruby code not written is a correct one”
I nominate Java for that particular category
Hi,
I think I have mentioned this before on HN too. I am not from a CS background and just learnt the trade as I was doing the job, I mean even the normal stuff.
We have a project that tries to reify live objects into a human-readable form. The final representation is quite complicated, with a lot of types; the initial representation is less complicated.
In order to make it readable, if there are any common or similar data nodes, we have to compare them and try to combine them, i.e. find places that can be turned into methods and find the relevant arguments for all the calls (kind of).
The initial implementation did the transformation into the final form first, and then started the comparison. So the comparison had to deal with all the different combinations of types in the final representation, which made the whole thing so complex, and it had been maintained by generations of engineers to the point that nobody had a clear idea how it worked.
Then I read about how hashmaps are implemented (yep, I am that dumb) and it was a revelation. So we did the following things:
1. We created a hash of the skeleton that has to remain the same through the whole set of comparisons and transformations of the "common nodes" (it can be thought of as something similar to methods or arguments), and only did the comparison for nodes with matching skeletal hashes, and
2. created a separate layer that does the comparison and creates common nodes on the initial, primitive form, and only then does the transformation as a second layer (so you don't have to deal with all the types in the final representation), and
3. don't add types. Yes. Data is the simplest abstraction, and if your logic can be made into data or some properties, please do yourself a favor and make it so. We found a lot of places where weird class hierarchies could be converted into data properties.
Basically, it is a dumb multi-pass decompiler.
That did not just speed up the process; it also resulted in much more readable and understandable abstractions and code. I do not know if this is widely useful, but it helped in one project. There is no silver bullet, but types were the actual problem for us, and so we solved it this way. (A rough sketch of the skeletal-hash idea is below.)
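Not the original project's code, but a minimal Python sketch of the skeletal-hash idea as I understand it from the description above: hash only the structure of each node, bucket nodes by that hash, and run the expensive field-by-field comparison only inside a bucket. The names Node, skeleton_hash, and group_candidates are hypothetical.

    import hashlib
    from collections import defaultdict
    from dataclasses import dataclass, field

    @dataclass
    class Node:
        kind: str                      # e.g. "call", "literal", "record"
        value: object = None           # leaf payload; ignored by the skeleton
        children: list = field(default_factory=list)

    def skeleton_hash(node):
        """Hash only the structure (kinds and shape), not leaf values,
        so nodes that could be merged into one "common node" collide."""
        h = hashlib.sha1()
        def walk(n):
            h.update(n.kind.encode())
            h.update(str(len(n.children)).encode())
            for c in n.children:
                walk(c)
        walk(node)
        return h.hexdigest()

    def group_candidates(nodes):
        """Bucket nodes by skeletal hash; only nodes inside the same
        bucket need the expensive field-by-field comparison."""
        buckets = defaultdict(list)
        for n in nodes:
            buckets[skeleton_hash(n)].append(n)
        return {k: v for k, v in buckets.items() if len(v) > 1}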
There are still companies asking for # of lines of code written https://x.com/tregoning/status/1286329086176976896
This is great actually! It's short-circuit evaluation for me to not waste my time applying.
Just today I committed a +0/-2k LOC change, removing two months of my coworker's contribution, because it had to be rewritten. Best feeling ever.
Adding that code was not a waste even. You don't have to work every line of code like a mule. Code ...is... thinking.
I am net negative in lines of code added to two different companies I've worked for. I wear that proudly.
Great collection of stories! Thanks for sharing. I got carried away across the pages and relished the quotes page.
The ideals probably worked for that time and that place. Many places in other parts of the world, and at other times, would have different ideals to deal with different priorities. America in the 80s had no survival struggle, wars, cultural stigmas, pandemics or famines. Literacy and business were booming. Great minds and workers were lured with great promises. A natural result is accelerated innovation. Plenty of food and materials. Individualism, fun and luxury were the goal for most. The businesses delivered all of it. Personal computing was an exact fit for that business.
I am currently working on a piece of code which I am actively trying to simplify and make smaller. It's not a piece of code which has any business ever getting larger or having more features. The design is such that it is feature complete until a whole system redesign is required at which point the code would be itself wholesale replaced. So I am sitting here trying to codegolf the code down in complexity and size. It's not just important to keep it simple, it's also important to keep it small as this bit of code is, as part of this solution, going to be executed using python -c. All the while not taking the piss and making it unreadable (I can use a minifier for that).
Rewrite it in Rust, haha.
A more realistic end of the story would be that the form refused negative numbers so he had to put 0 and got fired.
This being 1982, I had never even considered that the form could be anything but paper.
It being 1982 and a story about the lead developer of LisaGraf, working on that very thing, it is certainly unlikely to be a GUI form.
But block mode terminals that did forms had been a thing for over a decade at that point. Not that this was likely at Apple. But there are definitely contemporary ways in which one could have been entering this stuff via a computer.
Indeed, an IBM 3270 could be told that a field was numeric. This wouldn't have the terminal prevent negative numbers. The host would have to have done that upon ENTER. But the idea of unsigned numbers in form data had been around in (say) COBOL PIC strings since the 1960s.
* https://ibm.com/docs/en/cics-ts/5.6.0?topic=terminals-3270-f...
No, this was Bill Atkinson. He's famous enough to have his own Wikipedia page: https://en.wikipedia.org/wiki/Bill_Atkinson
> Bill Atkinson, the author of Quickdraw and the main user interface designer, who was by far the most important Lisa implementer...
> I'm not sure how the managers reacted to that, but I do know that after a couple more weeks, they stopped asking Bill to fill out the form, and he gladly complied.
Notice that it doesn't say "they stopped using the form" but "they stopped asking Bill to fill out the form". The rules are different at the top, they probably still used it to mis-manage junior employees who didn't have as much influence.
He put in -2000, and it recorded it as 4294965296
in 1982, I'd expect 63536
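For anyone checking the arithmetic, those are just the unsigned reinterpretations of -2000 at different word widths; an illustrative Python check:

    # -2000 reinterpreted as an unsigned integer of different widths
    print(-2000 & 0xFFFFFFFF)  # 4294965296 with 32 bits
    print(-2000 & 0xFFFF)      # 63536 with 16 bits, more plausible for 1982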
Got laid off recently due to basically this.
I think lines of code could be an interesting and valuable metric.
The lower (even negative) the score, the better (given a fixed set of features).
“Perfection is achieved, not when there is nothing more to add, but when there is nothing left to take away.” ― Antoine de Saint-Exupéry, Airman's Odyssey
That just encourages bad behaviour in the other direction though. A massive multi-level nested ternary on one line is usually going to be worse than a longer but clearer set of conditions. Trying to make code brief can be good, but it can often result in an unmaintainable and hard to read mess.
That results in the same AST, it's not really taking anything away
Halstead complexity ([0]) might interest you. It's a semi-secret sauce based on the count of operators and operands that tells you if your function is "shitty".
[0]: https://en.wikipedia.org/wiki/Halstead_complexity_measures
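Purely as an illustration (simplified, and not the exact definitions any particular tool uses), the core Halstead measures fall out of four counts, sketched here in Python:

    import math

    def halstead(n1, n2, N1, N2):
        """Basic Halstead measures from operator/operand counts.
        n1, n2: distinct operators / operands; N1, N2: total occurrences."""
        vocabulary = n1 + n2
        length = N1 + N2
        volume = length * math.log2(vocabulary)
        difficulty = (n1 / 2) * (N2 / n2)
        effort = difficulty * volume
        return {"volume": volume, "difficulty": difficulty, "effort": effort}

    # Toy example: x = x + 1  ->  operators {=, +}, operands {x, x, 1}
    print(halstead(n1=2, n2=2, N1=2, N2=3))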
You'll never get it right though. Focusing on "features per LOC" means people will write shorthand, convoluted code, pick obscure, compact languages, etc. To use an adage, if a metric becomes a target, it stops being a useful metric.
An opinionated formatter plus a smart line counter fix this problem. There's still space for abuse but not enough to overshadow genuine improvements.
I often have a mental picture of the thing I need, I start writing it, get a bit "stuck" on architecture, and think I could be using a ready-made library for this. I find one or a few of them, look at the code (which is obviously more generic) and realize it's many times as large as I thought the entire project should be. With few exceptions the train of thought doesn't even reach the "Do I want to carry this around and babysit it?" stage. Somehow this continues to surprise me every time.
These 5 lines are probably my favorite example.
https://jsfiddle.net/gaby_de_wilde/c8bhcatj/7/
We had a VP recommend "lines of code" as part of performance system. I showed some CVS stats of some of our key developers; one of our best performers was at -20K lines of code over 5 years.
Would be really interesting if microservices were limited to a manageable amount of code. I wonder if technical leadership has ever guided their org devs that services shouldn't be more than ___ lines of code (excluding tests). Obviously, more services would mean more latency, but I'm pretty sure this one I'm working on now that is over 11k LOC (ignoring comments) could be replicated with a few bash commands.
That sounds even worse honestly.
One of the criticisms of microservices is that factoring a system correctly is already a hard problem, and introducing a network call between them makes it even harder.
Enforcing service LoC limits is equivalent to forcing further factoring of a system, which might not be necessary, especially not into a microservice arch.
Sometimes code is tightly coupled because it needs to be tightly coupled.
This being Lisa that's -2000 lines in 68k assembler. That's about as verbose as any real PL can ever get.
For what it's worth, here's quicksort in 5 lines of haskell https://stackoverflow.com/questions/7717691/why-is-the-minim...
That's not quicksort, though, because it's not in place; the actual quicksort on that page is in https://stackoverflow.com/a/7833043, which is 11 lines of code. That's still pretty concise. My own preferred concise, or at least terse, presentation of quicksort is http://canonical.org/~kragen/sw/dev3/paperalgo#addtoc_20.
How long would a quicksort (say, of integers) be in 68000 assembly? Maybe 20 lines? My 68000 isn't very good. The real advantage of writing it in Haskell is that it's automatically applicable to anything that's Ord, that is, ordered.
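To make the contrast concrete (illustrative Python, not the Haskell or 68k code discussed here): the short version allocates fresh lists at every level, while a "true" quicksort partitions the array in place.

    def quicksort_copy(xs):
        """Short but not in place: builds new lists at every level."""
        if len(xs) <= 1:
            return xs
        pivot, rest = xs[0], xs[1:]
        return (quicksort_copy([x for x in rest if x < pivot])
                + [pivot]
                + quicksort_copy([x for x in rest if x >= pivot]))

    def quicksort_inplace(xs, lo=0, hi=None):
        """Hoare-style partition, sorting xs in place."""
        if hi is None:
            hi = len(xs) - 1
        if lo >= hi:
            return
        pivot = xs[(lo + hi) // 2]
        i, j = lo, hi
        while i <= j:
            while xs[i] < pivot:
                i += 1
            while xs[j] > pivot:
                j -= 1
            if i <= j:
                xs[i], xs[j] = xs[j], xs[i]
                i += 1
                j -= 1
        quicksort_inplace(xs, lo, j)
        quicksort_inplace(xs, i, hi)

    xs = [5, 3, 8, 1, 9, 2]
    quicksort_inplace(xs)
    print(xs)  # [1, 2, 3, 5, 8, 9]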
> How long would a quicksort (say, of integers) be in 68000 assembly?
About 70 lines, once you strip out the comments and blank lines.
https://github.com/historicalsource/supermario/blob/9dd3c4be...
Not true quicksort though :)
That's the problem with comparing lines of code: you're comparing apples and oranges. In this case you aren't even solving the same problem.
Are... are you comparing quicksort to... Quickdraw?
Lol - ok that's genuinely funny :). slow clap
> For what it's worth, here's quicksort in 5 lines of haskell
QuickDraw was a graphics library, not a sorting algorithm
If you think 68k assembler is 'verbose' you haven't seen x86 yet ;)
When using Claude for production code in my core code base, I'm much more picky about _how_ it's written (as opposed to one-off or peripheral code that just needs to work). I find much of my work involves deleting Claude's code and simplifying its logic. (I'm still not sure I'm saving any time in the end vs writing such code from scratch.)
Code is not an asset, it's a liability.
> I'm not sure how the managers reacted to that, but I do know that after a couple more weeks, they stopped asking Bill to fill out the form, and he gladly complied
It would be a much better story if they stopped asking everyone to fill out the form.
I did this the other day in an old component that had been refactored several times. It's incredibly satisfying. IMO removing code is often more important than adding it because it helps so much with maintainability and future development speed.
Yesterday I deleted 2500 lines out of a project of 4300 lines, lol.
I have mixed feelings, because it's so much simpler now, but the frustration of having had to write those lines in the first place is so annoying. That's what happens when specs aren't clear.
This is how you turn unwanted dependencies and the inability to do string searches into a virtue.
Lines of code is a byproduct, not the goal.
Code is an artifact, undesired debris.
The fewer lines, the better.
Well, like everything it's probably a balance to strike. Otherwise you may end up with highly golfed IOCCC-style code in production, which I would definitely not recommend.
I've seen advent of code one-liners, and sometimes more lines is better than a one-liner.
Yes, my thinking here isn't about the semantics of the code, whether it is written in dense or terse form, but about the more meta level: code itself is something we write to achieve some goal; it shouldn't be the goal in itself. As an example, when a company is audited for its valuation, it is not the amount of code itself that is valuable. In many ways it can be seen as a burden, as you'll need to maintain the code over time.
One habit I've gotten into is trying to aim for negative LOC pull requests on occasion.
It has the added benefit that I'm forced to keep the codebase fresh in my mind.
Just because lines of code are being reported doesn't mean that bigger automatically means better. It does tell a story about how one is spending their time, though.
This is one of those stories that I am sure has happened, but when it comes to "and then they never asked him again le XD face" it's clearly just made up.
Bill Atkinson recently died and there’s a great HN discussion about him. He had a good relationship with Steve Jobs; it’s reasonable to assume it’s true that he got left alone, especially if Andy Hertzfeld is the person making the assertion.
1. The site is called folklore.org. You’re sort of saying the site is true to its name.
2. It’s a direct recollection from someone who was there, not an unnamed “my cousin’s best friend” or literal folklore that is passed down by oral tradition. Andy knew Bill and was there. There is no clear motivation to tell a fictional story when there were so many real ones.
The specifics line up very well with what we know about Bill Atkinson and some of the wizardry needed to make the Mac work.
Given this, it’s much easier to assume that your assertion is what is made up.
management could have decided on a process change. Simple as that.
I get the sentiment though, "He blew management's mind so much they made an exception for him".
But, Folklore.org is a bit less onanistic than ESR's jargon file.
I've pulled stunts like this that make management realize it's easier to make an exception than to fight it.
We had free soft drinks in the fridges at one place I worked. Cost-cutting measures were coming and I sent an email to all of engineering (including the VP) asking who wanted to join me in a shopping trip at 10AM to restock the fridge. In the email, I estimated that it would take between 60 and 90 minutes. Two carfuls of engineers left at 10AM sharp and returned a little before noon and restocked the fridges.
That was the first and last time we had to do it, as the soft drinks returned the following week.
It was Bill fucking Atkinson. Not a disposable random contractor you hire by the dozen when you need to build more CRUD APIs.
At that time at Apple, even as an IC, Bill had lines of communication to Steve and was extremely valued. There's absolutely no doubt he could get "middle manager shenanigans" gone simply by not complying or "maliciously complying". Hell, I've seen ICs far less valuable, or even close to negative value get away with stunts far worse than these, succeed and keep their jobs. Out of all the stories in Folklore.org, this is the one you have an issue with?!
Most people on this forum know who Bill Atkinson is. The story's premise, that he wrote negative lines of code, isn't my gripe; I am sure it happened.
The outcome where leadership suddenly just shits its pants, doesn't communicate at all, and never follows up... It's like writing "and then everyone clapped" for programmers.
this must be paired with the '350k lines / then he discovered loops' post from the same site
https://folklore.org/Discovered_Loops.html
Some theories will cause you to have a negative performance review and get fired, haha.
Word on the street is that non-existent code doesn't crash.
A manager asking you to tell them how many lines of code you wrote is peak middle-management. Like, what are you even doing? You count them.
This just depresses me. So many programmers back then spent time optimising algorithms. Now it's slop city.
I've been playing with Claude 4 Sonnet in VS Code and found it quite good. As part of the development plan, on its own it included an optimization phase where it profiled my Go code, identified some hot spots, and proposed ways to optimize them. For the most critical area, it suggested using a prefix tree, which it wrote on the spot and adapted to my code, then wrote benchmarks to compare the old and new versions (5x improvement). Then it wrote specific tests to make sure both versions behaved the same. Then it made sure all my other tests passed and wrote a report on everything.
There were three performance optimizations in total, one of which I rejected because the gain was minimal for the typical use case, and there are still some memory allocation optimizations which I have deferred because I'm in the middle of a major refactor of the code. The LLM has already written down plans to restart this process later when I have more time.
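The project above was Go and its code isn't shown; as a rough, hypothetical illustration of what a prefix tree (trie) looks like for a hot-path prefix lookup, here's a minimal Python sketch:

    class TrieNode:
        __slots__ = ("children", "terminal")
        def __init__(self):
            self.children = {}   # char -> TrieNode
            self.terminal = False

    class Trie:
        """Minimal prefix tree: insert strings, then ask whether any
        inserted string is a prefix of a query (a common hot-path use)."""
        def __init__(self):
            self.root = TrieNode()

        def insert(self, word):
            node = self.root
            for ch in word:
                node = node.children.setdefault(ch, TrieNode())
            node.terminal = True

        def has_prefix_of(self, query):
            node = self.root
            for ch in query:
                if node.terminal:
                    return True
                node = node.children.get(ch)
                if node is None:
                    return False
            return node.terminal

    t = Trie()
    t.insert("/api/v1/")
    print(t.has_prefix_of("/api/v1/users"))  # True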
I’d be wary of doing any but the most obvious optimisations without profiling. This goes for humans or AI.
NoScript link: https://www.folklore.org/Negative_2000_Lines_Of_Code.html
Thanks. @dang update URL?
Software metrics are hard, indeed :) Be prepared, in an AI-code world, for more code not meaning better code.
I've been watching my colleagues' adoption of Copilot with interest. From what I can tell, the people who are the most convinced that it improves their productivity have an understanding of developer productivity that is very much in line with that of the managers in this story.
Recently I refactored about 8,000 lines of vibe-coded bloat down into about 40 lines that ran ten times as fast, required 1/20 as much memory, and eliminated both the defect I was tasked with resolving and several others that I found along the way. (Tangentially, LLM-generated unit tests never cease to amaze me.) The PHBs didn't particularly appreciate my efforts, either. We've got a very expensive Copilot Enterprise license to continue justifying.
I see a stratified software market in the future.
There will be vibe and amateur banged out hustle trash, which will be the cheap plastic cutlery of the software world.
There will be code lovingly hand-crafted by experts (possibly using some AI, but in the hands of someone who knows their shit) that will be like the fine stuff and will cost many times more.
A lot of stuff will get prototyped as crap and then if it gets traction reimplemented with quality.
I don’t believe your numbers unless your colleagues are exceptionally bad programmers.
I’m using AI a lot too. I don’t accept all the changes if they look bad. I also keep things concise. I’ve never seen it generate something so bad I could delete 99 percent of it.
Every now and then, in between reasonable and almost-reasonable suggestions, Copilot will suggest a pile of code, stylistically consistent with the function I'm editing, that extends clear off the bottom of the page. I haven't been inspired to hit tab a couple of times and try to reverse-engineer the resulting vomit of code, but I can easily imagine a new programmer accepting the code because AI!, or, perhaps worse, hitting tab without even noticing.
"8,000 lines of vibe-coded bloat down into about 40 lines" ... I just saw a vision of my future and shuddered.
I mean, I like killing crappy code as much as the next guy, but I don't want that to be my daily existence. Ugggh.
> Tangentially, LLM-generated unit tests never cease to amaze me.
In a good or bad way?
I've found AI pretty helpful for writing tests, especially if you already have an existing one as a template.
I would love to know the time balance between the two activities. It takes nothing to generate slop, but could be weeks to extricate it.
This is also true for human code, more often than not.