Comment by KronisLV

17 hours ago

> We know how to write software with very few bugs (although we often choose not to)

Do we, really? Because a week doesn’t go by when I don’t run into bugs of some sort.

Be it in PrimeVue (even now the components occasionally have bugs; they keep putting out new major versions, but none are truly stable and bug-free) or Vue (their SFCs did not play nicely with complex TS types), or the greater npm ecosystem, or Spring Boot or Java in general, or Oracle drivers, or whatever unlucky thread-pooling solution has to manage those Oracle connections, or kswapd acting up in RHEL-compatible distros and eating CPU to the point of freezing the whole system instead of just doing OOM kills, or Ansible failing to reload systemd service definitions, or llama.cpp speculative decoding not working for no good reason, or Nvidia driver updates bringing the whole VM down after a restart, or Django having issues with MariaDB, or just general weirdness around Celery and task management, and a million different things.

No matter where I look, up and down the stack, across different OSes and tech stacks, there are bugs. If there is truly bug-free code (or as close to that as possible), then it must be in planes or spacecraft, because when it comes to the kind of development that I do, bug-free code might as well be a myth. I don't think everyone made a choice like that; most are straight up unable to write code without bugs, often due to factors outside of their control.

> Do we, really?

Yes, or pretty close to it. What we don't know how to do (AFAIK) is do it at a cost that would be acceptable for most software. So yes, it mostly gets done for (components of) planes, spacecraft, medical devices, etc.

Totally agreed that most software is a morass of bugs. But giving examples of buggy software doesn't provide any information about whether we know how to make non-buggy software. It only provides information about whether we know how to make buggy software—spoiler alert: we do :)

  • > So yes, it mostly gets done for (components of) planes, spacecraft, medical devices, etc.

    I have to disagree here. All of the examples you mentioned regularly have bugs. Multiple spacecraft have been lost because of them. For planes there's the not-so-distant Boeing 737 MAX fiasco (admittedly that was bad software behavior triggered by a sensor failure). And for medical devices, news about their bugs pops up semi-regularly. So while the software for these might do a bit better than the rest, it is certainly nowhere close to being bug-free.

    And the same goes for the specifications the software is based on. Those aren't bug-free either, and writing software based on a flawed specification will inevitably result in flawed software.

    That's not to say we should give up on trying to write bug-free software. But we currently don't know how to do so.

  • There is a huge wetware problem too. Like if I can send you an email or other message that tricks you and gets you to send me $10k, what do I care if the industry is 100% effective at blocking RCE?

  • That software also often has bugs. It's usually a bit more likely that they are documented, though, and unlikely to cause a significant failure on their own.

    • Building around bugs that you know exist but don't know where is also part of it: reliability in the face of bugs. The mere existence of bugs isn't enough to call the software buggy if the outcome is reliable (e.g., with triple modular redundancy).

      1 reply →
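      The voting idea behind triple modular redundancy can be sketched in a few lines. This is a hypothetical illustration, not code from any real avionics system; the module functions and names are made up. Three independently written implementations compute the same result, and a voter masks a fault in any single one:

      ```python
      # Hypothetical sketch of triple modular redundancy (TMR): run three
      # independently written implementations of the same computation and take
      # a majority vote, so a bug or fault in any single module is masked by
      # the other two.
      from collections import Counter

      def tmr(module_a, module_b, module_c, x):
          """Majority-vote the three module outputs; fail loudly if all disagree."""
          results = [module_a(x), module_b(x), module_c(x)]
          value, votes = Counter(results).most_common(1)[0]
          if votes < 2:
              raise RuntimeError("no majority: all three modules disagree")
          return value

      # Example: 'buggy' is wrong for x == 3, but the voter masks the fault.
      ok = lambda x: x * 2
      buggy = lambda x: -1 if x == 3 else x * 2
      print(tmr(ok, buggy, ok, 3))  # → 6
      ```

      In real systems the redundant modules are typically developed by independent teams (sometimes on independent hardware) precisely so their bugs are unlikely to coincide on the same input.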

  • Then we can't do it. Cost is a requirement.

    • Cost is a parameter subject to engineering tradeoffs, just like performance, feature sets, and implementation time.

      Security and reliability are also parameters that exist on a sliding scale; the industry has simply chosen to slide the "cost" parameter all the way to one end of the spectrum. As a result, the number of bugs and hacks observed are far enough from the desired value of zero that the true requirements for those parameters cannot honestly be said to be zero.

      3 replies →

    • The question was not whether it was possible within price boundary X, but whether it was possible at all. There is a difference; please don't conflate possibility with feasibility.

    • Is having problematic features that cause problems also a requirement?

      The answer to that question will reveal whether someone is an engineer or an electrician/plumber/code monkey.

      In virtually every other engineering discipline engineers have a very prominent seat at the table, and the opposite is only true in very corrupt situations.

      1 reply →

    • Also people keep insisting on using unsafe languages like C.

      It depends on exactly what you are doing, but there are many languages that are efficient to develop in, if less efficient to execute, like Java, JavaScript, and Python, which are better in many respects, and other languages that are less efficient to develop in but more efficient to run, like Rust. So at the very least it is a trilemma and not a dilemma.

      8 replies →

I think this discussion distracts a bit from the main point.

The main point is that there are super widespread software systems in use that we know aren't secure, and we certainly could do better if we (as the industry, as customers, as vendors) really wanted.

A prime example is VPN appliances ("VPN concentrators") to enable remote access to internal company networks. These are pretty much by definition Internet-facing, security-critical appliances. And yet, all such products from big vendors (be they Fortinet, Cisco, Juniper, you name it) had a flood of really embarrassing, high-severity CVEs in the last few years.

That's because most of these products are actually from the 80s or 90s, with some web GUIs slapped on, often dredged through multiple company acquisitions and renames. If you asked a competent software architect to come up with a structure and development process that are much less prone to security bugs, they'd suggest something very different, more expensive to build, but also much more secure.

It's really a matter of incentives. Just imagine a world where purchasing decisions were made to optimize for actual security. Imagine a world where software vendors were much more liable for damage incurred by security incidents. If both came together, we'd spend more money on up-front development / purchase, and less on incident remediation.

> No matter where I look, up and down the stack, across different OSes and tech stacks, there are bugs.

I’m not sure I’d go quite as far as GP, but they did caveat that we often choose not to write software with few bugs. And empirically, that’s pretty true.

The software I’ve written for myself, or where I’ve taken the time to do things better or rewrite parts I wasn’t happy with, has had remarkably few bugs. I have critical software still running—unmodified—at former employers which hasn’t been touched in nearly a decade. Perhaps not totally bug-free, but close enough that the bugs haven’t been noticed or mattered enough to bother pushing a fix and cutting a release.

Personally I think it’s clear we have the tools and capabilities to write software with one or two orders of magnitude fewer bugs than we choose to. If anything, my hope for AI-coded software development is that it drops the marginal cost difference between writing crap and writing good software, rebalancing the economic calculus in favor of quality for once.

  • > I’m not sure I’d go quite as far as GP, but they did caveat that we often choose not to write software with few bugs. And empirically, that’s pretty true.

    Blame PMs for this. Delivering by some arbitrary date on a calendar means that something is getting shipped regardless of quality. Make it functional for 80% of use cases, then we'll fix the remaining bits in later releases. However, that doesn't happen, as the team is assigned new tasks, because new tasks/features are what bring in new users, not fixing existing problems.

    • I don’t disagree, but is the alternative unbounded dev where you write code until it’s perfect? That doesn’t sound like a better business outcome. The trade-off can’t be “take as long as you want”.

      16 replies →

> Do we, really?

Yes. There’s a ton of lessons learned, best practices, etc. We’ve known for decades.

It’s just expensive and difficult. Since end-users seem to have no issue paying for crud, why bother?

>>> often due to factors outside of their control.

That’s the beauty of OSS: the level at which we could write code is far above what the culture / timescales / management usually allow. I recently saw OSS described as akin to (good) journalism for the enterprise, asking why some hidden part of society isn’t doing the minimum (jails, corruption, etc.).

Free software does sooo much better than much in-house software that it is like sunlight.

> > We know how to write software with very few bugs

> Do we, really? Because a week doesn’t go by when I don’t run into bugs of some sort.

I mean, we do know how to do it, but we don't, because business needs tend to throw quality under the bus in exchange for almost everything else. Speed of development (especially), but also developer comfort, feature cram, visual refreshes, and so on always trump bug-fixing, so every project ends up with bugs.

I have a few hobby projects which I would stick my neck out and say have no bugs. I know I'm going to get roasted for this claim, but the projects are simple enough in scope, and I'm under no pressure to ever release them publicly, so I was able to prioritize getting them right. No actual business is going to apply this level of polish and care; they all need to cut corners and actually ship, so they have bugs. And no ultra-complex project (even if it's done with love and care) is capable of this either, purely due to its size and number of moving parts.

So, it's not like we don't know how to do it, but that we choose not to for practical reasons.

  • The simplest recipe for writing "almost bug-free" software is:

      1.  Freeze the set of features.
      2.  Continue to pay programmers to polish the software for several years while it is being actively used by many people.
      3.  Resist adding new features or updating the software to feel modern.
    

    If you do that, your program will asymptotically approach zero bugs.

    Of course, your users will complain about missing features, how ugly and ancient your product looks, and how they wish you were more like your buggy competitors.

    And if your users are unhappy, then you probably lose the "used heavily by a lot of people" part that reveals the bugs.

    • There is no system without exploitable breaches, whether technical or social. The bigger point is: who has the incentive to exploit them, how many resources it costs to run an attempt, how many resources do they control, and how many are they ready to throw at attempts.

We do.

The issue is almost always feature management.

Back in the day I was making Flash games, usually a 3-5 week job, with no real QA, and the project would be live for 3-5 months. Every time I was ahead of schedule, someone came along with a brilliant idea to test a few odd things and add a couple of new features that were not discussed prior. Sometimes literally hours before the launch.

Every time, I made the argument that adding one new feature would create two bugs. And almost always I was right about it.

Fast forward and I'm working for BigCo. A few gigs back I was working for a major bank which employed a super efficient and accountable workflow: every release had to consist of business-specific commits, and commits not backed by explicit tickets were not permitted.

This resulted in the team having to literally cheat and lie to smuggle in refactors and optimizations.

Add to that the fact that most enterprise projects start not because the requirements were gathered but because the budget was secured, and you have a recipe for disaster.