Comment by simonw

10 hours ago

> Why use someone's project when you can just have the robot write your own?

I've been thinking about this a bunch recently, and I've realized that the thing I value most in software now isn't robust tests or thorough documentation - an LLM can spit those out in a few minutes. It's usage. I want to use software which other people have used before me. I want them to have encountered the bugs and sharp edges and sanded them down.

Depth of use over the lifetime of an app is a quality all its own that often goes unappreciated. A recurring pattern at $dayjob is that a new manager or director will join a business unit and declare an existing app the most terrible, horrible, no good app they've ever seen, and announce that they're going to fix that. A year and a half later the new app is finally delivered with 80% of the original functionality and a fresh set of bugs. The new dev team sees the surface functionality but misses a lot of the hard-earned nuance the old system accrued over time. This is a pattern that existed long before LLMs.

An LLM most definitely cannot spit out robust tests or thorough documentation. It can spit out some tests or some documentation, but they will not cover the user perspective or edge cases unless those are already documented somewhere. That's verified by both experience and just thinking about it for two seconds.

The sanding down you refer to is what generates those tests and documentation.

  • > but they will not cover the user perspective or edge cases unless those are already documented somewhere

    Are you suggesting that LLMs can't test for people who use screen readers? Keyboard-only users? Slow network requests?

    You're acting like the issues an app faces are so bespoke to the actual app itself (and have absolutely no relation to existing problems in this space) that an LLM couldn't possibly cover it. And it's just patently wrong.

    • I'm not talking about keyboards or screen readers or any sort of input testing, I'm talking about how the software is used in practice.

      If you disagree with that, I think the onus is on you to show me that an LLM could simulate the full context in which a user interfaces with software - which strikes me as a ridiculous claim.

      Feel free to show literally any evidence for this claim.

    • > Are you suggesting that LLMs can't test for people who use screen readers? Keyboard-only users? Slow network requests?

      I don't think it's feasible to simulate the full depth of actual usage, given that (especially in the case of screen readers and the like) there's a great deal of combinatorial depth and context to the problem. Which screen readers, on which operating systems, and which users thereof?

I feel similarly but IIUC I think that doesn’t strictly require an open source development model. I’ve benefited a huge amount from consuming and contributing to open source projects and I’m a bit worried that the “unit economics” changing might break some of the social dynamics upon which the ecosystem is built.

> the thing I value most in software now isn't robust tests or thorough documentation - an LLM can spit those out in a few minutes.

Can it if we stop defining "robust tests" as "a lot of test code lines" and "good documentation" as "lengthy documentation"?

  • I chose my words carefully. "Robust tests" are tests that provide high coverage and aren't flaky. "Thorough documentation" likewise is documentation that describes as much of the code as possible.

    I didn't use the word good.

Yep, I realised the same. No one reads docs or goes through tests. Either way, it's easy to write useless tests, and easy to write useless docs. I don't think most people even read the code. Now the difference is that it has become possible to write useless code.

So it's just the fact that others have already gone through the motions before I did. That's it really. I suppose in commercial settings, this is even more true and perhaps extends to compliance.

> an LLM can spit those out in a few minutes.

It may be able to spit out text that purports to be that, in a few minutes. But for most software, an LLM will not be able to spit out robust tests - let alone useful documentation. (And documentation which just replicates the parameter names and types is thorough...ly useless.)
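To make that last point concrete, here is a toy sketch (the function names are invented for illustration) contrasting a docstring that merely restates the signature with one that records behavior only real usage would have surfaced:

```python
# The "thorough...ly useless" kind: it repeats what the signature
# already tells the reader and nothing more.
def parse_price(text: str) -> float:
    """Parse a price. text: str, the price. Returns: float."""
    return float(text.strip().lstrip("$"))


# Documentation earns its keep when it captures hard-won behavior
# that the signature cannot express.
def parse_price_documented(text: str) -> float:
    """Parse a display price like "$1,299.99" into a float.

    Commas are stripped because locale-formatted prices show up in
    scraped data; a leading "$" is optional. Raises ValueError on
    empty input instead of returning 0.0, because silent zeros once
    hid a pricing bug downstream.
    """
    cleaned = text.strip().lstrip("$").replace(",", "")
    if not cleaned:
        raise ValueError("empty price string")
    return float(cleaned)
```

The second docstring is exactly the kind of thing that only emerges from the "sanding down" mentioned upthread: an LLM asked cold to document `parse_price` will produce the first version, because the comma handling and the empty-input decision exist nowhere but in the team's history.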

I value software that reveals knowledge. The frontier LLMs were trained on all the code that institutions had been keeping to themselves. So they're revealing programming know-how on a scale that just wasn't possible with open source. LLMs are the ultimate Prometheus. Information is more accessible and useful now than it's ever been.

  • > The frontier LLMs were trained on all the code that institutions had been keeping to themselves.

    Lolz! I haven’t encountered “code that institutions had been keeping to themselves” that got even remotely close to OSS in quality.

  • I promise you, "the code that institutions had been keeping to themselves" is not nearly as special or good as you are implying here.

    • True.

      I have worked for several decades in many companies, located in many countries on a few continents, from startups to some of the biggest companies in their fields, so I have seen many proprietary programs.

      On average, proprietary programs are not better than open-source programs but usually worse, because they are reviewed by fewer people and because the programmers who write them are frequently stressed by having to meet unrealistic project timelines.

      Proprietary programs have greater quantity, not greater quality: they are written by a greater number of programmers working on them full-time, while much of the work on open-source projects is done in spare time by people occupied with something else.

      Many proprietary programs can do things which cannot be done by open-source programs, but only because of access to documentation that is kept secret in the hope of preventing competition.

      While lawyers, and other people who do not understand how research and development is really done, put a lot of weight on the so-called "intellectual property" of a company, which they believe to be embodied in things like the source code of proprietary programs or the design files for some hardware, the reality is that I have nowhere seen anything of substantial value in this so-called IP.

      Everywhere, what was really valuable in a company's know-how was not the final implementation that could be read in some source code, but the knowledge of the many other solutions that had been tried before and worked worse or not at all. This knowledge was too frequently not written down in any documentation. Knowing where the dead ends are is a great productivity boost for an experienced team, because any recent graduate could list many alternative ways of solving a problem, but most of them would not be the right choice in the specific circumstances at hand.


    • The claim is also just categorically untrue. The largest source of training data by far is publicly available code on e.g. Github, so it mostly just gives you a way to recycle already-available code, without crediting the author, while allowing you to pretend you own it.
