Comment by fdr

14 hours ago

I use beads quite a bit, but not as steve intended. And definitely the opposite of "Gas Town," where I use the note-taking capability and integration with Git (that is, as something of a glorified Makefile and database) to debug contexts, to close the loop and increase accuracy over time. Nevertheless, it has been useful for large batch runs over my code base: the record has been processing for thirty hours straight while getting something useful, and enough trace data to make further improvements.

Steve has gone "a bit" loopy, in a (so far) self aware manner, but he has some kind of insight into the software engineering process, I think. Yet, I predict beads will break under the weight of no-supervision eventually if he keeps churning it, but some others will pick up where he left off, with more modest goals. He did, to his credit, kill off several generations of project before this one in a similar category.

His latest post is endorsing a crypto exchange because they paid him $50k.

https://steve-yegge.medium.com/bags-and-the-creator-economy-...

  • I’m pro LLM and use them, but crikey: if they’re so good at code why aren’t these people with all the attention, branding, and connections in the world unable to capitalize them?

    I believe Google that uses their internal Gemini trained on their internal infrastructure to generate boiler plate and insights for older, less mature, code in one of the worlds biggest and most complicated anythings, ever. But I don’t see them saying anything to the effect of “neener neener, we’re using markov chains so 10x our stock ‘cause of the otherwise impossible face melting Google Docs 2026.”

    OpenAI is chasing ads, like Reddit, to regurgitate Reddit content. If this stuff is worth the squeeze I need to see the top 10 LLM-fluencers refusing to bend over for $50K. The opposite is on display.

    So hypotheses: Google’s s-tier geniuses and PMs are already expressing the mature optimum application. No silver bullets, more gains to be had ditching bad tech and extraneous vendor entanglements (copilot, 365).

  • That entire article sounds like my friends who think AI is real and keep sending their parents money into crypto scams.

    I think I’ll just develop a drinking problem if this is Gas Town becomes something real in the industry and this kind of person is now one of our thought leaders.

To be fair, he's always been a little loopy. At least, I think this post of his was loopy: https://steve-yegge.blogspot.com/2007/06/that-old-marshmallo...

It was also one of my favorite posts of his and has aged incredibly well as my experience has grown.

  • that's one reason I am less worried about him than some, although, I don't want to say that only to have something bad happen to him, that is, a form of complacency. Just because (say) Boltzmann and Cantor had useful insights along the way didn't mean people shouldn't have been looking to support them.

> but some others will pick up where he left off, with more modest goals

Already happening :-) https://github.com/Dicklesworthstone/beads_rust

  • the main area I'd like to see some departure from beads is to use markdown files (or something) to be able to see the issue context/comments better in a diff generated by git.

    The other area I'd like to see some software engineering thinking that's more open ended is on regression testing: ways of storing or referencing old versions of texts to see if the agent can complete old transformations properly even with a context change that patches up a weakness in a transformation that is desirable. This is tricky as it interacts with something essential in software engineering, the ability to run test suites and responding to the outcome. I don't think we know yet when to apply what fidelity of testing, e.g. one-shot on snippets versus a more realistic test based on git worktrees.

    This is not something you'd want for every context, but a lot of my effort is spent building up prompt fragments to normalize and clean up the code coming out of a model that did some ad-hoc work that meets the test coverage bar, which constrains it decently into having achieved "something." Kind of like a prototype. But often, a lot of ungratifying massaging is required to even cover the annoying but not dangerous tics of the LLM, to bring clarity to where it wrote, well, very bad and unprincipled code...as it does sometimes.

  • I was disappointed to see that this is still 10x the code needed for the feature set and that it still insists on duplicating state into a SQLite index for such minuscule amounts of data.

    I've seen 25-30 similar efforts to make a Beads alternative and they all do this for some reason.