
Comment by mark242

19 days ago

I would argue that it's going to be the opposite. At re:Invent, one of the popular sessions was about creating a trio of SRE agents: one that did nothing but read logs and report errors, one that analyzed and triaged the errors and proposed fixes, and one that did the work and submitted PRs to your repo.

Then, as part of the session, you would artificially introduce a bug into the system and run into it in your browser. You'd see the failure happen in the browser, and looking at CloudWatch logs you'd see the error get logged.

Two minutes later, the SRE agents had the bug fixed and ready to be merged.

"understand how these systems actually function" isn't incompatible with "I didn't write most of this code". Unless you are only ever a single engineer, your career is filled with "I need to debug code I didn't write". What we have seen over the past few months is a gigantic leap in output quality, such that re-prompting happens less and less. Additionally, "after you've written this, document the logic within this markdown file" is extremely useful for your own reference and for future LLM sessions.

AWS is making a huge, huge bet on this being the future of software engineering, and even though they have their weird AWS-ish lock-in for some of the LLM-adjacent practices, it is an extremely compelling vision, and as these nondeterministic tools get more deterministic supporting functions to help their work, the quality is going to approach and probably exceed human coding quality.
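For concreteness, the trio of agents described above can be caricatured in a few lines of plain code. Each "agent" here is just a function standing in for an LLM call, and every name (components, log format, PR text) is made up for illustration; this is not AWS's actual session code:

```python
# Toy sketch of a three-agent SRE pipeline: read logs -> triage -> propose PRs.
# Each "agent" is a plain function standing in for an LLM call.

def log_reader(raw_logs):
    """Agent 1: scan raw log lines and report error events."""
    return [line for line in raw_logs if "ERROR" in line]

def triager(errors):
    """Agent 2: group errors by component as a crude triage report."""
    report = {}
    for err in errors:
        component = err.split()[1]  # log format here: "ERROR <component> <detail>"
        report.setdefault(component, []).append(err)
    return report

def fixer(report):
    """Agent 3: turn each triage entry into a (pretend) PR description."""
    return [f"PR: fix {len(errs)} error(s) in {component}"
            for component, errs in report.items()]

logs = [
    "INFO  checkout-service started",
    "ERROR checkout-service NullPointerException in applyDiscount",
    "ERROR checkout-service NullPointerException in applyDiscount",
    "INFO  cart-service started",
]
prs = fixer(triager(log_reader(logs)))
```

The interesting part of the real demo is, of course, that the middle two stages are LLM calls rather than string matching; the shape of the loop is the same.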

I agree with both you and the GP. Yes, coding is being totally revolutionized by AI, and we don't really know where the ceiling will be (though I'm skeptical we'll reach true AGI any time soon), but I believe there is still an essential element of understanding how computer systems work that is required to leverage AI in a sustainable way.

There is some combination of curiosity about inner workings and precision of thought that has always been essential in becoming a successful engineer. In my very first CS 101 class I remember the professor alluding to two hurdles (pointers and recursion) which a significant portion of the class would not be able to surpass, and they would change majors. Throughout the subsequent decades I saw this pattern again and again with junior engineers, bootcamp grads, etc. There are some people who, no matter how hard they work, can't grok abstraction and unlock a general understanding of computing possibility.

With AI you don't need to know syntax anymore, but to write the right prompts to maintain a system and (crucially) the integrity of its data over time, you still need this understanding. I'm not sure how the AI-native generation of software engineers will develop this without writing code hands-on, but I am confident they will figure it out, because I believe it to be an innate, often pedantic, thirst for understanding that some people have and some don't. This is the essential quality for succeeding in software, both in the past and in the future. Although vibe coding lowers the barrier to entry dramatically, there is a brick wall looming just beyond the toy app/prototype phase for anyone without a technical mindset.

  • I can see why people are skeptical devs can be 10x as productive.

    But something I'd bet money on is that devs are 10x more productive at using these tools.

    • I get it's necessary for investment, but I'd be a lot happier with these tools if we didn't keep making these wild claims, because I'm certainly not seeing 10x the output. When I ask for examples, 90% of the time it's Claude Code (not a beacon of good software anyway, but if nearly everyone is pointing to one example, it tells you that's the best you can probably expect) and 10% weekend projects, which are cool, but not 10x cool. Opus 4.5 was released in Dec 2025; by this point people should be churning out year-long projects in a month, and I certainly haven't seen that.

      I've used them a few times, and they're pretty cool. If it was just sold as that (again, couldn't be, see: trillion dollar investments) I wouldn't have nearly as much of a leg to stand on


    • Like others are saying, AI will accelerate the gap between competent devs and mediocre devs. It is a multiplier. AI cannot replace fundamentals, or substitute for a good helmsman with a rational, detail-oriented mind. Having fundamentals (skill & knowledge) plus using AI will be the cheat code of the next 10 years.

      The only historical analogue of this is perhaps differentiating a good project manager from an excellent one. No matter how advanced, technology will not substitute for competence.

    • At the company I work for, despite pushing widespread adoption, I have seen exactly a zero percent increase in the rate at which major projects get shipped.


    • I view the current tools as more of a multiplier of base skill.

      A 1x engineer may become a 5x engineer, but a -1x will also produce 5x more bad code.


    • > But something I'd bet money on is that devs are 10x more productive at using these tools.

      If this were true, we should be seeing evidence of it by now, either in vastly increased output by companies (and open source projects, and indie game devs, etc), or in really _dramatic_ job losses.

      This is assuming a sensible definition of 'productive'; if you mean 'lines of code' or 'self-assessment', then, eh, maybe, but those aren't useful metrics of productivity.

  • It is tempting to think that we can delegate describing the mental model to AI, but it seems like all of this boils down to humans making bets, and it also seems like the fundamental bets engineers are making are about the formalisms that encode the product and make it valuable.

  • What an awful professor! When I first tried to learn pointers, I didn't get it. I tried again 6 months later and suddenly it clicked. The same thing happened for another guy I was learning with.

    So the professor just gaslit years of students into thinking they were too dumb to get programming, and also left them with the developmental disability of "if you can't figure something out in a few days, you'll never get it".

  • I don’t think there will be an “AI native” generation of developers. AI will be the entity that “groks pointers” and no one else will know it or care what goes on under the hood.

Speaking as someone who has been in SRE/DevOps at every level, from IC to Global Head of a team:

- I 100% believe this is happening and is probably going to be the case in the next 6 months. I've seen Claude and Grok debug issues when they only had half of the relevant evidence (e.g. Given A and B, it's most likely X). It can even debug complex issues between systems using logs, metrics etc. In other words, everything a human would do (and sometimes better).

- The situation described is actually not that different from being a SRE manager. e.g. as you get more senior, you aren't doing the investigations yourself. It's usually your direct reports that are actually looking at the logs etc. You may occasionally get involved for more complex issues or big outages but the direct reports are doing a lot of the heavy lifting.

- All of the above being said, I can imagine errors so weird/complex etc that the LLMs either can't figure it out, don't have the MCP or skill to resolve it or there is some giant technology issue that breaks a lot of stuff. Facebook engineers using angle grinders to get into the data center due to DNS issues comes to mind for the last one.

Which probably means we are all going to start to be more like airline pilots:

- highly trained in debugging AND managing fleets of LLMs

- managing autonomous systems

- kept around "just in case" the LLMs fall over

P.S. I've been very well paid over the years and being a SRE is how I feed my family. I do worry, like many, about how all of this is going to affect that. Sobering stuff.


  • > Which probably means we are all going to start to be more like airline pilots:

    Airline pilots are still employed because of regulations. The industry is heavily regulated, and the regulations move very slowly because of its international cooperative nature. The regulations dictate how many crew members must be on board for each plane type, among various other variables. All airlines have to abide by the rules of the airspace they're flying over to keep flying.

    The airlines, on the other hand, along with the technology producers (Airbus, for example), are pushing to reduce the number of heads in the cockpit. While their recent attempt to get rid of co-pilots in EASA land has failed [1], you can see the amount of pursuit and investment. The industry will continue to force through cost optimization as long as there's no barrier to prevent it. The cases where automation has failed will just be a cost of doing business, since the lives of the deceased are no concern to the company's balance sheet.

    Given the lack of regulation in software, I suspect the industry will continue the cost optimization and eliminate humans in the loop, except in regulated domains.

    [1] - https://www.easa.europa.eu/en/research-projects/emco-sipo-ex... ; while this was not a direct push to get rid of all pilots, it's a stepping stone in that direction.

    • It's crazy how many developers are starry-eyed optimists about all of this, just casually assuming that they'll still be highly-paid, well-respected professionals ("we'll be like pilots, mostly monitoring the autopilot") if this technology doesn't hit a wall in the next year or two, despite lacking any of the legal protections that other professions enjoy.

    • Regulation and strong unions are the only thing holding airlines back from doing what the cruise lines did long ago: importing all of their labor from cheaper countries and paying them trash while working them to the bone.

      In the meantime, captains at legacy airlines are the only ones getting paid well. Everyone else struggles to make ends meet. All while airlines constantly complain that they "can't find enough qualified pilots." Where have I heard this said before...

      Also, every pilot is subject to furloughs, which happen every time economic headwinds blow a little too hard, and switching employers resets their tenure and their pay scales.

Now run that loop 1000 times.

What does the code/system look like?

It is going to be more like evolution (fit to environment) than engineering (fit to purpose).

It will be fascinating to watch nonetheless.
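The evolution framing can be made concrete with a toy model: copy a string through a noisy channel many generations and watch its fidelity to the original decay. This is purely illustrative (a random-mutation loop, nothing to do with how any real model behaves):

```python
# Toy "iterated copy" model: each generation randomly corrupts a small
# fraction of characters; errors compound, so fidelity to the original decays.
import random

def noisy_copy(text, error_rate, rng):
    """Replace each character with a random one with probability error_rate."""
    alphabet = "abcdefghijklmnopqrstuvwxyz "
    return "".join(rng.choice(alphabet) if rng.random() < error_rate else ch
                   for ch in text)

def fidelity(a, b):
    """Fraction of positions where the two equal-length strings agree."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

rng = random.Random(0)  # fixed seed so the run is repeatable
original = "the quick brown fox jumps over the lazy dog"
copy = original
trace = {}
for generation in range(1, 101):
    copy = noisy_copy(copy, 0.02, rng)
    if generation in (1, 100):
        trace[generation] = fidelity(original, copy)
# trace[100] ends up well below trace[1]: mutations accumulate generation
# over generation, with no selection pressure pulling back to the original
```

The "fit to environment" part is whatever selection pressure you add to the loop, tests, users, an error budget; without it, this is all you get.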

  • It'll probably look like the code version of this: an image run through an LLM 101 times with the directive to create a replica of the input image: https://www.reddit.com/r/ChatGPT/comments/1kbj71z/i_tried_th... Despite being provided with explicit instructions, well...

    People are still wrongly attributing a mind to something that is essentially mindless.

  • "evolution (fit to environment) than engineering (fit to purpose)."

    Oh, I absolutely love this lens.

  • Sure, if all you ask it to do is fix bugs. You can also ask it to work on code health things like better organization, better testing, finding interesting invariants and enforcing them, and so on.

    It's up to you what you want to prioritize.

    • I have some healthy skepticism on this claim though. Maybe, but there will be a point of diminishing returns where these refactors introduce more problems than they solve and just cause more AI spending.

      Code is always a liability. More code just means more problems. There has never been a code generating tool that was any good. If you can have a tool generate the code, it means you can write something on a higher level of abstraction that would not need that code to begin with.

      AI can be used to write this better quality / higher level code. That's the interesting part to me. Not churning out massive amounts of code, that's a mistake.


    • I agree, but want to interject that "code organization" won't matter for long.

      Programming languages were made for people. I'm old enough to have programmed in Z80 and 8086 assembler. I've been through plenty of programming languages over my career.

      But once building systems becomes prompting an agent to build a flow that reads these two types of Excel files, cleans them, filters them, merges them, and outputs the result for the web (oh, and make it interactive and highly available)...

      Code won't matter. You'll have other agents that check that the system is built right, agents that test the functionality, and agents that ask for and propose functionality and ideas.

      Most likely the Programming language will become similar to the old Telegraph texts (telegrams) which were heavily optimized for word/token count. They will be optimized to be LLM grokable instead of human grokable.

      It's going to be amazing.


    • You're assuming that scrum/agile/management won't take this over?

      What stakeholder is prioritizing any of those things and paying for it out of their budget?

      Code improvement projects are the White Whale of software engineering: obsessed over, but rarely worth it from a business point of view.


    • My unpopular opinion is AI sucks at writing tests. Like, really sucks. It can churn out a lot of them, but they're shitty.

      Actually writing good tests that exercise the behavior you want, guard against regressions, and aren't overfitted to your code is pretty difficult, really. You need to understand both the function and the structure to do it.

      Even for hobby projects, it's not great. I'm learning asyncio by writing a matrix scraper, and writing good functional tests as you go is worth it to make sure you actually do understand the concepts.

    • And what happens when these different objectives conflict or diverge? Will it be able to figure out the appropriate trade-offs, live with the results, and go meta to rethink the approach, or simply delude itself? We would definitely lose these skills if it continues like this.
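On the testing point a few comments up: one way to keep tests behavioral rather than overfitted is to assert on what a function returns under an injected fake dependency, not on how it schedules its work. A minimal asyncio sketch; all names here are hypothetical, not from any particular project:

```python
# Behavioral test sketch: the scraper takes its fetcher as a parameter,
# so the test can inject a fake and assert only on observable results.
import asyncio

async def scrape_all(urls, fetch):
    """Fetch all URLs concurrently; return {url: body}, skipping failures."""
    async def one(url):
        try:
            return url, await fetch(url)
        except OSError:
            return url, None  # failed fetches are dropped, not fatal
    pairs = await asyncio.gather(*(one(u) for u in urls))
    return {url: body for url, body in pairs if body is not None}

# The "test double": a fake fetcher with one URL that always fails.
async def fake_fetch(url):
    if url.endswith("/bad"):
        raise OSError("boom")
    return f"<html>{url}</html>"

result = asyncio.run(
    scrape_all(["http://x/a", "http://x/bad", "http://x/b"], fake_fetch)
)
```

A test written against `result` survives a rewrite of the internals (sequential loop, task group, semaphore-limited pool), which is exactly the regression-guarding property the comment is after.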

> Unless you are only ever a single engineer, your career is filled with "I need to debug code I didn't write".

That's the vast majority of my job and I've yet to find a way to have LLMs not be almost but not entirely useless at helping me with it.

(also, it's filled with that even when you are a single engineer)

  • And even if you are the single engineer, I'll be honest, it might as well have been somebody else that wrote the code if I have to go back to something I did seven years ago and unearth wtf.

  • I hope you realize that means your position is in danger.

    • It would be in danger if LLMs could actually do that for me, but they're still very far from it and they progress slowly. One day I could start worrying, but it's not today.

    • Eh? Of all the things that LLMs are not very good at (a long list), they are particularly not good at debugging.

It's nice that AI can fix bugs fast, but it's better to not even have bugs in the first place. By using someone else's battle tested code (like a framework) you can at least avoid the bugs they've already encountered and fixed.

  • I spent Dry January working on a new coding project and since all my nerd friends have been telling me to try to code with LLM's I gave it a shot and signed up to Google Gemini...

    All I can say is "holy shit, I'm a believer." I've probably got close to a year's worth of coding done in a month and a half.

    Busy work that would have taken me a day to look up, figure out, and write -- boring shit like matplotlib illustrations -- they are trivial now.

    Things that are ideas I'm not sure how to implement ("what are some different ways to do this weird thing?") that I would have spent a week on trying to figure out a reasonable approach? No, it's basically got two or three decent ideas right away, even if they're not perfect. There was one vectorization approach I would never have thought of that I'm now using.

    Is the LLM wrong? Yes, all the damn time! Do I need to, you know, actually do a code review when I'm implementing ideas? Very much yes! Do I get into a back-and-forth battle with the LLM when it starts spitting out nonsense, shut the chat down, and start over with a newly primed window? Yes, about once every couple of days.

    It's still absolutely incredible. I've been a skeptic for a very long time. I studied philosophy, and the conceptions people have of language and Truth get completely garbled by an LLM that isn't really a mind that can think in the way we do. That said, holy shit it can do an absolute ton of busy work.

    • What kind of project / prompts - what's working for you?

      I spent a good 20 years in the software world but have been away doing other things professionally for a couple of years. Recently I was in the same place as you, with a new project and wanting to try it out. So I start with a generic Django project in VSCode, use the agent mode, and… what a waste of time. The auto-complete suggestions it makes are frequently wrong, and the actions it takes in response to my prompts tend to make a mess on the order of a junior developer.

      I keep trying to figure out what I'm doing wrong, as I'm prompting pretty simple concepts at it - if you know Django, imagine concepts like "add the foo module to settings.py" or "run the check command and diagnose why the foo app isn't registered correctly". Before you know it, it's spiraling out of control with changes it thinks it is making, all of which are hallucinations.


    • I'm (mostly) a believer too, and I think AI makes using and improving these existing frameworks and libraries even easier.

      You mentioned matplotlib: why does it make sense to pay for a bunch of AI agents to re-invent what matplotlib does and fix bugs that matplotlib has already fixed, instead of just having AI agents write code that uses it?


  • Because frameworks don’t have bugs? Or unpredictable dependency interactions?

    This is generous, to say the least.

    • Well maintained, popular frameworks have github issues that frequently get resolved with newly patched versions of the framework. Sometimes bugs get fixed that you didn't even run into yet so everybody benefits.

      Will your bespoke LLM code have that? Every issue will actually be an issue in production experienced by your customers, that will have to be identified (better have good logging and instrumentation), and fixed in your codebase.

    • Frameworks that are (relatively) buggy and slow to address bugs lose popularity, to the point that people will spontaneously create alternatives. This happened too many times.

  • > better to not have bugs in the first place

    you must have never worked on any software project ever

    • Have you? Then you know that the number of defects scales linearly with the amount of code. As things stand, models write a lot more code than a skilled human would for a given requirement.

  • In practice using someone else’s framework means you’re accepting the risk of the thousands of bugs in the framework that have no relevance to your business use case and will never be fixed.

    • Yet people still use frameworks, before and after the age of LLMs. Frameworks must have done something right, I guess. Otherwise everyone will vibe their own little React in the codebase.

> Unless you are only ever a single engineer, your career is filled with "I need to debug code I didn't write".

True, but there's usually at least one person who knows that particular part of the system that you need to touch, and if there isn't, you'll spend a lot of time fixing that bug and become that person.

The bet you're describing is that the AI will be the expert, and if it can be that, why couldn't it also be the expert at understanding the users' needs so that no one is needed anywhere in the loop?

What I don't understand about a vision where AI is able to replace humans at some (complicated) part of the entire industrial stack is why does it stop at a particular point? What makes us think that it can replace programmers and architects - jobs that require a rather sophisticated combination of inductive and deductive reasoning - but not the PMs, managers, and even the users?

Steve Yegge recently wrote about an exponential growth in AI capabilities. But every exponential growth has to plateau at some point, and the problem with exponential growth is that if your prediction about when that plateau happens is off by a little, the value at that point could be different from your prediction by a lot (in either direction). That means that it's very hard to predict where we'll "end up" (i.e. where the plateau will be). The prediction that AI will be able to automate nearly all of the technical aspects of programming yet little beyond them seems as unlikely to me as any arbitrary point. It's at least as likely that we'll end up well below or well above that point.
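To put toy numbers on that sensitivity: under exponential growth, a small error in when the plateau hits becomes a large error in where capability ends up. Assuming a simple doubling model (all numbers invented for illustration):

```python
# Toy doubling model: "capability" doubles once per unit of time.
def capability(t, doubling_time=1.0):
    return 2 ** (t / doubling_time)

# Two forecasts that disagree by only two doubling times on when growth
# stops end up predicting final levels that differ by a factor of four.
plateau_a = capability(10)  # plateau forecast at t = 10
plateau_b = capability(12)  # plateau forecast at t = 12
ratio = plateau_b / plateau_a
```

The multiplicative error grows with how far off the timing forecast is, which is the point above: being "off by a little" on the plateau date means being off by a lot on the final value.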

  • > Steve Yegge recently wrote about an exponential growth in AI capabilities

    I'm not sure that the current growth rate is exponential, but the problem is that it doesn't need to be exponential. It should have been obvious the moment ChatGPT and stable diffusion-based systems were released that continued, linear progress of these models was going to cause massive disruption eventually (in a matter of years).

That doesn't change the fact that the submission is basically repeating the LISP curse. Best case scenario: you end up with a one-off framework and only you know how it works. The post you're replying to points out why this is a bad idea.

It doesn't matter if you don't use 90% of a framework as the submission bemoans. When everyone uses an identical API, but in different situations, you find lots of different problems that way. Your framework, and its users become a sort of BORG. When one of the framework users discovers a problem, it's fixed and propagated out before it can even be a problem for the rest of the BORG.

That's not true of your LISP-curse, one-off, custom bespoke framework. You will repeat all the problems that all the other custom bespoke frameworks encountered. When they fixed their problem, they didn't fix it for you. You will find those problems over and over again. This is why free software dominates over proprietary software. The biggest problem in software is not writing the software, it's maintaining it. Free software shares the maintenance burden, so everyone can benefit. You bear the whole maintenance burden with your custom, one-off vibe-coded solutions.

I think back on the ten-plus years I spent doing SRE consulting, and the thing is, finding the problems and identifying solutions (the technical part of the work) was such a small part of the actual work. So often I would go to work with a client and discover that they already knew the problem, they just didn't believe it; my job was often about the psychology of the organization more than the technical knowledge.

So you might say, "Great, so the agent will automatically fix the problem that the organization previously misidentified." That sounds great right up until it starts dreaming… It's not to say there aren't places for these agents, but I suspect it will ultimately be like any other technology we use, where it becomes part of the toolkit, not the whole.

> I would argue that it's going to be the opposite. At re:Invent, one of the popular sessions was in creating a trio of SRE agents, one of which did nothing but read logs and report errors, one of which did analysis of the errors and triaged and proposed fixes, and one to do the work and submit PRs to your repo.

If you manage a code base this way at your company, sooner or later you will hit a wall. What happens when the AI can't fix an important bug or is unable to add a very important feature? Now you are stuck with a big fat dirty pile of code that no human can figure out, because it wasn't written by a human and was never designed to be understood by one in the first place.

  • I treat code quality, and readability, as one of the goals. The LLM can help with this and refactor code much quicker than a human. If I think the code is getting too complex I change over to architecture review and refactoring until I am happy with it.

  • What happens when humans can’t fix a bug or build an important feature? That is a pretty common scenario, that doesn’t result in the doomsday you imply.

    • There will always be bugs you can't fix; that doesn't mean we should embrace having orders of magnitude more of them. And it's not just about bugs, it's also about adding new features.

      This is tech debt on steroids. You are building an entire code base that no one can read or understand, and praying that the LLM won't fuck up too much. And when it does, no one in the company knows how to deal with it other than by throwing more LLM tokens at it and praying it works.

      As I said earlier, using pure AI agents will work for a while. But when it doesn't you are fucked.

Automatically solving software application bugs is one thing, recovering stateful business process disasters and data corruption is entirely another thing.

Customer A is in a totally unknown database state due to a vibe-coded bug. Great, the bug is fixed now, but you're still f-ed.

  • > Automatically solving software application bugs

    The other issue is "fixing" false positives; I've seen it before with some AI tools: they convince you it's a bug, and it looks legit and passes the tests, but later on something doesn't quite work right anymore and you have to triage and revert it. It can be a real time sink.

> "understand how these systems actually function" isn't incompatible with "I didn't write most of this code"

Except now you have code you didn't write and patches you didn't write either. Your "colleague" also has no long-term memory.

> one to do the work and submit PRs to your repo

Have we not seen loads of examples of terrible AI-generated PRs every week on this site?

  • Because nobody posts the good ones. They're boring, correct, you merge them and move on to the next one. It's like there's a murder in the news every day but generally we're still all fine.

    Don't assume that when people make fun of some examples that there aren't thousands more that nobody cares to write about.

An AI thing demoed well? Well, colour me shocked.

Amazon, which employs many thousands of SREs (or, well, pseudo-SREs; AIUI it's not quite the conventional SRE role), is presumably just doing so for charitable purposes, if they are so easy to replace with magic robots.