
Comment by lemming

7 days ago

While this is true, I definitely find that the style of the work changes a lot. It becomes much more managerial, and less technical. I feel much more like a mix of project and people manager, but without the people. I feel like the jury is still out on whether I’m overall more productive, but I do feel like I have less fun.

My lessons so far:

1. Less fun.

2. A lot more "review fatigue".

3. Tons of excess code I'd never put in there in the first place.

4. Frustration with agents being too optimistic, which over time verges on the ludicrous ("Task #3 has been completed successfully, with 98% of tests failing. [:useless_emojis:]").

5. Frustration with agents routinely going down a rabbit hole or around in circles, and the effort needed to set that straight (Anthropic plainly advises starting from scratch in such cases - sound advice, but it makes me feel like I just lost the last 5 hours of my life without even learning anything new).

I stopped using agents and now use LLMs very sparingly (e.g. for review - they sometimes catch details I missed and occasionally have an interesting solution), but I'm enjoying my work so much more without them.

  • I think one of the tricks is to just stop using the agent as soon as you see signs of funny business. If it starts BSing me about failing tests, I turn it off immediately and git reset (maybe after taking a quick peek) - something like the sketch below.
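    A minimal sketch of that bail-out, assuming the agent's changes are still uncommitted in your working tree (the exact commands are just one way to do it):

        git diff          # the quick peek: what did it actually change?
        git stash         # keep a copy around, just in case
        # or, if it's beyond saving:
        git reset --hard  # discard all uncommitted changes to tracked files
        git clean -fd     # and delete any new untracked files it created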

    • Yeah, I make maybe two or three attempts at getting it to write a plan that it can follow coherently. But after that I pull the escape hatch and *gasp* program by hand.

      I've made the mistake of doubling down after a few initial failures to solve an issue: trying to write the super duper comprehensive, highly detailed, awesome plan that it will finally be able to implement correctly. But it just gets worse the more I try, because it fundamentally does not understand what is going on, so it will inevitably find an opportunity to go massively off the rails - and the further down you lead it, the more impressive the derailment will be.

      My experience is that going around in endless circles with the model is just a waste of time: you could have done it yourself in the hours you wasted.

  • One thing I don't get: if you spend much of your time reviewing, you're just reading - you're not actually doing anything, you're passive in the activity of code production. By extension you will get worse at knowing what a good standard of code is, and so get worse at reviewing code.

    I’m not a SWE so I have no interests to protect by criticising what is going on.

    • In my DJing years I learned that it's best to provide a hot signal and trim the volume, rather than trying to amplify it later, because you end up amplifying noise. Max out the mixer volume and put a compressor after it (plus a limiter to protect the speaker setup - it will sound awful if hit, but it won't damage your rig, and it will flag clueless bozos loud and clear). Don't try to raise the level after it leaves the origin.

      It seems to me that adding noise to the process and trying to cut it out later is a self-defeating proposition. Or as Deming put it (paraphrasing): you can't QC quality into a bad process.

      I can see how "move fast and break things" seems better, but I will live and die by the opposite: "move slow and fix things". There's much, much more to life than maximizing short-term returns over a one-dimensional, naïve utilitarian take on value.

  • I reset context probably every 5-10 minutes, if not more frequently, and commit even more often than that. If you're going 5 hours between commits or context resets, I'm not surprised you're getting bad results. If you ever see "summarizing" in Copilot, for example, that means you went way too far in that context window. LLMs get increasingly inaccurate and confused as the context window fills up.

    Other things, like having it pull webpages in, will totally blow away your context. It's better to use a separate context just to pull a webpage down and summarize it into markdown, and then reset.

    • The 'best' trick I learned from someone over here when working with Claude Code is to very regularly go back a few steps in your context (esc esc -> pick a message a few steps up) and say something like "yeah, I already did this myself, now continue and do Y".

      It helps keep the context clean while still keeping the initial context I provided (usually with documentation and the initial plan setup) at its core.

      Now that you say this: I did notice webpages blowing up the context, but didn't think much of it yet. Maybe there's some improvement to be found here using a subagent? I'm not a big fan of subagents (I didn't get proper results out of them in my initial experiments), but adding a 'web researcher' subagent that summarizes to a concise markdown file could help - something like the sketch below.
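      For what it's worth, a hypothetical sketch of what that could look like as a Claude Code subagent definition (a markdown file with YAML frontmatter under .claude/agents/, e.g. .claude/agents/web-researcher.md - the name, tool list, and prompt here are made up):

          ---
          name: web-researcher
          description: Fetch web pages and return a concise markdown summary, keeping raw page content out of the main context.
          tools: WebFetch, WebSearch
          ---

          Given a URL or topic, fetch the relevant pages and write a concise
          markdown summary (key points, notable quotes, links). Return only
          the summary, never the raw page content.

      The main agent then only ever sees the summary, which is the same "separate context for webpages" trick, just automated.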


  • Regarding #3, I feel it's related to this idea: we can build a wood-frame house with 2x4s or with toothpicks. AI-directed and AI-generated code today tends to build things overly complex, with more pieces than necessary. I feel like an angry foreman, yelling at the AI to fix this, change that, etc. I spend more time and energy supervising the AI while getting a sloppier end result.

    • Thankfully, yelling like an angry foreman is more effective on LLMs than on people.

      > Get your fucking act together, and stop with the bullshit comments, shipping unfinished code, and towering mess of abstractions. I've seen you code properly before. You're an expert for God's sake. One more mistake, and you're fired. Fix it, now!


Yeah exactly, it changes the job from programmer to (technical) project manager, which is both more proactive (writing specifications) and more reactive (responding to an agent finishing). The 'sprinting' remark is apt, because if your agents are not working, you need to act. And it's well established that a manager shouldn't micromanage; that leads to burnout and the like. But that's why software engineers will remain relevant: managers need someone to rely on who can handle the nitty-gritty details of what they ask for.

I also think that managing a coding agent isn't like managing a person. A person is creative; they will come up with ideas that challenge whatever idea you have, and that usually makes the project better. A coding agent never challenges you, it mostly just does whatever you want, and you don't get the kind of intellectual person-to-person engagement that makes working on teams fun. So it kind of isolates you.

And I think the primary reason all this happens is that marketing people have decided to call these coding agents "Artificial Intelligence" instead of "Dev Tools". Instead of calling it "Security" they call it "AI Alignment". Instead of "data schema" or "spec sheet" they call it "managing the AI context". So now we are all biased to see these things as some kind of entity we can treat like a colleague, and we all bought this idea because the tool can chat with you. But it isn't a colleague. It doesn't think or feel, and it doesn't provide intellectual engagement; it is simply a lossy, noisy tool for translating human language into computer language, whether that's Python or machine code.

  • Have you used SOTA models to code in the last 2 months or so? This reads like someone who gave up a year ago and based their impressions on GPT-3.

    AI can absolutely be creative. You just have to engage it that way. The article itself talks about that. You don't just say "hey AI, go write this code." You write a spec along with the AI. You tell it which parts are open to its interpretation, whether you want it to be creative or to follow common practices, what level of abstraction is appropriate, etc.

    If all you do is give it directions then it just follows the directions.

    Also, context doesn't have much to do with a data schema. If anything, it's more like a document database with no schema: a collection of tokens the model refers back to. "Schema" implies structured data with semantic meaning and hierarchies or relationships. That might exist as an emergent property, but if I just had a folder full of PDFs, I wouldn't consider that a schema. That's roughly what context is like.