Comment by jasonthorsness
5 months ago
Half of the work is specification and iteration. I think there’s a focus on full SWE replacement because it’s sensational, but we’ll more likely end up with SWEs able to focus on the less patterned or ambiguous work, made far more productive by the LLM handling subtasks efficiently. I don’t see how full SWE replacement can happen unless non-SWE people using LLMs become technical enough to get what they need out of them, in which case they’ve probably just become SWEs anyway.
> unless non-SWE people using LLMs become technical enough to get what they need out of them
Non-SWE person here. In the past year I've been able to use LLMs to do several tasks for which I previously would have paid a freelancer on Fiverr.
The most complex one, done last spring, involved writing a Python program that I ran on Google Colab to grab the OCR transcriptions of dozens of 19th-century books off the Internet Archive, send the transcriptions to Gemini 1.5, and collect Gemini's five-paragraph summary of each book.
If I had posted the job to Fiverr, I would have been willing to pay several hundred dollars for it. Instead, I was able to do it all myself with no knowledge of Python or previous experience with Google Colab. All it cost was my subscription to ChatGPT Plus (which I would have had anyway) and a few dollars of API usage.
I didn't put any full-time SWEs out of work, but I did take one job away from a Fiverr freelancer.
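For scale, the whole pipeline described above fits in a few dozen lines. A minimal sketch, assuming the Internet Archive's common `<id>_djvu.txt` naming for OCR text (all identifiers here are illustrative, not the commenter's actual code, and the LLM call is abstracted into a callable):

```python
import urllib.request

# Internet Archive items usually expose plain-text OCR as "<id>_djvu.txt";
# treat that naming as an assumption, not a guarantee for every item.
IA_TEXT_URL = "https://archive.org/download/{item}/{item}_djvu.txt"

def fetch_ocr_text(item_id: str) -> str:
    """Download the OCR transcription for one Internet Archive item."""
    with urllib.request.urlopen(IA_TEXT_URL.format(item=item_id), timeout=60) as resp:
        return resp.read().decode("utf-8", errors="replace")

def summarize(text: str, generate) -> str:
    """Ask an LLM (passed in as a callable prompt -> str) for a summary."""
    prompt = ("Summarize the following 19th-century book in exactly "
              "five paragraphs:\n\n" + text[:400_000])  # crude context cap
    return generate(prompt)

def run(item_ids, generate):
    """Fetch and summarize each book, collecting item id -> summary."""
    return {item: summarize(fetch_ocr_text(item), generate) for item in item_ids}
```

Swapping `generate` between Gemini, ChatGPT, or a local model is a one-line change, which is part of why this kind of glue work was such an easy Fiverr gig to displace.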
> I didn't put any full-time SWEs out of work, but I did take one job away from a Fiverr freelancer.
I think this is the nuance most miss when they think about how AI models will displace work.
Most seem to think “if it can’t fully replace a SWE then it’s not going to happen”
When in reality, it starts by lowering the threshold for someone who’s technical but not a SWE to jump in and do the work themselves. Or it makes an existing engineer more efficient. The hours shaved off the many tasks that would otherwise have gone to an engineer eventually add up to a full-time engineer’s worth of work. And if it’s a Fiverr dev whose work you eliminated, that dev will eventually go after the work that remains, putting supply pressure on other devs.
It’s the same mistake many had about self driving cars not happening because they couldn’t handle every road. No, they just need to start with 1 road, master that, and then keep expanding to more roads. Until they can do all of SF, and then more and more cities
Entirely possible. Have you got any numbers and real world examples? Growth? Profits? Actual quantified productivity gains?
The nuance your 'gotcha' scenario misses is that displacing Fiverr gigs, speeding up small side projects, making scripts for non-SWEs, creating boilerplate, etc. is not the trillions of dollars of disruption that is needed by now.
This is a good anecdote but most software engineering is not scripting. It’s getting waist (or neck) deep in a large codebase and many intricacies.
That being said I’m very bullish on AI being able to handle more and more of this very soon. Cursor definitely does a great job giving us a taste of cross codebase understanding.
Seconded. Zed makes it trivial to provide entire codebases as context to Claude 3.5 Sonnet. That particular model has felt as good as a junior developer when given small, focused tasks. A year ago, I wouldn’t have imagined that my current use of LLMs was even possible.
1 reply →
> This is a good anecdote but most software engineering is not scripting. It’s getting waist (or neck) deep in a large codebase and many intricacies.
The agent I'm working on (RA.Aid) handles this by crawling and researching the codebase before doing any work. I ended up making the first version precisely because I was working on a larger monorepo project with lots of files, backend, api layer, app, etc.
So I think LLMs can do it, but only if techniques are used to let them home in on the specific information in a codebase that is relevant to a particular change.
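One way to picture that homing-in step (a toy sketch under my own assumptions, not RA.Aid's actual implementation): rank files by how often they mention the task's key terms, and hand only the top hits to the model as context.

```python
from pathlib import Path

def relevant_files(root: str, query: str, top_n: int = 5):
    """Score each Python file by how many query terms it contains,
    and return the paths of the top-scoring files."""
    terms = [t.lower() for t in query.split()]
    scored = []
    for path in Path(root).rglob("*.py"):
        try:
            text = path.read_text(errors="ignore").lower()
        except OSError:
            continue  # unreadable file: skip rather than abort the crawl
        score = sum(text.count(t) for t in terms)
        if score:
            scored.append((score, str(path)))
    return [p for _, p in sorted(scored, reverse=True)[:top_n]]
```

Real agents use embeddings, ASTs, or dependency graphs rather than raw term counts, but the shape is the same: narrow a huge monorepo down to the handful of files the change actually touches before the model sees anything.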
If the goal is to get something to run correctly roughly once with some known data or input, then that's fine. Actual software development aims to run under 100% of circumstances, and LLMs are essentially cargo culting the development process and entrusting an automation that is unreliable to do mundane tasks. Sadly the quality of software will keep going down, perhaps even faster.
Stop with the realism, one-off scripts are going to give trillions in ROI any day now. Personally I could easily chip in maybe a million a month in subscription fees, because the boilerplate code I write once in a blue moon has sped up infinitely and I will cash out in profits any day now.
> I didn't put any full-time SWEs out of work, but I did take one job away from a Fiverr freelancer.
Who would use a freelancer anyway these days when there are LLMs? It will be interesting when Fiverr adds non-human freelancers, something similar to algorithmic traders. Passive income.
IOW, LLMs make programming somewhat higher-level, similar to what new programming languages did in the past: either via code generation from natural language (the main use case right now?), or by interpreting a "program" written in natural language ("sum all the numbers in the 3rd column of this CSV").
The latter case enables more people to program to a certain extent, similar to what spreadsheets did, while we still need full SWEs in the first case, as you pointed out.
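To make the latter case concrete: the natural-language CSV "program" above corresponds to only a few lines of real code, which the LLM either interprets directly or generates for you (a hypothetical sketch, names my own):

```python
import csv
import io

def sum_third_column(csv_text: str) -> float:
    """Sum the numeric values in the third column of a CSV string,
    skipping rows that have fewer than three columns."""
    reader = csv.reader(io.StringIO(csv_text))
    return sum(float(row[2]) for row in reader if len(row) >= 3)
```

The spreadsheet analogy holds: the user states the intent, and the mechanics (parsing, indexing, type conversion) stay hidden.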
Everyone is a typist now, so I don't think it is farfetched that everyone is a SWE in the future.
Very few people are typists.
Most people can use a keyboard, but the majority of non-technical people type far slower than a professional typist.
Another comment here mentions how they used colab while not being a SWE, but that is already miles ahead of what average people do with computers.
There are people who have used computers for decades and wouldn't be able to do a sum in a spreadsheet, nor know that's something spreadsheets can do.
What’s the WPM cutoff to be considered a typist?
5 replies →
If the LLM can’t find me a solution in 3 to 5 tries while I improve the prompt, I fall back to more traditional methods and/or use another model like Gemini.
> in which case they probably have just become SWE anyway
or learn to use something like Bubble
Yeah, I tried Copilot for the first time the other day and it seemed to be able to handle this approach fairly well -- I had to refine the details, but none of it was because of hallucinations or anything like that. I didn't give it a chance to try to handle the high-level objective, but based on past experience, it would have done something pointlessly overwrought at best.
Also, as an aside, re "not a real programmer" salt: If we suppose, as I've been led to believe, that the "true essence" of programming is the ability to granularize instructions and conceptualize data flow like this, and if LLMs remain unsuitable for coding tasks unless the user can do so, this would seem to undermine the idea that someone can only pretend to be a programmer if they use the LLMs.
Anyway, I used Copilot in VSCode to "Fix" this "code" (it advised me that I should "fix" my "code" by . . . implementing it, and then helpfully provided a complete example):