Thesis: Interesting work is less amenable to the use of AI

8 days ago (remark.ing)

LLMs can't really reason, in my opinion (and in the opinion of a lot of researchers). So, being a little harsh here: given that I'm pretty sure these things are trained on vast swaths of open source software, I generally feel that what things like Cursor are doing is best described as "fancy automated plagiarism". If the stuff you're doing can be plagiarized from another source and adapted to your own context, then LLMs are pretty useful (and that does describe a LOT of work), although it feels like a bit of a grey area to me ethically. The good thing about using a library or a plain old Google search is that you can give credit, or at least know that the author is happy with you not giving credit. With whatever Claude or ChatGPT is spitting out, I'm sure you're not going to get in trouble for it, but part of me feels like it's in a really weird area ethically (especially if it's being used to replace jobs).

Anyway, in terms of "interesting" work, if you can't copy it from somewhere else then I don't think LLMs are that helpful, personally. They can still give you small building blocks, but you can't really prompt them to make the thing.

  • What I find a bit annoying is that if you sit in the LLM you never get an intuition about the docs, because you are always asking the LLM. That is nice in some cases, but it prevents discovery in others. There are plenty of moments where I’m reading docs and learn something new about what some library does, or get surprised that it lacks a certain feature. Although the same can be true of talking to an LLM about it. The truth is that I don’t think we really have a good idea of the best kind of human interface for LLMs as a computer access tool.

    • FWIW, I've had ChatGPT suggest things I wasn't aware of. For example, I asked for the cleanest implementation for an ordered task list using SQLAlchemy entities. It gave me an implementation but then suggested I use a feature SQLAlchemy already had built in for this exact use case.

      The SQLAlchemy docs are vast and detailed, so it's not surprising I didn't know about the feature even though I've spent plenty of time in those docs.

A Danish audio newspaper host / podcaster reached the exact opposite conclusion when he used ChatGPT to write the manuscript for one of his episodes. He ended up spending as much time as he usually does because he had to fact check everything that the LLM came up with. Spoiler: It made up a lot of stuff despite it being very clear in the prompt, that it should not do so. To him, it was the most fun part, that is, writing the manuscript, that the chatbot could help him with. His conclusion about artificial intelligence was this:

“We thought we were getting an accountant, but we got a poet.”

Frederik Kulager: Jeg fik ChatGPT til at skrive dette afsnit, og testede, om min chefredaktør ville opdage det ("I got ChatGPT to write this episode, and tested whether my editor-in-chief would notice"). https://open.spotify.com/episode/22HBze1k55lFnnsLtRlEu1?si=h...

  • > It made up a lot of stuff despite it being very clear in the prompt, that it should not do so.

    LLMs are not sentient. They are designed to make stuff up based on probability.

    • I love this turn of phrase. It quite nicely evokes the difference between how the reader thinks vs how the LLM does.

      It also invites reflections on what “sentience” means. In my experience — make of it what you will — correct fact retrieval isn’t really necessary or sufficient for there to be a lived, first-person experience.

    • Making stuff up is not actually an issue. What matters is how you present it. If I was less sure about this I would write: Making stuff up might not be an issue. It could be that how you present it is more important. Even less sure: Perhaps it would help if it didn't sound equally confident about everything?

    • Why would sentience be required for logically sound reasoning (or the reverse, for that matter)?

  • It's not the exact opposite*, the author said that if you're doing boilerplate _code_ it's probably fine.

    The thing is that since it can't think, it's absolutely useless when it comes to things that haven't been done before, because if you are creating something new, the software won't have had any chance to train on what you are doing.

    So if you are in a situation in which it is a good idea to create a new DSL for your problem **, then the autocruise control magic won't work because it's a new language.

    Now if you're just mashing out propaganda like some brainwashed Soviet apparatchik propagandist, maybe it helps. So maybe people who write predictable slop like this Guardian article (https://archive.is/6hrKo) would be really grateful that their computer has a cruise control for their political spam.

    *) if that's what you meant

    **) which you statistically speaking might not want to do, but this is about actually interesting work where it's more likely to happen

    • In a world where the AI can understand your function library near flawlessly and compose it into all sorts of things, why would you put the effort into a DSL that humans will have to learn and the AI will trip over? This is a dead pattern.

  • As a writer I find his take appalling and incomprehensible. So, apparently not all writers agree that writing with AI is fun. To me, it’s a sickening violation of integrity.

    • Yeah, if I were their reader, I'd most likely never read anything from them again, since nothing's stopping them from doing away with integrity altogether and just stitching together a bunch of scripts ('agents') into an LLM slop pipeline.

      It's so weird how people use LLMs to automate the most important and rewarding parts of the creative process. I get that companies have no clue how to market the things, but it really shows a lack of imagination and self-awareness when a 'creative' repackages slop for their audience and calls it 'fun'.

I have gotten much more value out of AI tools by focusing on the process and not the product. By this I mean that I treat it as a loosely-defined brainstorming tool that expands my “zone of knowledge”, and not as a way to create some particular thing.

In this way, I am infinitely more tolerant of minor problems in the output, because I’m not using the tool to create a specific output, I’m using it to enhance the thing I’m making myself.

To be more concrete: let’s say I’m writing a book about a novel philosophical concept. I don’t use the AI to actually write the book itself, but to research thinkers/works that are similar, critique my arguments, make suggestions on topics to cover, etc. It functions more as a researcher and editor, not a writer – and in that sense it is extremely useful.

  • I think it's a U-shaped utility curve where abstract planning is on one side (your comment) and the chore implementation is on the other.

    Your role is between the two: deciding on the architecture, writing the top-level types, deciding on the concrete system design.

    And then AI tools help you zoom in and glue things together in an easily verifiable way.

    I suspect that people who still haven't figured out how to make use of LLMs, assuming it's not just resentful performative complaining which it probably is, are expecting it to do it all. Which never seemed very engineer-minded.

  • Agree - I tend to think of it as offloading thinking time. Delegating work to an agent just becomes more work for me, with the quality I've seen. But conversations where I control the context are both fun and generally insightful, even if I decide the initial idea isn't a good one.

    • That is a good metaphor. I frequently use ChatGPT in a way that basically boils down to: I could spend an hour thinking about and researching X basic thing I know little about, or I could have the AI write me a summary that is 95% good enough but only takes a few seconds of my time.

My thesis is actually simpler. For the longest time, until the Industrial Revolution, humans did uninteresting work for the most part. There was a routine and little else. Intellectuals worked through a very terse knowledge base, and it was handed down from master to apprentice. Since the Renaissance and the industrial age, the amount of known knowledge has exploded, and so have the specializations. Most of what white collar work is today is managing and searching through this explosion of knowledge and rules. AI (well, the LLM part) is mostly targeted at that: automating it. That’s all it is. Here is the problem though, it’s for the clueless. Those who are truly clueless fall victim to the hallucinations. Those who have expertise in their field will be able to be more efficient.

AI isn’t replacing innovation or original thought. It is just working off an existing body of knowledge.

  • I disagree that ancient work was uninteresting. If you've ever looked at truly old architecture, walls, carvings etc you can see that people really took pride in their work, adding things that absolutely weren't just pure utility. In my mind that's the sign of someone that considers their work interesting.

    But in general, in the past there was much less specialization. That means each individual was responsible for a lot more stuff, and likely had a lot more varied work day. The apprentice blacksmith didn't just hammer out nail after nail all day with no breaks. They made all sorts of tools, cutlery, horseshoes. But they also carried water, operated bellows, went to fetch coke etc, sometimes even spending days without actually hammering metal at all - freeing up mental energy and separation to be able to enjoy it when they actually got to do it.

    Similarly, farm laborers had massively varied lives. Their daily tasks of a given week or month would look totally different depending on the season, with winter essentially being time off to go fix or make other stuff because you can't do much more than wait to make plants grow faster

    People might make the criticism and say "oh but that was only for rich people/government" etc, but look at for example old street lights, bollards etc. Old works tend to be

    Specialization allows us to curse ourselves with efficiency, and a curse it is indeed. Now if you're good at hammering nails, nails are all you'll get, morning to night, and rewarded the shittier and cheaper and faster you make your nails, sucking all incentive to do any more than the minimum

  • > Those who have expertise in their field will be able to be more efficient.

    My problem with it as a scientist is that I can't trust a word it writes until I've checked everything 10 times over. Checking over everything was always the hardest part of my job. Subtle inconsistencies can lead to embarrassing retractions or worse. So the easy part is now automatic, and the hard part is 10x harder, because it will introduce mistakes in ways I wouldn't normally do, and therefore it's like I've got somebody working against me the whole time.

    • Yes, this is exactly how I feel about AI generating code as well.

      Reviewing code is way harder than writing it, for me. Building a mental model of what I want to build, then building that comes very naturally to me, but building a mental model of what someone else made is much more difficult and slow for me

      Feeling like it is working against me instead of with me is exactly the right way to describe it

  • Hunter–gatherers have incredible knowledge and awareness about their local environment – local flora and fauna, survival skills, making and fixing shelters by hand, carpentry, pottery, hunting, cooking, childcare, traditional medicine, stories transmitted orally, singing or music played on relatively simple instruments, hand-to-hand combat, and so on – but live in relatively small groups and are necessarily generalists. The rise of agriculture and later writing made most people into peasant farmers, typically disempowered if not enslaved (still with a wide range of skills and deep knowledge), and led to increasing specialization (scribes, artisans, merchants, professional soldiers, etc.).

    Calling this varied work "uninteresting" mostly reflects on your preferences rather than the folks who were doing the work. A lot of the work was repetitive, but the same is true of most jobs today. That didn't stop many people from thinking about something else while they worked.

  • I would say that mastering things like building, farming, gardening, hunting, blacksmithing and cooking does require quite a bit of learning. Before the Industrial Revolution most people engaged in many or all of those activities, and I believe they were more intellectually stimulated than your average office worker today.

The one thing AI is good at is building greenfield projects from scratch using established tools. If what you want to accomplish can be done by a moderately capable coder with some time reading the documentation for the various frameworks involved, then I view AI as fairly similar to the scaffolding that happened with Ruby on Rails back in the day when I typed "rails new myproject".

So LLMs are awesome if I want to say "create a dashboard in Next.js and whatever visualization library you think is appropriate that will hit these endpoints [dumping some API specs in there] and display the results to a non-technical user", along with some other context here and there, and get a working first pass to hack on.

Where they are not awesome is if I am working on adding a map visualization to that dashboard a year or two later, and then I need to talk to the team that handles some of the API endpoints to discuss how to feed me the map data. Then I need to figure out how to handle large map pin datasets. Oh, and the map shows regions of activity that were clustered with DBSCAN, so I need to know that an alpha shape provides a generalization of a convex hull that will allow me to perfectly visualize the cluster regions from DBSCAN's epsilon parameter with the corresponding choice of alpha parameter. Etc, etc, etc.

I very rarely write code for greenfield projects these days, sadly. I can see how startup founders are head over heels over this stuff because that's what their founding engineers are doing, and LLMs let them get it cranking very very fast. You just have to hope that they are prudent enough to review and tweak what's written so that you're not saddled with tech debt. And when inevitable tech debt needs paying (or working around) later, you have to hope that said founders aren't forcing their engineers to keep using LLMs for decisions that could cut across many different teams and systems.

  • I get what point you're trying to make, and agree, but you've picked a bad example.

    That boilerplate heavy, skill-less, frontend stuff like configuring a map control with something like react-leaflet seems to be precisely what AI is good at.

    • Yeah it will make a map and plot some stuff on it. It might even do well at handling 20 million pins on the map gracefully. I doubt it's gonna know to use alpha shapes to complement DBSCAN quite so gracefully.

      edit: Just spot checked it and it thinks it's a good idea to use convex hulls.

  • My feeling about your cross-team use case is that tech leaders dream of each team exposing its own tuned MCP agent, with your agents talking to each other.

    That idea reminds me of "DevOps is to automate fail". Perhaps: "agent collaboration is to automate chaos"
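
On the convex hull spot check in the map thread above, a toy sketch may help show why a convex hull can misrepresent a non-convex cluster. This is not an alpha shape implementation and it does not run DBSCAN; the L-shaped point set, `l_shape`, and `true_outline` are made-up stand-ins for a cluster. It only compares the area of the convex hull with the area the points actually cover.

```python
# Toy illustration only: shows how a convex hull overstates a non-convex
# cluster. It does not implement alpha shapes or DBSCAN.

def cross(o, a, b):
    """Z-component of (a - o) x (b - o); positive means a left turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices counter-clockwise."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]


def polygon_area(vertices):
    """Shoelace formula for a simple polygon."""
    n = len(vertices)
    twice_area = 0
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2


# Hypothetical L-shaped "cluster": grid points with x <= 2 or y <= 2.
l_shape = [(x, y) for x in range(11) for y in range(11) if x <= 2 or y <= 2]

hull = convex_hull(l_shape)
hull_area = polygon_area(hull)

# Hand-written outline of the L, i.e. the region the points really occupy.
true_outline = [(0, 0), (10, 0), (10, 2), (2, 2), (2, 10), (0, 10)]
true_area = polygon_area(true_outline)

print(f"convex hull area: {hull_area}, area actually covered: {true_area}")
```

The hull fills in the empty corner of the L, so a map region drawn this way would claim activity where there is none. A concave outline (which is what an alpha shape is meant to give) would stay closer to the hand-written one.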

There are a hundred ways to use AI for any given work. For example, if you are doing interesting work and aren't using AI-assisted research tools (e.g., OpenAI Deep Research), then you are missing out on making the work that much more interesting by understanding the context and history of the subject or adjacent subjects.

This thesis only makes sense if the work is somehow interesting and you also have no desire to extend, expand, or enrich the work. That's not a plausible position.

  • > This thesis only makes sense if the work is somehow interesting and you also have no desire to extend, expand, or enrich the work. That's not a plausible position.

    Or your interesting work didn't appear in the training set often enough. Currently I am writing a compiler and runtime for a niche modeling language, and every model I have poked for help has been rather useless, except for some obvious things I already knew.

    • Some things you could do:

      1. Look up compiler research in relevant areas

      2. Investigate different parsing or compilation strategies

      3. Describe enough of the language to produce or expand test cases

      4. Use the AI to create tools to visualize or understand the domain or compiler output

      5. Discuss architectural approaches with the AI (this might be like rubber duck architecting, but I find that helpful just like rubber duck debugging is helpful)

      The more core or essential a piece of code is, the less likely I am to lean on AI to produce that piece of code. But that's just one use of AI.

If AI can do the easiest 50% of our tasks, then it means we will end up spending all of our time on what we previously considered to be the most difficult 50% of tasks. This has a lot of implications, but it does generally result in the job being more interesting overall.

  • Or, alternatively, the difficult 50% are difficult because they're uninteresting, like trying to find an obscure workaround for an unfixed bug in Excel, or re-authing for the n-th time today, or updating a Jira ticket, or getting the only person with access to a database to send you a dataset when they never so much as reply to your emails...

  • > we will end up spending all of our time on what we previously considered to be the most difficult 50% of tasks

    Either that, or replacing the time with slacking off and not even getting whatever benefits doing the easiest tasks might have had (learning, the feeling of accomplishing something), like what some teachers see with writing essays in schools and homework.

    The tech has the potential to let us do less busywork (which is great, even regular codegen for boilerplate and ORM mappings etc. can save time), it's just that it might take conscious effort not to be lazy with this freed up time.

    • The industry has already gone through many, many examples of software reducing developer effort. It always results in developers becoming more productive.

  • In my experience, the 50% most difficult part of a problem is often the most boring. E.g. writing tests, tracking down obscure bugs, trying to understand API or library documentation, etc. It's often stuff that is very difficult but doesn't take all that much creativity.

    • I disagree with all of those. Tracking down obscure bugs is interesting, and all the other examples are easy.

  • You'll potentially be building on flimsy foundations if it gets the foundational stuff wrong (see anecdote in sibling post). I fear for those who aren't so diligent, especially if there are consequences involved.

    • The strategy is to have it write tests, and spend your time making sure the tests are really comprehensive and correct, then mostly just trust the code. If stuff breaks down the line, add regression tests, fix the problem and continue with your day.

  • >This has a lot of implications, but it does generally result in the job being more interesting overall.

    One implication is that when AI providers claim that "AI can make a person TWICE as productive!"

    ... business owners seem to be hearing that as "Those users should cost me HALF as much!"

  • > If AI can do the easiest 50% of our tasks

    ...But it can't, which means your inference has no implications, because it evaluates to False.
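
The "have it write tests, then add regression tests when something breaks" strategy mentioned in the thread above could look roughly like this. It is a minimal sketch with made-up names: `apply_fill` and the bug it pins down are hypothetical, not taken from anyone's code in this discussion.

```python
import unittest


def apply_fill(position: int, quantity: int) -> int:
    """Hypothetical function under test: add a filled order to a position."""
    return position + quantity


class PositionRegressionTests(unittest.TestCase):
    def test_full_fill_is_accounted(self):
        # Pins a hypothetical past bug where part of a fill was dropped.
        self.assertEqual(apply_fill(0, 3000), 3000)

    def test_sell_reduces_position(self):
        self.assertEqual(apply_fill(3000, -1000), 2000)


def run_tests() -> bool:
    """Run the suite in-process and report whether everything passed."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(PositionRegressionTests)
    result = unittest.TextTestRunner(verbosity=0).run(suite)
    return result.wasSuccessful()


if __name__ == "__main__":
    print("all regression tests passed:", run_tests())
```

The point of the approach is that the human effort goes into reviewing the few lines inside each test, which is usually easier to verify than the generated implementation itself.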

I have found it fascinating how AI has forced me to reflect on what I actually do at work and whether it has value or not.

  • Those kinds of thought processes are the kinds that produce value.

    Deciding what to build and how to build it is often harder than building.

    What LLMs of today do is basically super-autocomplete. It's a continuation of the history of programming automation: compilers, more advanced compilers, IDEs, code generators, LINTers, autocomplete, codeinsight, etc.

The one thing an LLM cannot do currently is read the room. Even if it contains all existing information and can create any requested admixture from its training, that admixture space is infinite. Therefore the curator's role is in creating the most interesting output with it. The more nuanced and sophisticated the interesting work, the bigger the role for this curation.

I kind of use it that way. The LLM is walking a few feet in front of me, quickly ideating possible paths, allowing me to experiment more quickly. Ultimately I am the decider of what matters.

This reminds me a bit of photography. A photographer will take a lot of pictures. They try a lot of paths. Most of the paths don't actually work out. What you see of their body of work is the paths that worked, that they selected.

Thesis: Using the word “thesis” is a great way to disguise a whiny op-ed as the writings of a learned soul

> interesting work (i.e., work worth doing)

Let me guess, the work you do is interesting work (i.e., work worth doing) and the work other people do is uninteresting work (i.e., work not worth doing).

Funny how that always happens!

I feel much more confident that I can take on a project in a domain that I'm not very familiar with. I've been digging into LLVM IR, and I had no prior experience with it. ChatGPT is a much better guide to getting started than the documentation, which is very low quality.

  • Good luck with that.

    I have been exploring local AI tools for coding (ollama + aider) with a small stock market simulator (~200 lines of python).

    First I tried making the AI extract the dataclasses representing events to a separate file. It decided to extract some extra classes, leave behind some others, and delete parts of the code.

    Then I tried to make it explain one of the actors, called LongVol_player_v1, around 15 lines of code. It correctly concluded that it does options delta hedging, but it jumped to the conclusion that it calculates the implied volatility. I set that as a constant, because I'm simulating specific interactions between volatility players and option dealers. It hasn't yet caught the bug where the vol player buys 3000 options but accounts only for 2000.

    When asking for improvements, it is obsessed with splitting the initialization and the execution.

    So far I have wasted half of Saturday trying to make the machine do simple refactors. Refactors I could do myself in half an hour.

    I have yet to see the wonders of AI.

    • If you are using Ollama that suggests you are using local models - which ones?

      My experience is that the hosted frontier models (o3, Gemini 2.5, Claude 4) would handle those problems with ease.

      Local models that fit on a laptop are a lot less capable, sadly.

    • Could you link the repo and prompts? What you described seems like the type of thing I’ve done before with no issue so you may have an interesting code base that is presenting some issues for the LM.

    • For what it's worth, commercial models are in a completely different league to locally runnable models. If you are really interested in seeing state of the art right now at least give it a whack with opus/gemini/o3 or something of that calibre.

      You might still be disappointed but at least you won't have shot your leg off out of the gates!

The vast majority of any interesting project is boilerplate. There's a small kernel of interesting 'business logic'/novel algorithm/whatever buried in a sea of CRUD: user account creation, subscription management, password resets, sending emails, whatever.

  • Yes, so why would you spend tons of time and introduce a huge amount of technical debt by rewriting the boring parts, instead of just using a ready made off the shelf solution in that case?

    You'd think that there would be someone nice enough to create a library or a framework or something that's well documented and popular enough to get support and updates. Maybe you should consider offloading the boring part to such a project, maybe even pay someone to do it?

    • That was a solved problem in the 00's with the advent of Rails, or so I thought. Then came the JS framework craze and everything needed to be reinvented. Not just that, but frameworks which had all these battle-tested boring parts were not trendy anymore. Micro frameworks became the new default, and idiot after idiot jumped on that bandwagon only to reimplement everything from scratch, because almost any app will grow to a point where it will need authn, user mgmt, mail, groups and so on...

  • This depends entirely on the type of programming you do. If all you build is CRUD apps then sure. Personally I’ve never actually made any of those things — with or without AI

    • You are both right. B2B for instance is mostly fairly template stuff built from CRUD and some business rules. Even some of the niches perceived as more 'creative', such as music scoring or 3D games, are fairly rote interactions with some 'engine'.

      And I'm not even sure these 'template adjacent' regurgitations are what the crude LLM is best at, as the output needs to pass some rigorous inflexible test to 'pass'. Hallucinating some non-existing function in an API will be a hard fail.

      LLMs have a far easier time in domains where failures are 'soft'. This is why 'ELIZA' passed as a therapist in the 60's, long before auto-programmers were a thing.

      Also, in 'academic' research, LLM use has reached nearly 100%, not just for embellishing writeups to the expected 20 pages, but in each stage of the 'game' including 'ideation'.

      And if as a CIO you believe that your prohibition on using LLMs for coding because of 'divulging company secrets' holds, you are either strip searching your employees on the way in and out, or wilfully blind.

      I'm not saying 'nobody' exists that is not using AI in anything created on a computer, just like some woodworker still handcrafts exclusive bespoke furniture in a time of presses, glue and CNC, but adoption is skyrocketing and not just because the C-suite pressures their serfs into using the shiny new toy.

  • Most places I worked the setting up of that kind of boilerplate was done a long time ago. Yes it needs maintaining and extending. But rarely building from the ground up.

> Meanwhile, I feel like if I tried to offload my work to an LLM, I would both lose context and be violating the do-one-thing-and-do-it-well principle I half-heartedly try to live by.

He should use it as a Stack Overflow on steroids. I assume he uses Stack Overflow without remorse.

I used to have 1y streaks on being on SO, now I'm there around once or twice per week.

While I didn't agree with the "junior developer" analogy in the past, I am finding that it is beginning to be a bit more like that. The new Codex tool from OpenAI feels a lot more like this. It seems to work best if you already have a few examples of something that you want to do and now want to add another. My tactic is to spell it out very clearly in the prompt and really focus on having it consistently implement another similar thing with a narrow scope. Because it takes quite a while, I will usually just fix any issues myself as opposed to asking it to fix them. I'm still experimenting but I think a well crafted spec / AGENTS.md file begins to become quite important. For me, this + regular ChatGPT interactions are much more valuable than synchronous / Windsurf / Cursor style usage. I'd prefer to review a more meaningful PR than a million little diffs synchronously.

  • If you haven't tried it yet, get it to ask you clarifying questions to make the requirements unambiguous.

    And ask it to write a design doc, and to write a work plan of different prompts to implement the change.

I don't have LLM/AI write or generate any code or document for me. Partly because the quality is not good enough, partly because I worry about copyright/licensing/academic rigor, and partly because I worry about losing my own edge.

But I do use LLM/AI, as a rubber duck that talks back, as a google on steroids - but one who needs his work double checked. And as domain discovery tool when quickly trying to get a grasp of a new area.

It's just another tool in the toolbox for me. But the toolbox is like a box of chocolates - you never know what you are going to get.

  • In the new world that's emerging, you are losing your edge by not learning how to master and leverage AI agents. Quality not good enough? Instruct them in how you want them to code, and make sure a sufficient quantity of the codebase is loaded into their context so they can see examples of what you consider good enough.

    • >Instruct them in how you want them to code

      They don't always listen.

      Writing SQL, I'll give ChatGPT the schema for 5 different tables. It habitually generates solutions with columns that don't exist. So, naturally, I append, "By the way, TableA has no column FieldB." Then it just imagines a different one. Or, I'll say, "Do not generate a solution with any table-col pair not provided above." It doesn't listen to that at all.
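
One way to catch invented columns mechanically, rather than by arguing with the model, is to compile the generated query against an empty copy of the schema before trusting it. A minimal sketch using Python's built-in `sqlite3` module follows; the `SCHEMA` contents and the `check_sql` helper are hypothetical stand-ins, and SQLite's dialect differs from other engines, so this only helps for queries SQLite can parse.

```python
import sqlite3

# Hypothetical schema standing in for the tables given to the model.
SCHEMA = """
CREATE TABLE TableA (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE TableB (id INTEGER PRIMARY KEY, a_id INTEGER, amount REAL);
"""


def check_sql(schema: str, query: str):
    """Return None if the query compiles against the schema, else the error text."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(schema)
        # EXPLAIN makes SQLite compile the statement without running it,
        # which is enough to surface unknown tables and columns.
        conn.execute("EXPLAIN " + query)
        return None
    except sqlite3.Error as exc:
        return str(exc)
    finally:
        conn.close()


good = check_sql(
    SCHEMA,
    "SELECT a.name, SUM(b.amount) FROM TableA a "
    "JOIN TableB b ON b.a_id = a.id GROUP BY a.name",
)
bad = check_sql(SCHEMA, "SELECT FieldB FROM TableA")

print("valid query error:", good)
print("hallucinated column error:", bad)
```

The error text from a failed check can be pasted back into the conversation, which tends to be a more concrete correction than "TableA has no column FieldB".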

I am 100% sure that horse-breeders and carriage-decorators also had very high interest in their work and craft.

Here we go again.

But "interesting" is subjective, there's no good definition for "intelligence", and AI has so much associated hype. So we could debate endlessly on HN.

Supposing "interesting" means something like coming up with a new Fast Fourier Transform algorithm. I seriously doubt an LLM could do something there. OTOH AI did do new stuff with protein folding.

So, we can keep debating I guess.

It's definitely real that a lot of smart productive people don't get good results when they use AI to write software.

It's also definitely real that a lot of other smart productive people are more productive when they use it.

These sorts of articles and comments seem to be saying "I'm proof it can't be done," when really there's enough proof it can be that you're just proving you'll be left behind.

  • >you're just proving you'll be left behind.

    ... said every grifter ever since the beginning of time.