Comment by momojo

19 hours ago

This reminds me of Antirez's "Don't fall into the anti-AI hype" [0]

In a sentence: These foundation models are really good at optimizing extremely high-level, extremely well-defined problem spaces (i.e., multiply matrices faster). In Antirez's case, it's "make Redis faster".

There have been two reactions: "Oh it would never work for me" and "I have seen months of my life accomplished in an hour", and I think they're both right. I think we should be excited for Antirez (who has since been popping off [1]), and I think the rest of us should rest easy knowing that LLMs can't (and maybe were never meant to) tackle the tacit-knowledge-filled, human-system-centric, ambiguously-defined-problem-space jobs most mortals work.

[0] https://antirez.com/news/158

[1] https://antirez.com/news/164

>I think the rest of us should rest easy knowing that LLM's can't (and maybe were never meant to) tackle the tacit-knowledge-filled, human-system-centric, ambiguously-defined-problem-space jobs most mortals work

I don't believe that anymore, to be honest. Models are starting to get good at ambiguity. Claude Code now asks me when something is ambiguous. Soon, all meetings will be recorded, transcribed and stored in a well-indexed place for the agents to search when faced with ambiguity (free startup idea here!). If they can ask you now, they'll be able to search for the answers themselves once that's possible. In fact, they already do it now if you have a well-documented Notion/Confluence; it's just that nobody has one.

It's probably harder to RL for "identify ambiguity" than RL'ing for performance algorithms, sure, but it's not impossible and it's in the works. It's just a matter of time now.

  • > Models are starting to get good at ambiguity

    That's fair, and something I've observed too. I wish I had written "the rest of us shouldn't freak out and quit software today".

    But here's another data point: At the biotech I work for, writing good code has never been the bottleneck. I actually told my boss that a paid Claude subscription wouldn't add much value over the free one, because even if it took every piece of code or algorithm we've ever written and 10x-ed the hell out of them, we'd still be bottlenecked by the biology and physics, which dictate that we wait 24 days for our histology assay pipeline.

    I have a hunch most fields outside of software are this way. And I'm personally not planning to quit anytime soon.

  • > Soon, all meetings will be recorded, transcribed and stored in a well-indexed place for the agents to search when faced with ambiguity (free startup idea here!)

    We were doing that over at Vowel a few years back. Unfortunately, it didn't pan out because you're competing directly against Zoom, Google Meet, Microsoft Teams, etc. They are all (slowly) catching up to where we were as a scrappy startup 4 years ago.

    It was truly game-changing to have all of your meetings in an easily searchable database. Even as a human.

  • Tacit knowledge is definitionally not recorded in any of these systems. This proposes to solve the problem of tacit knowledge by getting rid of it. It is not clear to me whether that solution is either possible or desirable.

    • The labs are spending hundreds of millions of dollars hiring people doing many fairly random (but economically valuable) tasks to collect this tacit knowledge for RL.

      It works really well.

  • Why record when it can build in real time as the meeting is going on?

    Slack is kinda there with Salesforce: it can do a lot already on Agentforce and in Slackbot, but the two aren't integrated just yet and Slackbot doesn't support group chats/channels. One interesting aspect of this will be who takes precedence: boss, client, analyst or developer?

  • In coding, the ambiguity is very, very limited and constrained compared to any non-dev job that involves any decision making.

    • That's... not even close to being the case. It's literally a series of ambiguous questions and strategic decisions.

      Non-ambiguous is like a first semester algorithms class in university.

  • Unfortunately, you can't record meetings in many jurisdictions, including court sessions. Hence we have to rely - for worse, or perhaps even for better - on human-driven note taking.

    • You're downplaying the AI lobby here. They're wearing down copyright laws, something that seemed impossible just a couple of years ago. Screwing privacy laws is just the next step.

      Also, we are seeing a cultural shift around that as well. Now people bring "AI notetakers" to Zoom calls without even asking for your permission. People are already acting like privacy laws don't exist anymore, so it's going to be even easier for the AI lobby to take them down now, just like piracy normalized copyright infringement, opening the path to the current rulings around "fair training".

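To put rough numbers on the biotech bottleneck point above (a 24-day histology assay that no amount of faster code can shorten), here is a minimal sketch of the Amdahl's-law style arithmetic. The 24-day wait comes from the comment; the 2 days of coding work per cycle is purely an illustrative assumption, as are the function and variable names.

```python
def overall_speedup(code_days: float, wait_days: float, code_speedup: float) -> float:
    """Amdahl-style speedup of a whole cycle when only the coding part gets faster."""
    before = code_days + wait_days
    after = code_days / code_speedup + wait_days
    return before / after


ASSAY_WAIT_DAYS = 24.0  # from the comment: the histology assay pipeline
CODE_DAYS = 2.0         # ASSUMPTION: illustrative figure, not from the thread

# Overall gain if every piece of code got 10x faster to write.
speedup_10x = overall_speedup(CODE_DAYS, ASSAY_WAIT_DAYS, 10.0)

# Upper bound: coding takes no time at all, only the assay wait remains.
speedup_limit = (CODE_DAYS + ASSAY_WAIT_DAYS) / ASSAY_WAIT_DAYS

print(f"10x faster code:      {speedup_10x:.3f}x overall")
print(f"infinitely fast code: {speedup_limit:.3f}x overall")
```

Under these assumed numbers, a 10x coding speedup shortens the whole cycle by only about 7%, and even instantaneous coding cannot do better than about 8%, which is the sense in which the assay is the bottleneck.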

I have found Claude et al. good at quickly implementing the algorithm I have in mind, as long as I ask lots of control questions and check the code. They aren’t good at inventing non-mainstream algorithms, though, and often slip in staggeringly short-term shortcuts. They are still a tool and not yet the craftsman who wields tools effectively. This will steadily change, and the corners where the obscure algorithm wins will erode further too.

> I think the rest of us should rest easy knowing that LLM's can't [...]

What if (when?) (AI-assisted) research moves AI beyond LLMs? Do you think that can't happen?

  • Not in the next decade. Won't get funded.

    • Advanced Machine Intelligence (AMI), a new Paris-based startup cofounded by Meta’s former chief AI scientist Yann LeCun, announced Monday it has raised more than $1 billion to develop AI world models.

      LeCun argues that most human reasoning is grounded in the physical world, not language, and that AI world models are necessary to develop true human-level intelligence. “The idea that you’re going to extend the capabilities of LLMs [large language models] to the point that they’re going to have human-level intelligence is complete nonsense,” he said. [0]

      [0] https://www.wired.com/story/yann-lecun-raises-dollar1-billio...

  • I mean, Google already has MuZero, which I'm willing to bet has evolved quite a bit in private, because if anything is going to get us closer to actual AI, it's that.

    Realistically, one can build an AI capable of reasoning (i.e., recurrent loops with branches) using very basic models that fit on a 3090, with a multi-agent configuration along the lines of https://github.com/gastownhall/gastown. Nobody has done it yet because we don't know how many agents are required and what the prompts for those look like.

    The fundamental philosophical problem is whether that configuration is possible to arrive at using training, or whether AI agents have to go through equivalent "evolution epochs" in a simulated environment to be able to do all that. Because in the case of those prompts and models, they have to be information agnostic.

I'd say it's a mixture of:

1. Amazing, you just tweaked 1% efficiency

2. You idiot, you just spent an hour trying to troubleshoot a hallucinated API.

On average, it's really hard to tell which one's going to win here.

  • It's not hard to tell at all, just look at how much it costs to run a 10T param model (especially with parallelized agents). Those costs are not worth the occasional slot-machine-esque jackpot you get. For an entity like Google it might be worth it, but that's it. They definitely aren't going to let us use these things at the cost they are now for much longer.

    Imagine going back to 2020 and telling people that in 6 years they'd be able to spend $200.00 a month and spin up $2mm in GPUs at full throttle to respond to their emails. None of this makes sense.

    • You don't pay for a £200 a month account to respond to your emails, and if you are, I would tell you that you're wasting your money.

    • Whenever you solve any hard problem, you start off by finding a complicated solution, which you then scale down to a simpler solution.

      LLMs are a "complicated solution" in the sense that they're expensive. Once you know what they're capable of, you can scale them down to something less expensive. There's usually a way.

      Also, an important advantage of LLMs over other approaches is that it's easy to improve them by finding better ways of prompting them. Those prompting strategies can then get hard-coded into the models to make them more efficient. Rinse and repeat. Similarly, you can produce curated data to make them better in certain areas like programming or mathematics.

>I think the rest of us should rest easy knowing that LLM's can't (and maybe were never meant to) tackle the tacit-knowledge-filled, human-system-centric, ambiguously-defined-problem-space jobs most mortals work.

A statement all but guaranteed to look incredibly short-sighted by 2030.

  • The past few years have seen a great rise in casuals reminding us of AI's limitations only to be proven wrong in 6 months. I don't think we're close to AGI, but in 2 years I've gone from AI doubter to AI convert. It's not perfect, but I don't need it to be.

    The real question to me is if the system can pay for itself. Economics are racing against efficiency gains and it's anyone's guess which wins.

    • What are the limitations we're talking about? It seems most of the original limitations that people complained about were resolved through workarounds like tools and skills, which are more software engineering than LLM advancement.