Comment by halfcat

19 days ago

So AI makes it cheaper to remix anything already-seen, or anything with a stable pattern, if you’re willing to throw enough resources at it.

AI makes it cheap (eventually almost free) to traverse the already-discovered and reach the edge of uncharted territory. If we think of a sphere, where we start at the center, and the surface is the edge of uncharted territory, then AI lets you move instantly to the surface.

If anything solved becomes cheap to re-instantiate, does R&D reach a point where it can’t ever pay off? Why would one pay for the long-researched thing when they can get it for free tomorrow? There will be some value in having it today, just like having knowledge about a stock today is more valuable than the same knowledge learned tomorrow. But does value itself go away for anything digital, and only remain for anything non-copyable?

The volume of a sphere grows faster than the surface area. But if traversing the interior is instant and frictionless, what does that imply?
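For reference, a quick sketch of the geometry behind that claim, assuming an ordinary 3-dimensional sphere of radius $r$ (the comment does not specify the dimension):

```latex
V(r) = \tfrac{4}{3}\pi r^{3}, \qquad
S(r) = 4\pi r^{2}, \qquad
\frac{V(r)}{S(r)} = \frac{r}{3}
```

So the ratio of interior to frontier grows linearly with the radius: the further the frontier moves out, the more "already-discovered" interior there is per unit of frontier.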

> The volume of a sphere grows faster than the surface area. But if traversing the interior is instant and frictionless, what does that imply?

It's nearly frictionless, not frictionless, because someone has to use the output (or at least verify it works). Also, why do you think the "shape" of the knowledge is spherical? I don't claim to know the shape, but whatever it is, it has to be a fractal-like, branching, repeating pattern.

The fundamental idea that modern LLMs can only ever remix, even if it's technically true (doubt), only says to me that all knowledge is only ever a remix, perhaps even mathematically so. Anyone who still keeps implying these are statistical parrots or whatever is just going to regret these decisions in the future.

  • Why doubt? Transformers are a form of kernel smoothing [1]. It's literally interpolation [2]. That doesn't mean it can only echo the exact items in its training data - generating new data items is the entire point of interpolation - but it does mean it's "remixing" (literally forming a weighted sum of) those items and we would expect it to lose fidelity when moving outside the area covered by those points - i.e. where it attempts to extrapolate. And indeed we do see that, and for some reason we call it "hallucinating".

    The subsequent argument that "LLMs only remix" => "all knowledge is a remix" seems absurd, and I'm surprised to have seen it now more than once here. Humanity didn't get from discovering fire to launching the JWST solely by remixing existing knowledge.

    [1] http://bactra.org/notebooks/nn-attention-and-transformers.ht...

    [2] Well, smoothing/estimation but the difference doesn't matter for my point.
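A minimal sketch of the "weighted sum" point above, assuming a single attention head with no learned projections (the function name and toy data are hypothetical, not from the linked notebook). Softmax attention weights are non-negative and sum to 1, so the output is a convex combination of the value vectors, which is the same form as a kernel smoother:

```python
import numpy as np


def softmax(scores):
    """Numerically stable softmax over a 1-D array of scores."""
    shifted = scores - np.max(scores)
    weights = np.exp(shifted)
    return weights / weights.sum()


def attention_as_kernel_smoother(query, keys, values):
    """Single-query scaled dot-product attention.

    The softmax weights play the role of a normalized kernel
    K(query, key_i), and the output is the weighted sum of values.
    """
    d = keys.shape[1]
    scores = keys @ query / np.sqrt(d)
    weights = softmax(scores)
    return weights @ values, weights


# Toy data (assumption: random vectors stand in for learned embeddings).
rng = np.random.default_rng(0)
keys = rng.normal(size=(5, 4))
values = rng.normal(size=(5, 3))
query = rng.normal(size=4)

output, weights = attention_as_kernel_smoother(query, keys, values)
```

Because the weights form a convex combination, each coordinate of `output` lies between the smallest and largest corresponding coordinate of `values`. That is the sense in which a single attention step interpolates rather than extrapolates; it says nothing by itself about what a deep stack of such layers with learned projections can represent.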

    • It's not clear to me that LLMs hallucinate because they are extrapolating beyond their training data. Is that proven? Or are you extrapolating?

      Even acknowledging it is interpolation, models can extrapolate slightly without making things up, within the range where the model still applies. Who's to say what this range is for an LLM operating in thousand-dimensional space? As far as I can tell the main limiters to LLM creativity are guardrails we put in place for safety and usefulness.

      And what exactly is your proof that human ingenuity is not just pattern matching? I'm sure a hypothesis can be put forward that fire was discovered by just adding up all the facts the people of those times knew and stumbling on something that put it all together. Sounds like knowledge remix + slight extrapolation to me.

  • > Anyone who still keeps implying these are statistical parrots or whatever is just going to regret these decisions in the future.

    You know this is a false dichotomy, right? You can treat LLMs as statistical parrots and at the same time take advantage of them.

    • Yes, but the immediate equivalent scenario to me is how people treated other people as slaves, merely using them like machines. Sure, you got use out of them, but was that the best use?

  • There are musicians who "remix" (sample) other artists' music and make massive hits themselves.

    Not every solution needs to be unique; in many cases "remixing" existing solutions in a unique way is better and faster.

  • But all of my great ideas are purely from my own original inspiration, and not learning or pattern matching. Nothing derivative or remixed. /sarcasm

  • Yeah, Yann LeCun is just some luddite lol

    • I don't think he's a luddite at all. He's brilliant in what he does, but he can also be wrong in his predictions (as are all humans from time to time). He did have 3 main predictions in ~23-24 that turned out to be wrong in hindsight. Debatable why they were wrong, but yeah.

      In a stage interview (a bit after the "sparks of agi in gpt4" paper came out) he made 3 statements:

      a) LLMs can't do math. They can trick us with poems and subjective prose, but at objective math they fail.

      b) they can't plan

      c) by the nature of their autoregressive architecture, errors compound. So a wrong token will make their output irreversibly wrong, and spiral out of control.

      I think we can safely say that all of these turned out to be wrong. It's very possible that he meant something more abstract, and technical at its core, but in real life all of these things were overcome. So, not a luddite, but also not a seer.
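To make claim (c) concrete, here is a rough sketch of the compounding-error arithmetic. It assumes every token independently has the same probability of being an unrecoverable mistake, which is a simplification introduced here for illustration; the function name is hypothetical:

```python
def p_all_correct(per_token_error, n_tokens):
    """Probability that an n-token output contains no error at all,
    assuming independent, identical per-token error rates."""
    return (1.0 - per_token_error) ** n_tokens


# With a 1% per-token error rate, longer outputs degrade quickly
# under this model.
short_answer = p_all_correct(0.01, 100)
long_answer = p_all_correct(0.01, 1000)
```

The model only holds if errors are independent and irreversible. A system that can notice and correct an earlier mistake (for example by revising a step mid-answer) breaks that assumption, which is one possible reading of why the prediction did not hold up in practice.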

    • You don't understand Yann's argument. It's similar to Richard Sutton's, in that these things aren't thinking, they're emulating thinking, and the weak implicit world models that get built in the weights are insufficient for true "AGI."

      This is orthogonal to the issue of whether all ideas are essentially "remixes." For the record I agree that they are.

Single-idea implementations ("one-trick ponies") will die off, and composites that are harder to disassemble will be worth more.