Comment by jmount

8 hours ago

I really think one needs a "Harvard architecture" for AIs (data independent of instructions). Though yes, that may not be possible.

It's not possible with today's LLMs, but we are not wedded to the current architecture.

  • Realistically, we are.

    This is not some arbitrary design choice; it's the core compromise that makes LLMs viable to train at all.

    • Define "realistically". You're basically saying attention is all we need indefinitely into the future and all other gains come from more compute or scaffolding around current architectures.

      Attention is all we need because it is currently the best parallelizable way to model long-range dependencies under current hardware constraints, not because flat token sequences reflect some inherent natural law of intelligence.

      Who's to say we won't find a way to encode provenance or privilege natively into models such that the tradeoff changes?

      It's hard to say what the solution will be. If I knew it, I'd build it. But it's even harder to maintain that the current architecture is a crystallized global optimum.

I doubt it's possible, regardless of specific architecture, because if you want an AI that can do general purpose tasks like "look at my calendar and find a restaurant for the lunch meeting that the other people also like, but make sure nobody has to travel more than 20 minutes to get there, and it can't be too cold inside", then it has to ingest and understand a bunch of data to do that. The whole point is that the decision-making process is reading everything. The only "fix" is to make an AI smart enough that it can understand context for each item, which is a tall order.

  • > The only "fix" is to make an AI smart enough that it can understand context for each item, which is a tall order.

    Impossible, as you said. Context isn’t static; it’s continuous, analog, and a conglomeration of viewpoints.

    AI cannot create useful context for itself because it is a machine with no desires. It doesn’t have a point of view, it has historical records. It moves forward in time by walking backwards (if that makes sense?)

  • This is especially true because so much of that data comes from outside of your organization. I receive Google Calendar invites from scammers a couple of times a week, and those show up in my invitation list just like anything else. If LLMs start screening things, that kind of attack will become even more popular, but most of us can’t just ignore everyone outside of our employer’s directory.