Comment by almosthere

5 days ago

The code that builds the models and performs inference from them is code we have written. The data in the model is obviously the big trick. But what I'm saying is that running inference alone does not give it super-powers over your computer. You could write some agentic framework where it WOULD have power over your computer, but that's not what I'm referring to.

It's not a living thing inside the computer; it's just inference building text token by token, using probabilities from the pre-computed model.
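
Concretely, the whole loop is about this shape (a minimal sketch, not any particular library; `model` and `tokenizer` are hypothetical stand-ins):

    import numpy as np

    def softmax(logits):
        e = np.exp(logits - np.max(logits))   # stabilized exponentiation
        return e / e.sum()                    # scores -> probability distribution

    def generate(model, tokenizer, prompt, max_new_tokens=50):
        # `model` maps a list of token ids to next-token logits using
        # frozen, pre-computed weights; `tokenizer` encodes/decodes text.
        tokens = tokenizer.encode(prompt)
        for _ in range(max_new_tokens):
            logits = model(tokens)        # one forward pass over the fixed model
            probs = softmax(logits)       # probabilities for the next token only
            tokens.append(int(np.random.choice(len(probs), p=probs)))  # sample
        return tokenizer.decode(tokens)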

> It's not a living thing inside the computer; it's just inference building text token by token, using probabilities from the pre-computed model.

Sure, and humans are just biochemical reactions moving muscles as their interface with the physical world.

I don't think the mode of operation is a good criticism, but please see my reply to the root comment in this thread, where I detail my thoughts a bit.

You cannot say 'we know it's not thinking because we wrote the code' when the inference 'code' we wrote amounts to 'Hey, just do whatever you figured out during training, okay?'

'Power over your computer' and all that is orthogonal to the point. A human brain without a functioning body would still be thinking.

  • Well, a model by itself with data that emits a bunch of human-written words is literally no different from what JIRA does when it reads a database table and shits it out to a screen, except maybe with a lot more GPU usage.

    I grant you that, yes, the data in the model is a LOT more cool, but some team could by hand, given billions of years (well, probably at least an octillion years), reproduce that model and save it to a disk. Again, no different from data stored in JIRA at that point.

    So basically, if you hold that stance, you'd have to agree that when we FIRST invented computers, we created intelligence that is "thinking".

    • > Well, a model by itself with data that emits a bunch of human-written words is literally no different from what JIRA does when it reads a database table and shits it out to a screen, except maybe with a lot more GPU usage.

      Obviously it is different, or else we would just use JIRA and a database to replace GPT. Models very obviously do NOT store training data in the weights in the way you are imagining.

      > So basically, if you hold that stance, you'd have to agree that when we FIRST invented computers, we created intelligence that is "thinking".

      Thinking is, by all appearances, substrate-independent. The moment we created computers, we created another substrate that could, in the future, think.

    • You're getting to the heart of the problem here. At what point in evolutionary history does "thinking" exist in biological machines? Is a jumping spider "thinking"? What about consciousness?

This is a bad take. We didn't write the model; we wrote an algorithm that searches the space of models conforming to some high-level constraints specified by the stacked-transformer architecture. But stacked transformers are a very general computational paradigm. Training converges the parameters to a specific model that reproduces the training data well, but the computational circuits the model picks out are discovered, not programmed. The emergent structures realize new computational dynamics that we are mostly blind to. We are not the programmers of these models; rather, we are their incubators.
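
To make the "search" framing concrete, the part we actually wrote looks roughly like the loop below (a minimal sketch; `data.sample` and `grad_fn` are hypothetical placeholders, and `params` is assumed to be a flat parameter vector):

    def train(params, data, grad_fn, lr=1e-3, steps=100_000):
        # Gradient descent as search: each step nudges `params` toward a
        # region of model-space that better reproduces the training data.
        # We wrote this loop; we did not write the circuits it converges to.
        for _ in range(steps):
            batch = data.sample()            # hypothetical training-data source
            grads = grad_fn(params, batch)   # d(loss)/d(params), via backprop
            params = params - lr * grads     # one small move through model-space
        return params                        # the discovered, not programmed, model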

As far as sentience is concerned, we can't say they aren't sentient, because we don't know the computational structures these models realize, nor do we know the computational structures required for sentience.

  • However, there is another big problem: this would require a blob of data in a file to be labelled "alive" even if it's on a disk in a garbage dump with no CPU or GPU anywhere near it.

    The inference software that would normally read from that file is also not alive, as it's literally very concise code that we wrote to traverse that file.

    So if the disk isn't alive, the file on it isn't alive, the inference software is not alive - then what are you saying is alive and thinking?

    • This is an overly reductive view of a fully trained LLM. You have identified the pieces, but you miss the whole. The inference code is like a circuit builder: it represents the high-level matmuls and the potential paths for dataflow. The data blob, as the fully converged model, configures this circuit builder by specifying the exact pathways information flows through the system. But this isn't some inert formalism; it is an active, potent causal structure, realized by the base computational substrate, that is influencing and being influenced by the world. If anything is conscious here, it would be this structure. If the computational theory of mind is true, then there are some specific information dynamics that realize consciousness. Whether or not LLM training finds these structures is an open question.
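
      As a toy illustration of that split (the two-layer ReLU skeleton and shapes below are made up, not any real model): the forward function is the fixed "circuit builder", and the weights, normally read from the blob on disk, are what configure it.

          import numpy as np

          # The "circuit builder" we wrote: a fixed skeleton of matmuls and
          # nonlinearities. It is the same code whatever the model learned.
          def forward(weights, x):
              for W in weights:
                  x = np.maximum(W @ x, 0.0)   # matmul + ReLU
              return x

          # The "blob": in reality loaded from a checkpoint file; random
          # arrays stand in here for the converged parameters.
          weights = [np.random.randn(8, 16), np.random.randn(4, 8)]
          y = forward(weights, np.random.randn(16))   # the configured circuit, running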

    • A similar point was made by Jaron Lanier in his paper, "You Can't Argue with a Zombie".

    • > So if the disk isn't alive, the file on it isn't alive, the inference software is not alive - then what are you saying is alive and thinking?

      “So if the severed head isn’t alive, the disembodied heart isn’t alive, the jar of blood we drained out isn’t alive - then what are you saying is alive and thinking?”

      - Some silicon alien life forms somewhere debating whether the human life form they just disassembled could ever be alive and thinking
