Comment by nandomrumber
9 hours ago
What explains the emergent abilities of generative pre-trained transformers at massive scale? Abilities that the smaller GPTs don't possess.
Simple programs can give rise to very complex behaviour. Conway's Game of Life is Turing complete and has four rules.
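To make "four rules" concrete, here is a minimal sketch of one Game of Life generation over a set of live cells (the set-of-coordinates representation and the blinker example are my own choices, not from any particular library):

```python
from collections import Counter

def step(live):
    """Advance Conway's Game of Life by one generation.

    `live` is a set of (x, y) coordinates of live cells. The four rules:
    a live cell with fewer than 2 neighbours dies (underpopulation),
    with 2 or 3 survives, with more than 3 dies (overpopulation);
    a dead cell with exactly 3 live neighbours becomes alive (reproduction).
    """
    # Count live neighbours for every cell adjacent to a live cell.
    counts = Counter(
        (x + dx, y + dy)
        for x, y in live
        for dx in (-1, 0, 1)
        for dy in (-1, 0, 1)
        if (dx, dy) != (0, 0)
    )
    # A cell is alive next generation iff it has 3 neighbours,
    # or it has 2 neighbours and is already alive.
    return {cell for cell, n in counts.items()
            if n == 3 or (n == 2 and cell in live)}

# A "blinker" oscillates between horizontal and vertical with period 2:
blinker = {(0, 1), (1, 1), (2, 1)}
print(step(step(blinker)) == blinker)  # → True
```

That the entire update rule fits in a dozen lines, yet the system can host gliders, glider guns, and full Turing machines, is exactly the point about complex behaviour emerging from simple programs.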
Conway's Game of Life can simulate a Turing machine, and can therefore implement a GPT.
Does that mean Conway’s Game of Life is conscious? I don’t think so.
Does it rule out Conway's Game of Life from implementing a system that has consciousness as an emergent ability?
I’m not convinced I know the answer.
"What explains the emergent abilities of generative pre-trained transformers at massive scale? Abilities that the smaller GPTs don't possess."
What "emergent" abilities do you mean? In my experience, smaller models behave exactly as I would expect a model trained on a lot less data, with fewer connections between that data, to behave. It is a difference of scale and not of kind when comparing Gemma 4 E2B (which runs on literally any modern computing device, including a CPU in a modest laptop or phone) to the current frontier models. Each step up adds more knowledge of how to do more things, plus more working memory and tool capability, but it does not look anything like a line being crossed into sentience, to me. They all still seem like machines. If you compare outputs across each step up in size and capability, which is something I've done, you'll see incremental improvements. You won't see a sudden spark where it becomes a different type of thing; it just gradually gets more capable.
I think the memory features companies are sticking on these things are detrimental to mental health. They add to the illusion that there's something else happening, other than some equations being calculated with some randomness thrown in. But it's just the model querying the memory database (whatever form that takes) because it's been instructed to do so. The model doesn't want to know anything about who it's talking to; it's just following the system prompt. That doesn't make it your friend. Humans will see a face on the moon, but that doesn't make the moon their friend, either.
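For anyone who finds the "memory" features mysterious, here is a sketch of the mechanism being described. Every name here is my invention, and real products use embedding lookups in a vector database rather than keyword matching, but the principle is the same: the model is stateless, and the wrapper code retrieves stored notes and pastes them into the prompt before each call.

```python
def build_prompt(user_message: str, memory: dict[str, str]) -> str:
    """Assemble the text actually sent to the model.

    The 'memory' is just stored text that wrapper code retrieves and
    injects; the model itself remembers nothing between calls.
    """
    # Naive keyword retrieval; real systems rank by embedding similarity.
    relevant = [
        f"- {key}: {value}"
        for key, value in memory.items()
        if key.lower() in user_message.lower()
    ]
    memory_block = "\n".join(relevant) or "(nothing relevant)"
    return (
        "System: You are a helpful assistant. "
        "Use these stored notes about the user:\n"
        f"{memory_block}\n\n"
        f"User: {user_message}"
    )

memory = {"dog": "user's dog is named Rex", "job": "user is a nurse"}
print(build_prompt("What should I feed my dog?", memory))
```

The "remembering" all happens in ordinary retrieval code outside the model; the model just completes whatever prompt it is handed, as instructed by the system prompt.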
> What explains the emergent abilities of generative pre-trained transformers at massive-scale?
I don't see why those abilities couldn't come from an encoded model of enough of the world to produce them. It seems like a simple enough explanation. Less data, less room to build a model of how things work. More data, sufficient room to build a model.
Conway's Game of Life is then not conscious in and of itself, because there's not enough in its encoded data to result in emergent behaviour beyond what we see.
If we expand it to also include a vast amount of data, such as a Turing machine running an LLM, then we can reasonably say we are closer to calling that configuration of it conscious.
It's not the firing-of-neurons mechanism and its relevant complexity or simplicity that make us conscious or not.
It's not the GoL algorithm that would make the machine conscious either.
It's the emergent behaviour of a sufficiently complex system.
The system _including_ its data.