Comment by rvz
2 years ago
Rather than looking at the visuals of this network, it is better to focus on the actual problem with these LLMs, which the author has already shown:
Within the transformer section:
> As is common in deep learning, it's hard to say exactly what each of these layers is doing, but we have some general ideas: the earlier layers tend to focus on learning lower-level features and patterns, while the later layers learn to recognize and understand higher-level abstractions and relationships.
That is the problem, and yet these black boxes are about as explainable as a magic scroll.
I find this problem fascinating.
For decades we’ve puzzled over how the inner workings of the brain work, and though we’ve learned a lot, we still don’t fully understand it. So, we figure, we’ll just make an artificial brain and THEN we’ll be able to figure it out.
And here we are, finally a big step closer to an artificial brain and once again, we don’t know how it works :)
(Although to be fair, we’re spending all of our efforts making the models better and better, not on learning their low-level behaviors. Thankfully, when we do decide to study them it’ll be a wee bit less invasive and actually doable, in theory.)
Is it a brain, though? As far as I understand it, it's mostly stochastic calculations based on language or image patterns whose rule sets are static and immutable. Every conceivable form of plasticity (and with that, any form of fake consciousness) is only present during training.
Add to that the fact that a model is trained actively, its weights are shaped by humans, and the possible realm of outputs is heavily moderated by an army of faceless, low-paid workers, and I don't see any semblance of a brain. I see a very high-maintenance indexing engine sold to the marketing departments of the world as a "brain".
The implementation is not important; its function is. It predicts the most probable next action based on previous observations. It is hypothesized that this is how the brain works as well. You may have an aversion to thinking of yourself as a predictive automaton, but functionally there is no need to introduce much more than that.
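The "predict the most probable next thing" framing can be sketched in a few lines. This is only an illustration, not how any real model works internally: the candidate tokens and their scores below are made up, standing in for the logits a trained network would produce for a given context.

```python
import math

def softmax(logits):
    """Turn raw scores into a probability distribution over tokens."""
    m = max(logits.values())  # subtract the max for numerical stability
    exps = {tok: math.exp(s - m) for tok, s in logits.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}

# Hypothetical scores a model might assign to candidate next tokens
# after a context like "The cat sat on the".
logits = {"mat": 3.0, "roof": 1.5, "moon": 0.1}

probs = softmax(logits)
next_token = max(probs, key=probs.get)  # greedy choice of the most probable token
print(next_token)
```

Sampling from `probs` instead of taking the argmax is what makes the output stochastic, which is the "stochastic calculations" the comment above is pointing at.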
It's still a neural network, like your brain. It lacks plasticity and can't "learn" autonomously, but it's still one step closer to creating an artificial brain.