Comment by soulofmischief
19 days ago
We know why they work, but not how. SotA models are an empirical goldmine: we are learning a lot about how information and intelligence organize themselves under various constraints. This is why new papers are published every single day that further explore the capabilities and inner workings of these models.
You can look at the weights and traces all you like with telemetry and tracing.
If you don’t own the model then you have a problem that has nothing to do with technology.
Ok, but the art and science of understanding what we're even looking at is actively being developed. What I said stands: we are still learning the how. Think of things like circuits, dependencies, grokking, etc.