Comment by soulofmischief

19 days ago

We know why they work, but not how. SotA models are an empirical goldmine: we are learning a lot about how information and intelligence organize themselves under various constraints. This is why new papers are published every single day that further explore the capabilities and inner workings of these models.

You can look at the weights and traces all you like with telemetry and tracing.

If you don’t own the model, then you have a problem that has nothing to do with technology.

  • Ok, but the art and science of understanding what we're even looking at is still being actively developed. What I said stands: we are still learning the how, including things like circuits, dependencies, grokking, etc.