Comment by davidhyde
20 days ago
My take on the difference between now and then is “effort”. All those things mentioned above are now effortless, but the door to “effort” remains open, as it always has been. Take the first point, for example. Those little black boxes of AI can be significantly demystified by, for example, watching a bunch of videos (https://karpathy.ai/zero-to-hero.html) and spending at least 40 hours of hard cognitive effort learning about them yourself. We used to purchase software or write it ourselves, before it became effortless to get it for free in exchange for ads, and then for a subscription when we grew tired of the ads or were caught by a bait and switch. You can also argue that it has never been easier to write your own software than it is today.
Hostile operating systems. Take the effort to switch to Linux.
Undocumented hardware. There is far more open source hardware out there today. Back in the day it was fun to reverse engineer hardware; now we just expect it to be open because we can’t be bothered to put in the effort anymore.
Effort gives me agency. I really like learning new things and so agentic LLMs don’t make me feel hopeless.
I’ve worked in the AI space and I understand how LLMs work in principle. But we don’t know the magic contained within a model after it’s been trained. We understand how to design a model, and how models work at a theoretical level. But we cannot know how well it will perform at inference until we test it. So much of AI research is just trial and error, with different dials repeatedly tweaked until we get something desirable. So no, we don’t understand these models in the same way we might understand how a hashing algorithm works. Or a compression routine. Or an encryption cipher. Or any other hand-programmed algorithm.
I also run Linux. But that doesn’t change how the two major platforms behave and that, as software developers, we have to support those platforms.
Open source hardware is great, but it’s not in the same league as proprietary hardware on price and performance.
Agentic AI doesn’t make me feel hopeless either. I’m just describing what I’d personally define as a “golden age of computing”.
but isn't this like a lot of other CS-related "gradient descent"?
when someone invents a new scheduling algorithm or a new concurrent data structure, it's usually based on hunches and empirical results (benchmarks) too. nobody sits down and mathematically proves their new linux scheduler is optimal before shipping it. they test it against representative workloads and see if there is uplift.
we understand transformer architectures at the same theoretical level we understand most complex systems. we know the principles, we have solid intuitions about why certain things work, but the emergent behavior of any sufficiently complex system isn't fully predictable from first principles.
that's true of operating systems, distributed databases, and most software above a certain complexity threshold.
No. Algorithm analysis is much more sophisticated and well defined than that. Most algorithms are deterministic, and it is relatively straightforward to identify their big-O complexity. Even for nondeterministic algorithms, we can evaluate asymptotic performance under different categories of input. We know a lot about how an algorithm will perform under a wide variety of input distributions, regardless of determinism. In the case of schedulers and other critical concurrency algorithms, performance is well known before release. There is a whole subfield of computer science dedicated to it. You don't have to "prove optimality" to know a lot about how an algorithm will perform. What's missing in neural networks is an understanding of why and how any given input will propagate through the network during inference. It is a black box of understandability. It is under a great deal of study, but still very poorly understood.
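As a small illustration of that point (not from the comment itself; binary search is picked only because it is familiar, and the function names are hypothetical), here is a minimal sketch of an algorithm whose worst-case cost can be stated before it is ever run. For a sorted list of n items, the loop runs at most floor(log2(n)) + 1 times.

```python
def binary_search_steps(sorted_items, target):
    """Return (index of target or -1, number of loop iterations used)."""
    lo, hi = 0, len(sorted_items) - 1
    steps = 0
    while lo <= hi:
        steps += 1
        mid = (lo + hi) // 2
        if sorted_items[mid] == target:
            return mid, steps
        if sorted_items[mid] < target:
            lo = mid + 1   # target can only be in the upper half
        else:
            hi = mid - 1   # target can only be in the lower half
    return -1, steps


def worst_case_bound(n):
    """Analytic bound on iterations: floor(log2(n)) + 1 for n >= 1, else 0."""
    # For n >= 1, int.bit_length() equals floor(log2(n)) + 1 exactly,
    # which avoids floating-point rounding.
    return n.bit_length()


if __name__ == "__main__":
    data = list(range(1000))
    index, steps = binary_search_steps(data, 999)
    print(index, steps, worst_case_bound(len(data)))
```

The bound comes from reading the code, not from benchmarking it, which is the kind of advance knowledge the comment contrasts with a trained network.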
>Those little black boxes of AI can be significantly demystified by, for example, watching a bunch of videos (https://karpathy.ai/zero-to-hero.html) and spending at least 40 hours of hard cognitive effort learning about it yourself.
That's like saying you can understand humans by watching some physics or biology videos.
No, it’s not.
Nobody has built a human, so we don’t know how they work.
We know exactly how LLM technology works.
We know _how_ it works, but even Anthropic routinely does research on its own models and gets surprised:
> We were often surprised by what we saw in the model
https://www.anthropic.com/research/tracing-thoughts-language...
We know why they work, but not how. SotA models are an empirical goldmine; we are learning a lot about how information and intelligence organize themselves under various constraints. This is why new papers are published every single day that further explore the capabilities and inner workings of these models.