Comment by roughly
20 days ago
Two thoughts:
The first is that LLMs are bar none the absolute best natural language processing and producing systems we’ve ever made. They are absolutely fantastic at taking unstructured user inputs and producing natural-looking (if slightly stilted) output. The problem is that they’re nowhere near as good at almost anything else we’ve needed a computer to do as the systems we’ve purpose-built for those tasks. We invented a linguist and mistook it for an engineer.
The second is that there’s a maxim in media studies which is almost universally applicable: the first use of a new medium is to recapitulate the old. The first TV was radio shows, the first websites looked like print (I work in synthetic biology, and we’re in the “recapitulating industrial chemistry” phase). It’s only once people become familiar with the new medium (and, really, once you have “natives” of that medium) that we become aware of what the new medium can do and start creating new things. It strikes me we’re in that recapitulating phase with LLMs - I don’t think we actually know what these things are good for, so we’re just putting them everywhere and redoing stuff we already know how to do with them, and the results are pretty lackluster. It’s obvious there’s a “there” there with LLMs (in a way there wasn’t with, say, Web 3.0, or “the metaverse,” or some of the other weird fads recently), but we don’t really know how to actually wield these tools yet, and I can’t imagine the appropriate use of them will be chatbots when we do figure it out.
Transformers still excel at translation, which is what they were originally designed to do. It's just no longer about translating only language. Now it's clear they're good at all sorts of transformations, translating ideas, styles, etc. They represent an incredibly versatile and one-shot programmable interface. Some of the most successful applications of them so far are as some form of interface between intent and action.
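To make "interface between intent and action" concrete, here is a minimal sketch of a model translating a free-form request into a structured function call. It assumes the OpenAI Python client's tool-calling API; the set_thermostat tool is made up for illustration, not a real integration.

    import json
    from openai import OpenAI

    client = OpenAI()

    # One hypothetical "action" the model is allowed to invoke.
    tools = [{
        "type": "function",
        "function": {
            "name": "set_thermostat",
            "description": "Set the room temperature.",
            "parameters": {
                "type": "object",
                "properties": {"celsius": {"type": "number"}},
                "required": ["celsius"],
            },
        },
    }]

    resp = client.chat.completions.create(
        model="gpt-4o",  # illustrative model choice
        messages=[{"role": "user", "content": "It's freezing in here, warm it up a bit."}],
        tools=tools,
    )

    # The unstructured intent comes back as a structured, executable call.
    call = resp.choices[0].message.tool_calls[0]
    print(call.function.name, json.loads(call.function.arguments))

The point isn't the thermostat; it's that the same one-shot "programming" (a prompt plus a schema) covers nearly any intent-to-action mapping.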
And we are still just barely understanding the potential of multimodal transformers. Wait till we get to metamultimodal transformers, where the modalities themselves are assembled on the fly to best meet some goal. It's already fascinating scrolling through latent space [0] in diffusion models, now imagine scrolling through "modality space", with some arbitrary concept or message as a fixed point, being able to explore different novel expressions of the same idea, and sample at different points along the path between imagery and sound and text and whatever other useful modalities we discover. Acid trip as a service.
[0] https://keras.io/examples/generative/random_walks_with_stabl...
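For a sense of what that "scrolling" looks like in practice, here's a minimal sketch of a latent-space walk in the spirit of [0]. It assumes the Hugging Face diffusers library; the checkpoint and parameters are illustrative.

    import torch
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5",  # illustrative checkpoint
        torch_dtype=torch.float16,
    ).to("cuda")

    # Two random starting points in the model's latent space.
    shape = (1, pipe.unet.config.in_channels, 64, 64)
    z0 = torch.randn(shape, device="cuda", dtype=torch.float16)
    z1 = torch.randn(shape, device="cuda", dtype=torch.float16)

    def slerp(t, a, b):
        # Spherical interpolation keeps intermediate points on the Gaussian
        # shell, which decodes better than a straight linear blend.
        a_n, b_n = a / a.norm(), b / b.norm()
        omega = torch.acos((a_n * b_n).sum().clamp(-1, 1))
        return (torch.sin((1 - t) * omega) * a + torch.sin(t * omega) * b) / torch.sin(omega)

    # Fix the prompt, walk the latents: same concept, sliding expression.
    prompt = "a lighthouse at dusk, oil painting"
    for i, t in enumerate(torch.linspace(0, 1, 8)):
        pipe(prompt, latents=slerp(t.item(), z0, z1)).images[0].save(f"walk_{i:02d}.png")

A "modality space" version of this doesn't exist yet, but this is the single-modality primitive it would generalize.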
Something that has been bugging me is that, applications-wise, the exploitative end of the "exploitation-exploration" trade-off (for lack of a better summary) has gotten way more attention than the other side.
So, beyond the complaints about accuracy, hallucinations (you said "acid trip") get dissed far more than they deserve.
Yeah, I think if I had to put money on where the "native lands" of the LLMs are, it's in a much deeper embrace of the large model itself - the emergent relationships, architectures, and semantics that the models generate. The chatbots have stolen the spotlight, but if you look at the use of LLMs for biology, and specifically the protein models, that's one area where they've been truly revolutionary, because they "get" the "language" of proteins in a way we simply did not before. That points at a more general property of the models - "language" is just relationships and meanings, so anywhere you have a large collection of existing "texts" that you can't read or don't really understand, the "analogy machines" are a potential game changer.
I haven't read Understanding Media by Marshall McLuhan, but I think he introduced your second point in that book, in 1964. He claims that the content of each new medium is a previous medium. Video games contain film, film contains theater, theater contains screenplay, screenplay contains literature, literature contains spoken stories, spoken stories contain folklore, and I suppose if one were an anthropologist, they could find more and more links in this chain.
It's probably the same in AI — the world needs AI to be chat (or photos, or movies, or search, or an autopilot, or a service provider ...) before it can grow meaningfully beyond. Once people understand neural networks, we can broadly advance to new forms of mass-application machine learning. I am hopeful that that will be the next big leap. If McLuhan is correct, that next big leap will be something that is operable like machine learning, but essentially different.
Here's Marc Andreessen applying it to AI and search on Lex Fridman's podcast: https://youtu.be/-hxeDjAxvJ8?t=160
Why are we comparing LLMs to media? I think media has much more freedom in a creative sense; its end goal is often very open-ended, especially when it's used for artistic purposes.
When it comes to AI, we're trying to replace existing technology with it. We want it to drive a car, write an email, fix a bug etc. That premise is what gives it economic value, since we have a bunch of cars/emails/bugs that need driving/writing/fixing.
Sure, it's interesting to think about other things it could potentially achieve when we think out of the box and find use cases that fit it more, but the "old things" we need to do won't magically go away. So I think we should be careful about such overgeneralizations, especially when they're covertly used to hype the technology and maintain investments.
Media in this case is the plural of medium — something that both contains information and describes its interface.
I think the idea is a bit different than what you describe. New media contains in itself the essence of old media, but it does not necessarily supersede it. For example, we have theater and film.
This “rule” of media doesn’t help us predict how or whether AI will evolve, so it is difficult to relate it to hyping. It is an exclusionary heuristic for future predictions — it helps us rule out unlikely ones, but it doesn’t help us come up with any.
I personally am hopeful that AI will evolve into something else that has more essence to it than mere function. But that’s just hope, which is rather less promising than hype.
Oral cultures had theater.
It was a mistake to call LLMs "AI". Now people expect them to be generic.
OpenAI has been pushing the idea that these things are generic—and therefore the path to AGI—from the beginning. Their entire sales pitch to investors is that they have the lead on the tech that is most likely to replace all jobs.
If the whole thing turns out to be a really nifty commodity component in other people's pipelines, the investors won't get a return on any kind of reasonable timetable. So OpenAI keeps pushing the AGI line even as it falls apart.
I mean we don’t know that they’re wrong? Not “all jobs” but many of the white collar jobs we have today?
I work in medical insurance billing and there are hundreds of thousands (minimum) of jobs that could be made obsolete on the payer and clinic side by LLMs. The translation from a PDF of a payer’s rates and billing rules to a standardized 837 or an API request to a clearinghouse is…not much (a rough sketch follows below). And then on the other side, Claude Code could build you an adjudication engine in a few quarters.
The incentive structures to change healthcare in that way will fight back for a decade, but there are a _lot_ of jobs at stake.
Then you think about sales. LLMs can negotiate contracts themselves. Give one an input of the margin we can accept and, for each vendor, what they can accept, and you’ll burn down the negotiation without any humans involved.
It’s not all jobs, but it’s millions.
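To ground the rate-sheet example above, here's a hedged sketch of just the extraction step, assuming the OpenAI Python client; the model name, the JSON schema, and the downstream 837 mapping are all illustrative, not a real clearinghouse integration.

    import json
    from openai import OpenAI

    client = OpenAI()

    def extract_billing_rules(rate_sheet_text: str) -> dict:
        """Ask an LLM to turn free-form payer rate-sheet text into JSON."""
        prompt = (
            "Extract every CPT code, allowed amount, and billing rule from "
            "the following payer rate sheet. Respond with JSON of the form "
            '{"rules": [{"cpt": str, "allowed_amount": float, "notes": str}]}'
            "\n\n" + rate_sheet_text
        )
        resp = client.chat.completions.create(
            model="gpt-4o",  # illustrative model choice
            messages=[{"role": "user", "content": prompt}],
            response_format={"type": "json_object"},
        )
        return json.loads(resp.choices[0].message.content)

The LLM only does the unstructured-to-structured translation; generating and submitting the actual 837 stays deterministic code, with a human reviewing edge cases.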
OpenAI models and other multi-modal models are about as generalized as we can get at this point in time.
OpenAI’s sales pitch isn’t that it can replace all jobs but that it can make people more productive, and it sure can, as long as you’re not at one of the two extremes: either going into completely brain-dead autopilot mode or full-on Butlerian.
First of all, "AI" is and always has been a vague term with a shifting definition. "AI" used to mean state search programs or rule-based reasoning systems written in LISP. When deep learning hit, lots of people stopped considering symbolic (i.e., non neural-net) AI to be AI. Now LLMs threaten to do the same to older neural-net methods. A pedantic conversation about what is and isn't true AI is not productive.
Second of all, LLMs have extremely impressive generic uses considering that their training just consists of consuming large amounts of unsorted text. Any counter argument about "it's not real intelligence" or "it's just a next-token predictor" ignores the fact that LLMs have enabled us to do things with machines that would have seemed impossible just a few years ago. No, they are not perfect, and yes there are lots of rough edges, but the fact that simply "solving text" has gotten us this far is huge and echoes some aspects of the Unix philosophy...
"Write programs to handle text streams, because that is a universal interface."
> A pedantic conversation about what is and isn't true AI is not productive.
It's not at all 'pedantic' and while it's not productive to be having to rail against this stupid term, that is not the fault of the people pushing back at it. It's the fault of the hype merchants who have promoted it.
A key part of thinking independently is to be continually questioning the use of language.
> Any counter argument about "it's not real intelligence" or "it's just a next-token predictor" ignores the fact that LLMs have enabled us to do things with machines that would have seemed impossible just a few years ago.
No, it's entirely possible to appreciate that LLMs are a very powerful and useful technology while also pointing out that they are not 'intelligence' in any meaningful sense of the word and that labeling them 'artificial intelligence' is unhelpful to users and, ultimately, to the industry.
> "AI" used to mean state search programs or rule-based reasoning systems written in LISP. When deep learning hit, lots of people stopped considering symbolic (i.e., non neural-net) AI to be AI. Now LLMs threaten to do the same to older neural-net methods. A pedantic conversation about what is and isn't true AI is not productive.
I think you are misstating the problem here.
All of the things you name are still AI.
None of the things you name are, or have ever been, AI.
The problem is that there is AI, the computer science subfield of artificial intelligence, which includes things like expert systems, NPCs in games, and LLMs, and then there is AI, the "true" artificial intelligence, brought to us exclusively by science fiction, which includes things (or people!) like Commander Data, Skynet, Durandal, and HAL 9000.
The general public doesn't understand this distinction in a deep way—even those who recognize that things like Skynet are fiction get confused when they see an LLM apparently able to carry on a coherent conversation with a human—and too many of us, who came into this with a basic understanding of the distinction and who should know better, have bought the hype (and in some cases outright lies) of companies like OpenAI wholesale.
These facts (among others) have combined to allow the various AI grifters to continue operating without being called out on their bullshit.
They're pretty AI to me. I've been using ChatGPT to explain things to me while learning a foreign language, and a native speaker has been overseeing the comments from it. It hasn't said anything that the native has disagreed with yet.
I reckon you’re proving their point. You’re using a large language model for language-specific tasks. It ought to be good at that, but it doesn’t mean it is generic artificial intelligence.
Like the OP said "LLMs are bar none the absolute best natural language processing and producing systems we’ve ever made".
They may not be good at much else.
Yes, but your use case is language. I use LLMs for all kind of stuff from programming, creative work, etc. so I know it's useful even elsewhere. But as the generic term "AI" is being used, people expect it to be good at everything a human can be good at and then whine about how stupid the "AI" is.
I tried the same with another foreign language. Every native speaker has told me the answers are crap.
I wonder.
People primarily communicate through words, so maybe not.
Of course, pictures, body language, and tone are other communication channels too.
So far it looks like these models can convert pictures into words reasonably well, and the reverse is improving quickly.
Tone might be next - there are already models that can detect stress, so that's a good start.
Body language is probably a bit farther in the future, but it might be as simple as image analysis (that's only a wild guess; I have no idea).
Most grounded and realistic take on the AI hype I've read recently.
> It’s obvious there’s a “there” there with LLMs (in a way there wasn’t with, say, Web 3.0, or “the metaverse,” or some of the other weird fads recently)
There is a "there" with those other fads too. VRChat is a successful "metaverse" and Mastodon is a successful decentralized "web3" social media network. The reason these concepts are failures is that these small grains of success are suddenly expanded in scope to include a bunch of dumb ideas while the expectations are raised to astronomical levels.
That in turn causes investors to throw stupid amounts of money at these concepts, which attracts all the grifters of the tech world. It smothers nascent new tech in the crib, as it is suddenly assigned a valuation it can never realize while the grifters soak up all the investments that could've gone to competent startups.
>Mastodon is a successful decentralized "web3" social media network.
No, that's not what "web3" means. Web3 is all about the blockchain (or you can call it "distributed ledger technology" if you want to distance it from cryptocurrency scams).
There's nothing blockchain-y about Mastodon or the ActivityPub protocol.
web3 means different things to different people, much like how AI-powered means different things to different people.
All that matters is that the moneyed class is able to choose a slightly related definition of it that generates speculative value.
The fundamental idea behind web3 was decentralization. Of course the blockchain was seen as the primary method to drive that decentralization, mainly so people could drive up the valuation of their cryptocurrencies.
Cryptocurrencies themselves could also serve as an example of a grain of success for web3.
> We invented a linguist and mistook it for an engineer.
That's not entirely true, either. Because LLMs _can_ write code, sometimes even quite well. The problem isn't that they can't code, the problem is that they aren't reliable.
Something that can code well 80% of the time is as useful as something that can't code at all, because you'd need to review everything it writes to catch the other 20%. And any programmer will know that reviewing code is just as hard as writing it in the first place. (Well, that's unless you just blindly trust whatever it writes. I think kids these days call that "vibe coding"....)
If that were the case, I wouldn't be using Cursor to write my code. It's definitely faster to write with Cursor, because it basically always knows what I was going to write myself anyway, so it saves me a ton of time.
>We invented a linguist and mistook it for an engineer.
People are missing the point. LLMs aren’t just fancy word parrots. They actually grasp something about how the world works. Sure, they’re still kind of stupid. Imagine a barely functional intern who somehow knows everything but can’t be trusted to file a document without accidentally launching a rocket.
Where I really disagree with the crowd is the whole “they have zero intelligence” take. Come on. These things are obviously smarter than some humans. I’m not saying they’re Einstein, but they could absolutely wipe the floor with someone who has Down syndrome in nearly every cognitive task. Memory, logic, problem-solving — you name it. And we don’t call people with developmental disorders letdowns, so why are we slapping that label on something that’s objectively outperforming them?
The issue is they got famous too quickly. Everyone wanted them to be Jarvis, but they’re more like a very weird guy on Reddit with a genius streak and a head injury. That doesn’t mean they’re useless. It just means we’re early. They’ve already cleared the low bar of human intelligence in more ways than people want to admit.
Thanks for a thoughtful post.
The fantastically intoxicating valuations of many current stocks are due to breathing the fumes of LLMs as artificial intelligence.
TFA puts it this way:
Now to consider your two points...
> The first ... natural language querying.
Natural-language inputs are structured: they are language. But in any case, we must not minimise the significant effort to collect [0] and label trustworthy data for training. Given untrustworthy, absurd, and/or outright ignorant and wrong training data, an LLM would spew nonsense. If we train an LLM on tribalistic fictions, Reddit codswallop, or politicians' partisan ravings, what do you think the result of any rational natural-language query would be? (Rhetorical question.)
In short, building and labelling the corpus of knowledge is the essential technical advancement. We already have been doing natural-language processing with computers for a long time.
> The second ... the new medium recapitulates the old.
LLMs are a new application. There are some effective uses of the new application. But there are many unsuitable applications, particularly where correctness is critical. (TFA mentions this.) There are illegal uses too.
TFA itself says,
I agree that finding the profit models beyond stock hyperbole is the current endeavour. Some attempts are already proven: better Web search (with a trusted corpus), image scoring/categorisation, suggesting/drafting approximate solutions to coding or writing tasks.
How to monetise these and future implementations will determine whether LLMs devour anything serviceable the way Radio ate Theatre, the way TV ate Theatre, Radio and Print Journalism, the way the Internet ate TV, Radio, the Music Industry, and Print Journalism, and the way Social Media ate social discourse.
<edit: Note that the above devourings were mostly related to funding via advertising.>
If LLMs devour and replace the Village Idiot, we will have optimised and scaled the worst of humanity.
= = =
[0] _ major legal concerns remain unresolved
[1] _ https://en.wikipedia.org/wiki/Network_(1976_film) , https://www.npr.org/2020/09/29/917747123/you-literally-cant-...
I actually believe the practical use of transformers, diffusers, etc. is already as impactful as the wide adoption of the internet, or smartphones, or cars. It's already used by hundreds of millions and has become an irreplaceable tool for enhancing work output. And it's only just started. Five years from now, it will dominate every single part of our lives.