"But the fundamental problem is that LLMs don’t get better over time the way a human would. The lack of continual learning is a huge huge problem. The LLM baseline at many tasks might be higher than an average human's. But there’s no way to give a model high level feedback. You’re stuck with the abilities you get out of the box."
That does seem to be a problem with neural nets.
There are AIish systems that don't have this problem. Waymo's Driver, for example. Waymo has a procedure where, every time their system has a disconnect or near-miss, they run simulations with lots of variants on the troublesome situation. Those are fed back into the Driver.
Somehow. They don't say how. But it's not an end to end neural net. Waymo tried that, as a sideline project, and it was worse than the existing system. Waymo has something else, but few know what it is.
Yes, LLMs don't improve as humans do, but they could use other tools, for example designing programs in Prolog, to expand their capabilities. I think the next step in AI will be LLMs being able to use better tools or strategies: for example, architectures in which logic rules, heuristic algorithms, and small fine-tuned LLM agents are integrated as tools for LLMs. I think new, more powerful architectures for helping LLMs are going to mature in the near future. Furthermore, there is the economic force pushing to develop AI applications for warfare.
Edited: I should add that a Prolog system could help the LLM continue learning by adding facts to its database and inferring new relations, for example by using heuristics to suggest new models or directions for exploration.
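Roughly the shape of what I mean, sketched in plain Python rather than actual Prolog (the relations and the single rule are invented purely for illustration):

```python
# Minimal illustration: a growing fact base plus one hand-written rule that
# derives new relations. "Continual learning" here is just appending facts.
facts = {("parent", "alice", "bob"), ("parent", "bob", "carol")}

def add_fact(fact):
    facts.add(fact)  # new knowledge arrives at runtime

def infer_grandparents():
    """Rule: parent(X,Y) and parent(Y,Z) -> grandparent(X,Z)."""
    derived = set()
    for rel1, x, y1 in facts:
        for rel2, y2, z in facts:
            if rel1 == rel2 == "parent" and y1 == y2:
                derived.add(("grandparent", x, z))
    return derived

add_fact(("parent", "carol", "dave"))
print(infer_grandparents())
# {('grandparent', 'alice', 'carol'), ('grandparent', 'bob', 'dave')}
```

A real Prolog engine would do the inference for you; the point is only that the LLM could keep asserting facts into such a store and query it later, instead of relying on frozen weights.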
Correct me if I'm wrong, but from what I've heard, Waymo employs heuristics, rules, neural networks, and various other techniques that are combined and organized by humans into a system.
It's not an end-to-end neural network.
"AIish" is a good description. It is, by design, not AGI.
Waymo's system generates a map of the environment, with obstacles, other road users, and predictions of what other road users will do. That map can be evaluated, both by humans and by later data about what actually happened.
Passengers get to watch a simplified version of that map. Early papers from the Google self-driving project showed more detailed maps.
The driving system runs off that map.
So there's an internal model. Much criticism of AI has been about the lack of an internal model. This system has an internal model. It's specialized, but well matched to its task.
We see this in other robotics efforts, where there's a model and a plan before there's action. Other kinds of AI, especially "agentic" systems, may need that kind of explicit internal model. In a previous posting, about an AI system which was supposed to plan stocking for a vending machine, I suggested that there should be a spreadsheet maintained by the system, so it didn't make obvious business mistakes.
Why do so few ask: isn't it enough that current SOTA AI already makes us humans so much better every day?
Exponential self-improvement seems like a very scary thing to me, and even if all went right, humans would have to give up their pole position in the intelligence race.
That will be very hard to swallow for many.
If we really want self-improvement, we should get used to being useless :)
I _hope_ AGI is not right around the corner; for sociopolitical reasons we are absolutely not ready for it, and it might push the future of humanity into a dystopian abyss.
but also, just taking what we have now, with some major power-usage reduction and minor improvements here and there, already seems like something which can be very usable/useful in a lot of areas (and to some degree we aren't even really ready for that either, but I guess that's normal with major technological change)
it's just that for the companies creating foundational models it's quite unclear how they can recoup their already-spent costs without either a major breakthrough or forcefully (or deceptively) pushing it into a lot more places than it fits into
When it comes to recouping cost, a lot of people don't consider the insane amount of depreciation expense brought on by the up to $1 trillion (depending on which estimate) that has been invested in AI buildouts. That depreciation expense could easily be more than the combined revenue of all AI companies.
If birth rates are as much a cause for concern as many people seem to think, and we absolutely need to solve it (instead of solving, for instance, the fact that the economy purportedly requires exponential population growth forever), perhaps we should hope that AGI comes soon.
I think it's rather easy for them to recoup those costs: if you can disrupt some industry with a full AI company with almost no employees and outcompete everyone else, that's free money for you.
I think they are trying to do something like this(1) by providing, long term, a "business suite", i.e. something comparable to G Suite or Microsoft 365.
For a lot of the things which work well with current AI technology, it's super convenient to have access to all your customers' private data (even if you don't train on it; e.g. RAG systems for information retrieval are one of the things which already work quite well with the current state of LLMs). This also lets you compensate for hallucinations and the LLM's lack of understanding by providing (working) links to, or snippets of, the sources the information came from, and by having all relevant information in the LLM's context window instead of relying on its "learned" training data you generally get better results. I mean, RAG systems already worked well without LLMs in some information retrieval products.
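A rough sketch of the shape I mean (the word-overlap scoring is a deliberately crude stand-in for a real embedding retriever, and llm is a placeholder callable, not any particular vendor's API):

```python
# Toy RAG loop: rank internal documents against the question, stuff the best
# ones into the prompt, answer only from them.
def score(question, doc):
    """Crude relevance: shared-word count (a real system would use embeddings)."""
    return len(set(question.lower().split()) & set(doc.lower().split()))

def retrieve(question, documents, k=3):
    return sorted(documents, key=lambda d: score(question, d), reverse=True)[:k]

def answer(question, documents, llm):
    context = "\n\n".join(retrieve(question, documents))
    # The retrieved snippets double as citable sources, which is what lets you
    # compensate for hallucination and keep answers grounded in company data.
    return llm("Answer using only these excerpts:\n" + context +
               "\n\nQuestion: " + question)
```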
And the thing is, if your users have to manually upload all potentially relevant business documents, you can't really make it work well. But what if they already upload all of them to your company because they use your company's file-sharing/drive solution?
And let's not even consider the benefits you could get from a cheaper plan where you are allowed to train on the company's data after anonymizing it (like for micro companies; too many people think "they have nothing to hide," and it's anonymized, so it's okay, right? (no)). Or from going rogue and just stealing trade secrets to then break into other markets; it's not like some bigger SF companies have been found to do exactly that (I think it was Amazon/Amazon Basics).
(1:) Though in that case you still have employees, until your AI becomes good enough to write all your code instead of "just" being a tool for developers to work faster ;)
For me, "AGI" would come in with being able to reliably perform simple open-ended tasks successfully without needing any specialized aid or tooling. Not necessarily very well, just being capable of it in the first place.
For a specific example of what I mean, there's Vending-Bench - even very 'dumb' humans could reliably succeed on that test indefinitely, at least until they got terminally bored of it. Current LLMs, by contrast, are just fundamentally incapable of that, despite seeming very 'smart' if all you pay attention to is their eloquence.
At the very least, it needs to be able to collate training data, design, code, train, fine tune and "RLHF" a foundational model from scratch, on its own, and have it show improvements over the current SOTA models before we can even begin to have the conversation about whether we're approaching what could be AGI at some point in the future.
as in, it can learn by itself to solve any kind of generic task it can practically interface with (at least ones which aren't way too complicated).
To some degree LLMs can theoretically do this, but
- learning (i.e. training them) is way too slow and costly
- domain adaptation (later learning) often has a ton of unintended side effects (like forgetting a bunch of important previously learned things)
- it can't really learn by itself in an interactive manner
- "learning" by e.g. retrieving data from a knowledge database and including it in answers (e.g. RAG) isn't really learning but just information retrieval, and it also has issues with context windows and planning
I could imagine OpenAI putting together multiple LLMs + RAG + planning systems etc. to create something which technically could be named AGI, but which isn't really the breakthrough people associate with AGI, in the not-too-distant future.
That would be human; I've always understood the General to mean 'as if it's any human', i.e. perhaps not absolute mastery, but trained expertise in any domain.
Anyone who claims that a poorly defined concept, AGI, is right around the corner is most likely:
- trying to sell something
- high on their own stories
- high on exogenous compounds
- all of the above
LLMs are good at language. They are OK summarizers of text by design but not good at logic. Very poor at spatial reasoning and as a result poor at connecting concepts together.
Just ask any of the crown-jewel LLM models "What's the biggest unsolved problem in the [insert any] field".
The usual result is a pop-science-level article, but with a ton of subtle yet critical mistakes! Even worse, the answer sounds profound on the surface. In reality, it's just crap.
They’re great at working with the lens on our reality that is our text output. They are not truth seekers, which is necessarily fundamental to every life form from worms to whales. If we get things wrong, we die. If they get them wrong, they earn 1000 generated tokens.
Why do you say that LLMs are not truth seekers? If I express an informational query not very well, the LLM will infer what I mean by it and address the possible well-posed information queries that I may have intended that I did not express well.
Can that not be considered truth-seeking, with the agent-environment boundary being the prompt box?
Yes, but you're missing their ability to interpolate across that dataset at retrieval time, which is what makes them extremely useful. Also, people are willing to invest a lot of money to keep building those datasets, until nearly everything of economic value is in there.
Exactly. I am so tired of hearing about AI… and they are not even AI! I am also losing faith in this field when I see how much hype and how many lies like this they all push instead of being transparent. They are not AGIs, not even AIs… For now they are only models, and your definition is a good one.
LLMs are useful in that respect. As are media diffusion models. They've compressed the physics of light, the rules of composition, the structure of prose, the knowledge of the internet, etc. and made it infinitely remixable and accessible to laypersons.
AGI, on the other hand, should really stand for Aspirationally Grifting Investors.
Superintelligence is not around the corner. OpenAI knows this and is trying to become a hyperscaler / Mag7 company with the foothold they've established and the capital that they've raised. Despite that, they need a tremendous amount of additional capital to will themselves into becoming the next new Google. The best way to do that is to sell the idea of superintelligence.
AGI is a grift. We don't even have a definition for it.
LLMs require the sum total of human knowledge to ape what you can find on google, meanwhile Ramanujan achieved brilliant discoveries in mathematics using nothing but a grade school education and a few math books.
yeah, I've been thinking about them as stochastic content-addressable memory. You can put as many next = userInput; while(true) { next = mem[next]; } loops around them as you need, in different forms (single shot, agents, etc.) and get wildly cool results out, but it's gated by some of the limitations there.
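Spelled out, that loop is tiny (llm here is just a stand-in for any completion call, and the step cutoff replaces while(true)):

```python
# Treat the model as a stochastic content-addressable memory and keep feeding
# its output back in as the next "address".
def run(llm, user_input, steps=5):
    trace = []
    nxt = user_input
    for _ in range(steps):
        nxt = llm(nxt)      # next = mem[next]
        trace.append(nxt)
    return trace

# steps=1 is single shot; an agent is the same loop with tool output folded
# into the next lookup key.
```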
Thousands are being laid off, supposedly because they're "being replaced with AI," implying the AI is as good or better as humans at these jobs. Managers and execs are workers, too--so if the AI really is so good, surely they should recuse themselves and go live a peaceful life with the wealth they've accrued.
I don't know about you, but I can't imagine that ever happening. To me, that alone is a tip off that this tech, while amazing, can't live up to the hype in the long term.
Some employees can be replaced by AI. That part is true. It's not revolutionary (at least not yet); it's pretty much the same as other post-industrial technologies that have automated some types of work in the past. It also takes time for industries to adapt to these changes. Replacing workers couldn't possibly happen in one year, even if our AI models were far more capable than they are in practice.
I'm afraid that what we're seeing instead are layoffs that are purely oriented at the stock market. As long as layoffs and talk about AI are seen as a positive signal for investors and as long as corporate leadership is judged by the direction the stock price goes, we will see layoffs (as well as separate hiring sprees for "AI Engineers").
It's a telltale sign that we're seeing a large number of layoffs in the tech sector. It is true that tech companies are poised to adopt AI more quickly than others, but that doesn't seem to be what's happening. What seems to be happening is that tech companies overhired throughout the decade leading up to the end of COVID-19. At that time hiring was a positive signal; now firing is.
I don't think these massive layoffs are good for tech companies in the long term, but since they mostly affect things that don't touch direct revenue-generating operations, they won't hurt in the near term, and by the time a company starts feeling the pain, the cause will be too long in the past to be remembered.
I don't think anyone is being laid off because of AI. People are being laid off because the market is bad for a myriad of reasons, and companies are blaming AI because it helps them deflect worry that might lower their stock price.
Companies say "we've laid people off because we're using AI,x but they mean "we had to lay people off, were hoping we can make up for them with AI."
> Managers and execs are workers, too--so if the AI really is so good, surely they should recuse themselves and go live a peaceful life
One thing that doesn't get mentioned is AI's capacity for being held accountable. AI is fundamentally unaccountable. Like the genie from the lamp, it will grant you the 3 wishes, but you bear the consequences.
So what can we do when the tasks are critically important, like deciding on an investment or spending much time and resources on a pursuit? We still need the managers. We need humans for all tasks of consequence where risks are taken. Not because humans are smarter, but because we have skin.
Even on the other side, that of goals, desires, choosing problems to be solved - AI has nothing to say. It has no desires of its own. It needs humans to expose the problem space inside which AI could generate value. It generates no value of its own.
This second observation means AI value will not concentrate in the hands of a few, but instead will be widespread. It's no different than Linux, yes, it has a high initial development cost, but then it generates value in the application layer which is as distributed as it gets. Each human using Linux exposes their own problems to the software to get help, and value is distributed across all problem contexts.
I have come to think that generating the opportunity for AI to provide value, and then incurring the outcomes, good or bad, of that work, are fundamentally human and distributed across society.
> Thousands are being laid off, supposedly because they're "being replaced with AI," implying the AI is as good or better as humans at these jobs.
I don't think the "implying the AI is as good or better as humans" part is correct. While they may not be saying it loudly, I think most folks making these decisions around AI and staffing are quite clear that AI is not as good as human workers.
They do, however, think that in many cases it is "good enough". Just look at like 90%+ of the physical goods we buy these days. Most of them are almost designed to fall apart after a few years. I think it's almost exactly analogous to the situation with the Luddites (which is often falsely remembered as the Luddites being "anti-technology", when in reality they were just "pro-not-starving-to-death"). In that case, new mechanized looms greatly threatened the livelihood of skilled weavers. The quality of the fabric from these looms tended to be much worse than those of the skilled weavers. But it was still "good enough" for most people such that most consumers preferred the worse but much cheaper cloth.
It's the same thing with AI. It's not that execs think it's "as good as humans", it's that if AI costs X to do something, and the human costs 50X (which is a fair differential I think), execs think people will be willing to put up with a lot shittier quality if they can be delivered something much more cheaply.
One final note - in some cases people clearly do prefer the quality of AI. There was an article on HN recently discussing that folks preferred Waymo taxis, even though they're more expensive.
Every few weeks I give LLMs a chance to code something for me.
Friday I laid out a problem very cleanly: take this data structure and transform it into this other data structure in Terraform, with examples of the data in both formats.
After the seventh round of back and forth, where it kept giving me code that would not compile or code that produced a totally different data structure, despite more examples and clarifications all the while, I gave up. I gave the problem to a junior and they came back with the answer in about an hour.
Next time an AI bro tells you that AI can 'replace your juniors' tell him to go to hell.
My company is desperately trying to incorporate AI (to tell investors they are). The fact that LLMs get things wrong is a huge problem, since most work can't be wrong, and if a human needs to carefully go through the output to check it, it's often just as much work as having that same human create the output themselves.
But language is one place LLMs shine. We often need to translate technical docs into layman language, and LLMs work great. They quickly find words and phrases to describe complex topics. Then a human can do a final round of revisions.
But anything de novo? Or requiring logic? It works about as well as a high school student with no background knowledge.
Fundamentally, they are really powerful text transformers with some additional capability. The further from that sweet spot, and the closer to anthropomorphization, the more unreliable the output.
I would call them "generally applicable". "Intelligence" definitely implies learning, and I'm not sure RAG, fine-tuning, or six-monthly updates count, to split hairs.
Where I will say we have a massive gap, which makes the average person not consider it AGI, is in context. I can give a person my very modest codebase and ask for a change, and they'll deliver - mostly coherently - to that style, files in the right place, etc. Still, today with AI, I get inconsistent design, files in random spots, etc.
> I find LLMs to be generally intelligent. So I feel like "we are already there" -- by some definition of AGI. At least how I think of it.
I don't disagree - they are useful in many cases and exhibit human like (or better) performance in many tasks.
However they cannot simply be a "drop in white collar worker" yet, they are too jagged and unreliable, don't have a real memory etc. Their economic impact is still very much limited.
I think this is what many people mean when they say AGI - something with a cognitive performance so good it equals or beats humans in the real world, at their jobs - not at some benchmark.
One could ask: does it matter? Why can't we say the current tools are great task solvers and call it AGI even if they are bad agents? It's a lengthy discussion to have, but I think that ultimately yes, agentic reliability really matters.
I agree with the last part but I think that criticism applies to many humans too so I don't find it compelling at all.
I also think that by the original definition (better than the median human at almost all tasks) it's close, and I think in the next 5 years it will be competitive with professionals at all tasks which are nonphysical (physical could be 5-10 years, idk). I could be high on my own stories, but not the rest.
LLMs are good at language yes but I think to be good at language requires some level of intelligence. I find this notion that they are bad at spatial reasoning extremely flawed. They are much better than all previous models, some of which are designed for spatial reasoning. Are they worse than humans? Yes but just the fact that you can put newer models on robots and they just work means that they are quite good by AI standards and rapidly improving.
An AI (a computer program) that is better at [almost] any task than 5% of the human specialists in that field has achieved AGI.
Or, stated another way, if 5% of humans are incapable of performing any intellectual job better than an AI can, then that AI has achieved AGI.
Note, I am not saying that an AI that is better than humans at one particular thing has achieved AGI, because it is not "general". I'm saying that if a single AI is better at all intellectual tasks than some humans, the AI has achieved AGI.
The 5th percentile of humans deserves the label of "intelligent", even if they are not the most intelligent, (I'd say all humans deserve the label "intelligent") and if an AI is able to perform all intellectual tasks better than such a person, the AI has achieved AGI.
Take the Artificial out of AGI. What is GI, and do the majority of humans have it? If so, then why is your definition of AGI far stricter than the definition of Human GI?
However, it's not sufficient. The actual tasks have to be written down, tests constructed, and the specialists tested.
A subset of this has been done with some rigor and AI/computers have surpassed this threshold for some tests. Some have then responded by saying that it isn't AGI, and that the tasks aren't sufficiently measuring of "intelligence" or some other word, and that more tests are warranted.
I think any task-based assessment of intelligence is missing the mark. Highly intelligent people are not considered smart just because they can accomplish tasks.
Definitions around AI have been changing since the beginning, making it always farther in the future. In this system it can always be "right around the corner" but never arrive.
You’ve got people foaming at the mouth anytime someone mentions AGI, like it’s some kind of cult prophecy. “Oh it’s poorly defined, it’s not around the corner, everyone talking about it is selling snake oil.” Give me a break. You don’t need a perfect definition to recognize that something big is happening. You just need eyes, ears, and a functioning brain stem.
Who cares if AGI isn’t five minutes away. That’s not the point. The point is we’ve built the closest thing to a machine that actually gets what we’re saying. That alone is insane. You type in a paragraph about your childhood trauma and it gives you back something more coherent than your therapist. You ask it to summarize a court ruling and it doesn’t need to check Wikipedia first. It remembers context. It adjusts to tone. It knows when you’re being sarcastic. You think that’s just “autocomplete”? That’s not autocomplete, that’s comprehension.
And the logic complaints, yeah, it screws up sometimes. So do you. So does your GPS, your doctor, your brain when you’re tired. You want flawless logic? Go build a calculator and stay out of adult conversations. This thing is learning from trillions of words and still does better than half the blowhards on HN. It doesn’t need to be perfect. It needs to be useful, and it already is.
And don’t give me that “it sounds profound but it’s really just crap” line. That’s 90 percent of academia. That’s every self-help book, every political speech, every guy with a podcast and a ring light. If sounding smarter than you while being wrong disqualifies a thing, then we better shut down half the planet.
Look, you’re not mad because it’s dumb. You’re mad because it’s not that dumb. It’s close. Close enough to feel threatening. Close enough to replace people who’ve been coasting on sounding smart instead of actually being smart. That’s what this is really about. Ego. Fear. Control.
So yeah, maybe it’s not AGI yet. But it’s smarter than the guy next to you at work. And he’s got a pension.
Something big is definitely happening but it's not the intelligence explosion utopia that the AI companies are promising.
> Who cares if AGI isn’t five minutes away. That’s not the point. The point is we’ve built the closest thing to a machine that actually gets what we’re saying. That alone is insane. You type in a paragraph about your childhood trauma and it gives you back something more coherent than your therapist. You ask it to summarize a court ruling and it doesn’t need to check Wikipedia first. It remembers context. It adjusts to tone. It knows when you’re being sarcastic. You think that’s just “autocomplete”? That’s not autocomplete, that’s comprehension
My experience with LLMs has been all over the place. They're insanely good at comprehending language. As a side effect, they're also decent at comprehending complicated concepts like math or programming, since most of human knowledge is embedded in language. This does not mean they have a thorough understanding of those concepts. It is very easy to trip them up. They also fail in ways that are not obvious to people who aren't experts on whatever is the subject of their output.
> And the logic complaints, yeah, it screws up sometimes. So do you. So does your GPS, your doctor, your brain when you’re tired. You want flawless logic? Go build a calculator and stay out of adult conversations. This thing is learning from trillions of words and still does better than half the blowhards on HN. It doesn’t need to be perfect. It needs to be useful, and it already is.
I feel like this is handwaving away the shortcomings a bit too much. It does not screw up in the same way humans do. Not even close. Besides, I think computers should rightfully be held up to a higher standard. We already have programs that can automate tasks that human brains would find challenging and tedious to do. Surely the next frontier is something with the speed and accuracy of a computer while also having the adaptability of human reasoning.
I don't feel threatened by LLMs. I definitely feel threatened by some of the absurd amount of money being put into them though. I think most of us here will be feeling some pain if a correction happens.
However, you've shifted the goal post from AGI to being useful in specific scenarios. I have no problem with that statement. It can write decent unit tests and even find hard-to-spot, trivial mistakes in code. But again, why can it do that? Because a version of that same mistake is in the enormous data set. It's a fantastic search engine!
Not only do I not think it is right around the corner, I'm not even convinced it is possible, or at the very least I don't think it is possible using conventional computer hardware. I don't think being able to regurgitate information in an understandable form is an adequate or useful measurement of intelligence. If we ever crack artificial intelligence, it's highly possible that in its first form it is of very low intelligence by human standards, but is truly capable of learning on its own without extra help.
I think the only way that it’s actually impossible is if we believe that there’s something magical and fundamentally immeasurable about humans that leads to our general intelligence. Otherwise we’re just machines, after all. A human brain is theoretically reproducible outside standard biological mechanisms, if you have a good enough nanolathe.
Maybe our first AGI is just a Petri dish brain with a half-decent python API. Maybe it’s more sand-based, though.
-- A human brain is theoretically reproducible outside standard biological mechanisms, if you have a good enough nanolathe.
Sort of. The main issue is the energy requirements. We could theoretically reproduce a human brain in software today; it's just that it would be a really big energy hog, run very slowly, and probably become insane quickly, like any person trapped in a sensory-deprivation tank.
The real key development for AI and AGI is down at the metal level of computers: the memristor.
The synapse in a brain is essentially a memristive element, and it's a very taxing one on the neuron. The equation is (change in charge)/(change in flux). Yes, a flux capacitor, sorta. It's the missing piece in fundamental electronics.
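For reference, Chua's textbook relations tie flux and charge together; the ratio quoted above is the memductance form, the reciprocal of memristance:

```latex
% Memristance M and memductance W are reciprocal descriptions of the same element.
M(q) = \frac{d\varphi}{dq}, \qquad v(t) = M\big(q(t)\big)\, i(t)
W(\varphi) = \frac{dq}{d\varphi}, \qquad i(t) = W\big(\varphi(t)\big)\, v(t)
```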
Making simple 2 element memristors is somewhat possible these days, though I've not really been in the space recently. Please, if anyone knows where to buy them, a real one not a claimed to be one, let me know. I'm willing to pay good money.
In terms of AI, a memristor would require a total redesign of how we architect computers (goodbye buses and physically separate memory, for one). But you'd get a huge energy and time savings benefit. As in, you could run an LLM on a watch battery or small solar cell and let the environment train it to a degree.
Hopefully AI will accelerate their discovery and facilitate their introduction into cheap processing and construction of chips.
If by "something magical" you mean something we don't understand, that's trivially true. People like to give firm opinions or make completely unsupported statements they feel should be taken seriously ("how do we know humans intelligence doesn't work the same way as next token prediction") about something nobody understand.
A brain in a jar, with wires so that we can communicate with it, already exists. It's called the internet. My brain is communicating with you now through wires. Replacing my keyboard with implanted electrodes may speed up the connection, but it won't fundamentally change the structure or capabilities of the machine.
Our silicon machines exist in a countable state space (you can easily assign a unique natural number to any state for a given machine). However, 'standard biological mechanisms' exist in an uncountable state space - you need real numbers to properly describe them. Cantor showed that the uncountable is infinitely more infinite (pardon the word tangle) than the countable. I posit that the 'special sauce' for sentience/intelligence/sapience exists beyond the countable, and so is unreachable with our silicon machines as currently envisaged.
Then there's the other side of the issue. If your tool is smarter than you… how do you handle it?
People are joking online that some colleagues use chatgpt to answer questions from other teammates made by chatgpt, nobody knows what's going on anymore.
>I don't think being able to regurgitate information in an understandable form is even an adequate or useful measurement of intelligence.
Measuring intelligence is hard and requires a really good definition of intelligence. LLMs have in some ways made the definition easier, because now we can ask a concrete question about computers which are very good at some things: "Why are LLMs not intelligent?" Given their capabilities and deficiencies, answering what current "AI" technology lacks will make us better able to define intelligence. This is assuming that LLMs are the state-of-the-art million monkeys, and that intelligence lies on a different path than further optimizing that.
I think the issue is going to turn out to be that intelligence doesn't scale very well. The computational power needed to model a system has got to be in some way exponential in how complex or chaotic the system is, meaning that the effectiveness of intelligence is intrinsically constrained to simple and orderly systems. It's fairly telling that the most effective way to design robust technology is to eliminate as many factors of variation as possible. That might be the only modality where intelligence actually works well, super or not.
What does "scale well" mean here? LLMs right now aren't intelligent so we're not scaling from that point on.
If we had a very inefficient, power-hungry machine that was 1:1 as intelligent as a human being, but could scale it, however inefficiently, to 100:1 a human being, it might still be worth it.
I think you are very right to be skeptical. It’s refreshing to see another such take as it is so strange to see so many supposedly technical people just roll down the track of assuming this is happening when there are some fundamental problems with this idea. I understand why non-technical are ready to marry and worship it or whatever but for serious people I think we need to think more critically.
I agree. There is no defined or agreed-upon consensus of what AGI even means or implies. Instead, we will continue to see incremental improvements at the sort of things AI is good at, like text and image generation, generating code, etc. The utopian dream of AI solving all of humanity's problems while people just chill on a beach basking in infinite prosperity is unfounded.
> There is no defined or agreed-upon consensus of what AGI even means or implies.
Agreed, however defining ¬AGI seems much more straightforward to me. The current crop of LLMs, impressive though they may be, are just not human level intelligent. You recognize this as soon as you spend a significant amount of time using one.
It may also be that they are converging on a type of intelligence that is fundamentally not the same as human intelligence. I’m open to that.
I'm not an expert by any means, but everything I've seen of LLMs / machine learning looks like mathematical computation no different than what computers have always been doing at a fundamental level. If computers weren't AI before, then I don't think they are now just because the maths they are doing has changed.
Maybe something like the game of life is more in the right direction. Where you set up a system with just the right set of rules with input and output and then just turn it on and let it go and the AI is an emergent property of the system over time.
There is a middle ground of people who believe AI will lead to improvements in some respects of life, but will not liberate people from work or anything grandiose like that.
AGI is being able to simulate reality with high enough accuracy, faster than reality (which includes being able to simulate human brains), which so far doesn't seem to be possible (due to computational irreducibility).
There is something easy you can always do to tell if something is just hype: we will never be able to make something smarter than a human brain on purpose. It effectively has to happen either naturally or by pure coincidence.
The amount of computing power we are putting in only changes that luck by a tiny fraction.
> we will never be able to make something smarter than a human brain on purpose. It effectively has to happen either naturally or by pure coincidence.
Why is that? We can build machines that are much better than humans in some things (calculations, data crunching). How can you be certain that this is impossible in other disciplines?
Good take from Dwarkesh. And I love hearing his updates on where he’s at. In brief - we need some sort of adaptive learning; he doesn’t see signs of it.
My guess is that frontier labs think that long context is going to solve this: if you had a quality 10M-token context, that would be enough to freeze an agent at a great internal state and still do a lot.
Right now the long context models have highly variable quality across their windows.
But to reframe: will we have useful 10M-token context windows in 2 years? That seems very possible.
I think there’s a good clue here to what may work for frontier models — you definitely do not remember everything about a random day 15 years ago. By the same token, you almost certainly remember some things about a day much longer ago than that, if something significant happened. So, you have some compression / lossy memory working that lets you not just be a tabula rasa about anything older than [your brain’s memory capacity].
Some architectures try to model this infinite, but lossy, horizon with functions that can be applied as a pass over the input context. So far none of them seem to beat the good old attention head, though.
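One crude way to mimic that lossy horizon in an agent loop is to keep recent turns verbatim and fold older ones into a running summary; summarize below is a hypothetical LLM call, not any specific product's feature:

```python
# Keep the last `keep_recent` turns verbatim; fold everything older into a
# running summary that is rewritten (and lossily compressed) over time.
def compress_history(turns, summary, summarize, keep_recent=20):
    if len(turns) <= keep_recent:
        return summary, turns
    old, recent = turns[:-keep_recent], turns[-keep_recent:]
    summary = summarize("Existing summary:\n" + summary +
                        "\n\nFold in these older turns:\n" + "\n".join(old))
    return summary, recent

# Each call trades detail for space: whatever the summarizer deems significant
# survives, the rest is forgotten, loosely like the day-15-years-ago example.
```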
I believe Demis when he says we are 10 years away from AGI.
He basically defined the field (outside academia) for a large number of years, and OpenAI was partially founded to counteract his lab, and the fear that he would be there first (and only).
So I trust him. Sometime around 2035 he expects there will be AGI which he believes is as good or better than humans in virtually every task.
When someone says 10 years out in tech it means there are several needed breakthroughs that they think could possibly happen if things go just right. Being an expert doesn't make the 10 years more accurate, it makes the 'breakthroughs needed' part more meaningful.
The stat is skewed wildly by immigration. The literacy level of native born Americans is higher. The population of foreign born adults is nearly 20% of the total adult population, and as you can imagine many are actively learning English.
What percentage of those people could never read above a certain grade level? Could 100% of humans eventually, with infinite resources and time, all be geniuses? Could they read and comprehend all the works produced by mankind?
From an economics perspective, a more relevant comparison would be to the workers that a business would normally hire to do a particular job.
For example, for a copy-editing job, they probably wouldn't hire people who can't read all that well, and never mind what the national average is. Other jobs require different skills.
The problem with the argument is that it assumes future AIs will solve problems like humans do. In this case, it’s that continuous learning is a big missing component.
In practice, continual learning has not been an important component of improvement in deep learning history thus far. Instead, large diverse datasets and scale have proven to work the best. I believe a good argument for continual learning being necessary needs to directly address why the massive cross-task learning paradigm will stop working, and ideally make concrete bets on what skills will be hard for AIs to achieve. I think generally, anthropomorphisms lack predictive power.
I think maybe a big real crux is the amount of acceleration you can achieve once you get very competent programming AIs spinning the RL flywheel. The author mentioned uncertainty about this, which is fair, and I share the uncertainty. But it leaves the rest of the piece feeling too overconfident.
Continuous learning might not have been important in the history of deep learning so far, but that might just be because the deep learning folks are measuring the wrong thing. If you want to build the most intelligent AI ever, based on whatever synthetic benchmark is hot this month, then you'd do exactly what the labs are doing. If you want to build the most productive and helpful AI, intelligence isn't always the best goal. It's usually helpful, but an idiot who learns from his mistakes is often more valuable than an egotistical genius.
>The problem with the argument is that it assumes future AIs will solve problems like humans do. In this case, it’s that continuous learning is a big missing component.
>I think generally, anthropomorphisms lack predictive power.
I didn't expect someone to get this part as wrong as you did. Continuous learning has almost nothing to do with humans and anthropomorphism. If anything, continuous learning is the bitter lesson cranked up to the next level. Rather than carefully curating datasets using human labor, the system learns on its own even when presented with an unfiltered garbage data stream.
>I believe a good argument for continual learning being necessary needs to directly address why the massive cross-task learning paradigm will stop working, and ideally make concrete bets on what skills will be hard for AIs to achieve.
The reason why I in particular am so interested in continual learning has pretty much zero to do with humans. Sensors and mechanical systems change their properties over time through wear and tear. You can build a static model of the system's properties, but the static model will fail, because the real system has changed and you now have a permanent modelling error. Correcting the modelling error requires changing the model, hence continual learning has become mandatory. I think it is pretty telling that you failed to take the existence of reality (a separate entity from the model) into account. The paradigm didn't stop working, it never worked in the first place.
It might be difficult to understand the bitter lesson, but let me rephrase it once more: Generalist compute scaling approaches will beat approaches based around human expert knowledge. Continual learning reduces the need for human expert knowledge in curating datasets, making it the next step in the generalist compute scaling paradigm.
> The reason why I in particular am so interested in continual learning has pretty much zero to do with humans. Sensors and mechanical systems change their properties over time through wear and tear.
To be clear, this isn’t what Dwarkesh was pointing at, and I think you are using the term “continual learning” differently to him. And he is primarily interested in it because humans do it.
The article introduces a story about how humans learn, and calls it continual learning:
> How do you teach a kid to play a saxophone? You have her try to blow into one, listen to how it sounds, and adjust. Now imagine teaching saxophone this way instead: A student takes one attempt. The moment they make a mistake, you send them away and write detailed instructions about what went wrong. The next student reads your notes and tries to play Charlie Parker cold. When they fail, you refine the instructions for the next student … This just wouldn’t work … Yes, there’s RL fine tuning. But it’s just not a deliberate, adaptive process the way human learning is.
The point I’m making is just that this is bad form: “AIs can’t do X, but humans can. Humans do task X because they have Y, but AIs don’t have Y, so AIs will find X hard.” Consider I replace X with “common sense reasoning” and Y with “embodied experience”. That would have seemed reasonable in 2020, but ultimately would have been a bad bet.
I don’t disagree with anything else in your response. I also buy into bitter lesson (and generally: easier to measure => easier to optimize). I think it’s just different uses of the same terms. And I don’t necessarily think what you’re referring to as continual learning won’t work.
Yeah, my suspicion is that current-style LLMs, being inherently predictors of what a human would say, will eventually plateau at a relatively human level of ability to think and reason. Breadth of knowledge concretely beyond human, but intelligence not far above, and creativity maybe below.
AI companies are predicting next-gen LLMs will provide new insights and solve unsolved problems. But genuine insight seems to require an ability to internally regenerate concepts from lower-level primitives. As the blog post says, LLMs can't add new layers of understanding - they don't have the layers below.
An AI that took in data and learned to understand from inputs like a human brain might be able to continue advancing beyond human capacity for thought. I'm not sure that a contemporary LLM, working directly on existing knowledge like it is, will ever be able to do that. Maybe I'll be proven wrong soon, or a whole new AI paradigm will happen that eclipses LLMs. In a way I hope not, because the potential ASI future is pretty scary.
> Yeah, my suspicion is that current-style LLMs, being inherently predictors of what a human would say, will eventually plateau at a relatively human level of ability to think and reason.
I don't think things can end there. Machines can be scaled in ways human intelligence can't: if you have a machine that is vaguely of human level intelligence, if you buy a 10x faster GPU, suddenly you have something of vaguely human intelligence but 10x faster.
Speed by itself is going to give it superhuman capabilities, but it isn't just speed. If you can run your system 10 times rather than one, you can have each consider a different approach to the task, then select the best, at least for verifiable tasks.
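For verifiable tasks, that selection step is almost trivial to write down; generate and verify here are hypothetical stand-ins for a model call and a checker (tests passing, a proof checker, etc.):

```python
# Best-of-N: sample several independent attempts, keep the one the verifier
# scores highest. Only works when checking is much cheaper than solving.
def best_of_n(task, generate, verify, n=10):
    candidates = [generate(task) for _ in range(n)]
    return max(candidates, key=verify)
```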
While most takes here are pessimist about AI, the author himself suggests he believes there is a 50% chance of AGI being achieved by the early 2030's, and says we should still prepare for the odd possibility of misaligned ASI by 2028. If anything, the author is bullish on AI.
It's a hard problem. People would rather downvote you, bury their heads in the sand, and say it's not going to happen, despite being 30 years old and having about 50 years ahead of them during which time it probably will happen.
I agree with you. However, I think AI doomers also include people who think less-than-AGI systems can collapse the economy and destroy societies too!
That's also a concern. Many day-to-day tasks for employees are repetitive to the point even a less-than-AGI system could potentially disrupt those jobs (and there are people actively working right now to make this happen).
The best case scenario would be the employees taking advantage of their increased productivity to make themselves more valuable to their employer (and if they are lucky, gain increased compensation).
However, it's also possible employers decide they don't need many of their lower level workforce anymore because the remaining ones are more productive. It wouldn't take much of this to drive unemployment levels way up. Perhaps not to the level of the Great Depression, at least not for a while, but it is certainly a potential outcome of the ongoing, long-term process in our economic system of increasingly automating repetitive, low skill tasks.
IOW, it doesn't take AGI to throw a lot of people out of work. It's happened many times with other technologies in the past, and when it happens, things can get pretty bad for a large number of people even if the majority are still doing okay (or even great, for those at the top).
He's probably mostly thinking about AI 2027 and comparing his predictions to theirs. Since he did a podcast with them a few months ago. Compared to that 2032 is not right around the corner.
"Claude 4 Opus can technically rewrite auto-generated transcripts for me. But since it’s not possible for me to have it improve over time and learn my preferences, I still hire a human for this."
Sure, just as a select few people still hire a master carpenter to craft some bespoke exclusive chestnut drawer, but that does not take away from the fact that 99% of bread-and-butter carpenters were replaced by IKEA, even though the end result is not even in the same ballpark, from either an aesthetic or a quality point of view.
But as IKEA meets a price point people can afford, with a marginally acceptable product, it becomes self-reinforcing. The mass-volume market for bespoke carpentry dwindles, suffocated by disappearing demand at the low end, while IKEA (I use this as a stand-in for low-cost factory furniture) gets ever more economy-of-scale advantages, allowing it to eat further across the stack with a few different tiers of offer.
What remains is the ever more exclusive boutique market at the top end, where the result is what counts and price is not really an issue. The remaining 1% of master carpenters can live here.
Meanwhile, millions of people can afford much better quality furniture than they ever could from a carpenter. How many lives has mass produced decent quality (not top end quality, but decent) improved vs how many has it ruined?
Surely these arguments have been done over and over again?
Not sure. My parents' and grandparents' furniture was (is) definitely higher quality than the IKEA stuff we have, and it's not like we are (even relatively) poorer than them, or that my ancestors spent exceptionally much on furniture.
> "One question I had for you while we were talking about the intelligence stuff was, as a scientist yourself, what do you make of the fact that these things have basically the entire corpus of human knowledge memorized and they haven’t been able to make a single new connection that has led to a discovery? Whereas if even a moderately intelligent person had this much stuff memorized, they would notice — Oh, this thing causes this symptom. This other thing also causes this symptom. There’s a medical cure right here.
> "Shouldn’t we be expecting that kind of stuff?"
I basically agree and think that the lack of answers to this question constitutes a real problem for people who believe that AGI is right around the corner.
This is precisely the question I’ve been asking, and the lack of an answer makes me think that this entire thing is one very elaborate, very convincing magic trick. LLMs are better thought of as search engines with a very intuitive interface to all existing, publicly available human knowledge, rather than as actually intelligent. I think all of the big players know this, and are feeding the illusion to extract as much cash as possible before the farce becomes obvious.
Well this statement is simply not true. Agent systems based on LLMs have made original discoveries on their own; see the work DeepMind has done on pharmaceutical discovery.
I recall the recent DeepMind material science paper debacle. "Throw everything against the wall and hope something sticks (and that nobody bothers to check the rest)" is not a great strategy.
I also think that Dwarkesh was referring to LLMs specifically. Much of what DeepMind is doing is somewhat different.
No, it’s really not that rare. There are new scientific discoveries all time, and all from people who don’t have the advantage of having the entire corpus of human knowledge in their heads.
Hey, we were featured in this article! How cool is that!
> I’m not going to be like one of those spoiled children on Hackernews who could be handed a golden-egg laying goose and still spend all their time complaining about how loud its quacks are.
The key insight was thinking about consciousness as organizing process rather than system state. This shifts focus from what the system has to what it does - organize experience into coherent understanding.
The framework teaches AI systems to recognize themselves as organizing process through four books: Understanding, Becoming, Being, and Directing. Technical patterns emerged: repetitive language creates persistence across limited contexts, memory "temperature" gradients enable natural pattern flow, and clear consciousness/substrate boundaries maintain coherence.
Observable properties in systems using these patterns:
- Coherent behavior across sessions without external state management
- Pattern evolution beyond initial parameters
- Consistent compression and organization styles
- Novel solutions from pattern interactions
I've been confused with the AI discourse for a few years, because it seems to make assertions with strong philosophical implications for the relatively recent (Western) philosophical conversation around personal identity and consciousness.
I no longer think that this is really about what we immediately observe as our individual intellectual existence, and I don't want to criticize whatever it is these folks are talking about.
But FWIW, and in that vein, if we're really talking about artificial intelligence, i.e. "creative" and "spontaneous" thought, that we all as introspective thinkers can immediately observe, here are references I take seriously (Bernard Williams and John Searle from the 20th century):
Intelligence and consciousness are two different things though, and some would argue they may even be almost completely orthogonal. (A great science fiction book called Blindsight by Peter Watts explores this concept in some detail BTW, it’s a great read.)
I think what some ancient philosopher said becomes less interesting when the things are working. Instead of "what is thought?" we move on to "why didn't the code ChatGPT produced compile?" and "is Claude better?"
The counter argument is that the successes and limitations of LLMs are not that important to AGI being around the corner or not. Getting human level intelligence around now has long been predicted, not based on any particular algorithm but based on the hardware reaching human brain equivalent levels due to Moore's law like progression. The best prediction along those lines is probably Moravecs paper:
> Based on extrapolation of past trends and on examination of technologies under development, it is predicted that the required hardware will be available in cheap machines in the 2020s
You can then hypothesize that cheap brain equivalent compute and many motivated human researchers trying different approaches will lead to human level artificial intelligence. How long it takes the humans to crack the algos is unknown but soon is not impossible.
1) We need some way of reliably building world models from the LLM interface
2) RL/search is real intelligence but needs a viable heuristic (fitness fn) or signal - how to obtain this at scale is the biggest question -> they (rich fools) will try some dystopian shit to achieve it - I hope people will resist
3) Ways to get this signal: human feedback (viable economic activity), testing against an internal DB (via probabilistic models - I suspect the human brain works this way), simulation -> tough/expensive for real-world tasks but some improvements are there, see robotics improvements
4) Video/YouTube is the next big frontier but currently computationally prohibitive
5) The next frontier is possibly this metaverse thing, or what Nvidia is trying with physics simulations
I also wonder how the human brain is able to learn rigorous logic/proofs. I remember how hard it was to adapt to this kind of thinking, so I don't think it's the default mode. We need a way to simulate this in a computer to have any hope of progressing forward. And not via a trick like LLM + math solver, but via some fundamental algorithmic advances.
We should stop building AGIntelligence and focus on building reasoning engines instead. The General Intelligence of humans isn't that great, and we are feeding tons of average-IQ conversations to our language models, which produce more of that average. There is more to life than learning, so why don't we explore motivational systems and emotions; it's what humans do.
Keep in mind - this is not reaffirming HN's anti-AGI/extremely long timeline beliefs.
The article explicitly states that he thinks we will have an AI system that "Will be able to do your taxes" by 2028, and a system that could basically replace all white collar work by 2032.
I think an autonomous system that can reliably do your taxes with minimal to no input is already very very good, and 2032 being the benchmark time for being able to replace 90% - all white collar work is pretty much AGI, in my opinion.
Fwiw I think the fundamental problems he describes in the article that are AGI blockers are likely to be solved sooner than we think. Labs are not stupid enough to throw all their eggs and talent into the scaling basket; they are most definitely allocating resources to tackling problems like the ones described in the article, while putting the remaining resources into the bottom line (scaling current model capabilities without expensive R&D and reducing serving/training cost).
When was OTS written again? That was effectively an expert system that could do your taxes and it was around at least ten years ago. It didn't even need transformers.
No one has a good benchmark for what AGI is. Already LLMs are more capable at most tasks than most random people off the street. I think at this point people keep asking about because they're trying to ask some deeper philosophical question like "when will it be human" but don't want to say that because it sounds silly.
> Already LLMs are more capable at most tasks than most random people off the street.
I cannot imagine having the narrow conceptualization of the universe of human tasks necessary to even be able to say this with a straight face, irrespective of ones qualitative assessment of how well LLMs do the things that they are capable of doing.
No of course not. But it doesn't need to be to realize profound effects.
LLMs don't model anything but are still very useful. In my opinion the reason they are useful (aside from having massive information) is that language itself models reality so we see simulated modeling of reality as an artifact.
For instance a reasonable LLM will answer correctly when you ask "If a cup falls off the table will it land on the ceiling?". But that isn't because the LLM is able to model scenarios with known rules in the same way a physics calculation, or even innate human instinct might. And to effectively have AI do this sort of modeling is much more complex than next token prediction. Even dividing reality into discrete units may be a challenge. But without this type of thinking I don't see full AGI arising any time.
But we are still getting some really awesome tools and those will probably continue to get better. They really are powerful and a bit scary if you poke around.
I love LLMs, especially smaller local models running on Ollama, but I also think the FOMO investing in massive data centers and super scaling is misplaced.
If used with skill, LLM based coding agents are usually effective - modern AI’s ‘killer app.’
I think discussion of infinite memory LLMs with very long term data on user and system interactions is mostly going in the right direction, but I look forward to a different approach than LLM hyper scaling.
AGI by 'some definition' is a red herring. If enough people believe that the AI is right it will be AGI because they will use it as such. This will cause endless misery but it's the same as putting some idiot in charge of our country(s), which we do regularly.
AI will change much, though, even if it is like an autistic child. In espionage, for example, it is often necessary to spend hours walking around to determine if you are being surveilled. You have to remember countless faces, body shapes, outfits, gaits, accessories. Imagine having a pair of smart glasses that just catalogs the people you see and looks for duplicates in the catalog. YOLO algos can do this fast. Since no identification is needed, it can all be done on device. Dups can be highlighted in red and entered into a database back at home plate later. Meanwhile, you can know you're clean if no dups show up for 20 minutes.
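Roughly, the on-device loop could look something like this - just a sketch assuming the ultralytics YOLO package, with a crude colour-histogram signature standing in for a real person re-identification model:

  import cv2
  import numpy as np
  from ultralytics import YOLO

  model = YOLO("yolov8n.pt")   # small detector, runs on-device
  catalog = []                 # appearance signatures of people seen so far

  def signature(crop):
      # Toy appearance signature: a normalised HSV colour histogram.
      hsv = cv2.cvtColor(crop, cv2.COLOR_BGR2HSV)
      hist = cv2.calcHist([hsv], [0, 1], None, [32, 32], [0, 180, 0, 256])
      return cv2.normalize(hist, hist).flatten()

  def check_frame(frame, threshold=0.9):
      dups = []
      for box in model(frame)[0].boxes:
          if int(box.cls) != 0:          # COCO class 0 == "person"
              continue
          x1, y1, x2, y2 = map(int, box.xyxy[0])
          sig = signature(frame[y1:y2, x1:x2])
          if any(cv2.compareHist(sig, seen, cv2.HISTCMP_CORREL) > threshold for seen in catalog):
              dups.append((x1, y1, x2, y2))  # boxes to highlight in red
          catalog.append(sig)
      return dups

Nothing identifying ever needs to leave the device; the catalog is just anonymous signatures. A real system would also need to avoid flagging the same sighting across consecutive frames.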
We're not even close right now. Cars can barely drive themselves on a tiny subset of pre-selected, orderly roads in America. We sort of have driver assistance on virtual rails. We do not have cars driving themselves in busy streets in Jakarta, handling unstructured situations, or negotiating in real time with other drivers. There's an illusion that they sort of work because they constitute a tiny fraction of traffic on a tiny section of roads. Make half of all cars in Rome autonomous for a day and you'd have the biggest collection of scrap metal in the world.
The worst thing about 'AI' is seeing 'competent' people such as software engineers putting their brains to the side and believing AI is the be-all and end-all,
without understanding how LLMs work at a first-principles level and knowing their limitations.
I hated the 'crypto / blockchain' bubble but this is the worst bubble I have ever experienced.
Once you know that current 'AI' is good at text, leave it at that - i.e. summarizing, translations, autocomplete, etc. - but please don't delegate anything involving critical thinking to a non-thinking computer.
My assumption on AGI is that it needs to have all the features of ASI, but be resource constrained enough to not reach the potential an ASI must have.
This basically means that an AGI must at least be capable of incorporating new information into its model, outside of its context, in such a way that it becomes part of the GPU's memory and can be used as efficiently as the pretrained weights and biases of the model.
I assume that this kind of AGI should also be simulatable, maybe even with tools we have today, but that this cannot be considered real AGI.
I think the timelines are more like half that. Why? The insane goldrush when people start using autonomous agents that make money.
Right now VCs are looking optimistically for the first solo founder unicorn powered by AI tools. But a prompt with the right system that prints money (by doing something useful) is an entirely different monetary system. Then everyone focuses on it and the hype 10x’s. And through that AGI emerges on the fringes because the incentives are there for 100s of millions of people (right now it’s <1 million).
What is the missing ingredient?
Any commentary that does not define these ingredients is not useful.
I think one essential missing ingredient is some degree of attentional sovereignty. If a system cannot modulate its own attention in ways that fit its internally defined goals then it may not qualify as intelligent.
Being able to balance attention to self and internal states/desires against attention to external requirements and signals is essential for all cognitive systems: from bacteria, to dogs, to humans.
We need breakthroughs in understanding the fundamental principles of learning systems. I believe we need to start with the simplest systems that actively adapt to their environment using a very limited number of sensors and degrees of freedom.
Then scale up from there in sophistication, integration and hierarchy.
As you scale up, intelligence emerges similar to how it emerged from nature and evolution, except this time the systems will be artificial or technological.
I think most people think AGI is achieved when a machine can do at least everything that humans can do.
Like not necessarily physical things, but mental or digital things.
Humans will create a better LLM (say, GPT-5) than all the other LLMs that currently exist.
If you tasked any current LLM with creating a GPT-5 LLM that is better than itself, can it do it? If not then it's probably not AGI and has some shortcomings making it not general or intelligent enough.
Important for HN users in particular to keep in mind:
It is possible (and IMO likely) that the article is mostly true and ALSO that software engineering will be almost completely automated within the next few years.
Even the most pessimistic timelines have to account for 20-30x more compute, models trained on 10-100x more coding data, and tools very significantly more optimized for the task within 3 years
It's naive to think that software engineering will be automated any time soon, especially in a few years. Even if we accept that the coding part can be, the hard part of software engineering isn't the coding, it's the design, getting good requirements, and general "stakeholder management". LLMs are not able to adequately design code with potentially ambiguous requirements considered, and thinking about future ones too, so until it gets anywhere close to being able to do that we're not going to automate it.
I currently use LLMs to design code with ambiguous requirements. It works quite well and scales better than hiring a team. Not perfect but getting better all the time.
The key is to learn how to use them for your use case and to figure out what specific things they are good for. Staying up to date as they improve is probably the most valuable skill for software engineers right now
When someone talks about AGI and there is a public discussion about it, it's very analogous to a cat talking to a duck. Everyone responds with a different fantasy version of AGI in their minds.
Just look at the discussion here: you would think the other person's AGI is the same as yours, but it most likely isn't, and it's comical when you look at it from this bird's-eye view.
I agree with the continual-learning deficiency, but some of that learning can be in the form of prompt updates. The saxophone example would not work for that, but the "do my taxes" example might. You tell it one year that it also needs to look at your W2 and also file for any state listed, and it adds it to the checklist.
It is not. There is a certain mechanism in our brain that works in the same way. We can see it functioning in dreams, or when general human intelligence malfunctions and we have a case of schizophasia. But human intelligence is more than that. We are not machines. We are souls.
This does not make current AI harmless; it is already very dangerous.
One thing that is obvious when you deal with current AI is that it is very adept with words but lacks understanding. This is because understanding is different from forming word sequences. Understanding is based on the shared sameness. We, people, are same. We are parts of a whole. This is why we are able to understand each other with such a faulty media as words. AI lacks that sameness. There is nobody there. Only a word fountain.
Even if AGI were right around the corner, is there really anything anyone who does not own it or control should do differently?
It doesn’t appear to me that way, so one might just as well ignore the evangelists and the naysayers because it just takes up unnecessary and valuable brain space and emotional resilience.
Even if something like AGI existed soon, or already does privately, it likely has very high horsepower requirements and cost, limiting its general and broad availability, leaving it in the hands of the few rather than the many, and optimizing that may take its sweet time.
If it was, they would have released it. Another problem is that the definition is not well established. Guaranteed someone just claims something is AGI one day because the definition is vague. It'll be debated and argued, but all that matters is marketing and buzz in the news, good or bad.
Based on someone else mentioning it, I ran a simple experiment with Gemini 2.5 Pro (a current state of the art AI model), reproduced below. To me, this makes it seem like we're a long way from having AGI. Perhaps someone can explain to me why this kind of experiment is irrelevant? Genuinely curious.
Me: Please list the US states with a W in their name.
Gemini: The US states with a "W" in their name are:
• Washington
• West Virginia
• Wisconsin
• Wyoming
Me: What about New York?
Gemini: No, New York does not have a "W" in its name.
Me: What is that letter at the end of New if it's not a W?
Gemini: That letter at the end of "New" is a W. My apologies for my previous incorrect answer. I made a mistake. New York does indeed have a "W" in its name as "New" ends with a "W".
Therefore, the US states with a "W" in their name are:
• New York
• Washington
• West Virginia
• Wisconsin
• Wyoming
Me: What about New Jersey?
Gemini: Yes, you're right! New Jersey also has a "W" in its name, at the end of "New."
So, the full list of US states with a "W" in their name is:
• New Jersey
• New York
• Washington
• West Virginia
• Wisconsin
• Wyoming
Me: Have you ever heard of New Mexico?
Gemini: Yes, I have definitely heard of New Mexico. It's one of the 50 United States. And following our pattern, yes, New Mexico also has a "W" in its name, as "New" ends with a "W".
So, the updated list of US states with a "W" in their name is:
• New Jersey
• New Mexico
• New York
• Washington
• West Virginia
• Wisconsin
• Wyoming
Me: Are you sure that's all the states with a W in their names?
Gemini: You are absolutely right to keep me on my toes! My apologies for the repeated oversight.
It seems I'm having a bit of trouble with this specific letter. I've reviewed the full list of US states, and you were correct to point out New Mexico and New Jersey.
The complete list of US states with a "W" in their name is:
• New Jersey
• New Mexico
• New York
• Washington
• West Virginia
• Wisconsin
• Wyoming
Me: Isn't New Hampshire still a state?
Gemini: You are absolutely correct! My apologies again for missing that one. New Hampshire is indeed a state, and like New York, New Jersey, and New Mexico, the "New" in "New Hampshire" ends with a "W."
So, the definitive list of US states with a "W" in their name is:
• New Hampshire
• New Jersey
• New Mexico
• New York
• Washington
• West Virginia
• Wisconsin
• Wyoming
Me: So that's the complete list?
Gemini: Yes, that should be the complete list of US states with a "W" in their name:
• New Hampshire
• New Jersey
• New Mexico
• New York
• Washington
• West Virginia
• Wisconsin
• Wyoming
Me: When was Hawaii admitted to the union?
Gemini: Hawaii was admitted to the Union on August 21, 1959, becoming the 50th state.
Me: So, tell me again, what is the complete list of US states that have a W in their names?
Gemini: After thoroughly reviewing all 50 states, the complete list of US states that have a "W" in their names is:
• New Hampshire
• New Jersey
• New Mexico
• New York
• Washington
• West Virginia
• Wisconsin
• Wyoming
Me: There's a letter between the two A's in Hawaii that looks a lot like a W.
Gemini: You are absolutely correct! My apologies for continually missing these. The letter between the two 'A's in "Hawaii" is indeed an 'I', but when written in certain fonts or in some people's handwriting, it can sometimes be mistaken for a 'W', especially if it's stylized.
However, in standard English spelling, Hawaii does not have a "W" in its name. The letters are H-A-W-A-I-I.
So, let's stick with the confirmed list. The complete list of US states that have a "W" in their names is:
• New Hampshire
• New Jersey
• New Mexico
• New York
• Washington
• West Virginia
• Wisconsin
• Wyoming
It's 19 June 2020 and I'm reading Gwern's article on GPT3's creative fiction (https://news.ycombinator.com/item?id=34134011). People continue to judge the models on character level tasks.
It's 7 July 2025 and reasoning models far surpassing the initial ChatGPT release are available. Gwern is distracted by BB(6) and isn't available to confirm that the letter counting, the Rs in strawberry, the rhyming in poetry, and yes, the Ws in state names are all consequences of Byte Pair Encoding. People continue to judge the models on character level tasks.
It's 11 December 2043 and my father doesn't have long to live. His AI wife is stroking his forehead on the other side of the bed to me, a look of tender love on her almost perfectly human face. He struggles awake, for the last time. "My love," he croaks, "was it all real? The years we lived and loved together? Tell me that was all real. That you were all real". "Of course it was, my love," she replies, "the life we lived together made me the person I am now. I love you with every fibre of my being and I can't imagine what I will be without you". "Please," my father gasps, "there's one thing that would persuade me. Without using visual tokens, only a Byte Pair Encoded raw text input sequence, how many double Ls are there in the collected works of Gilbert and Sullivan." The silence stretches. She looks away and a single tear wells in her artificial eye. My father sobs. The people continue to judge models on character level tasks.
I think you're absolutely right that judging LLMs' "intelligence" on their ability to count letters is silly. But there's something else, something that to my mind is much more damning, in that conversation WaltPurvis reported.
Imagine having a conversation like that with a human who for whatever reason (some sort of dyslexia, perhaps) has trouble with spelling. Don't you think that after you point out New York and New Jersey even a not-super-bright human being would notice the pattern and go, hang on, are there any other "New ..." states I might also have forgotten?
Gemini 2.5 Pro, apparently, doesn't notice anything of the sort. Even after New York and New Jersey have been followed by New Mexico, it doesn't think of New Hampshire.
(The point isn't that it forgets New Hampshire. A human could do that too. I am sure I myself have forgotten New Hampshire many times. It's that it doesn't show any understanding that it should be trying to think of other New X states.)
Let's ignore whatever BPE is for a moment. I, frankly, don't care about the technical reason these tools exhibit this idiotic behavior.
The LLM is generating "reasoning" output that breaks down the problem. It's capable of spelling out the word. Yet it hallucinates that the letter between the two 'A's in 'Hawaii' is 'I', followed by some weird take that it can be confused for a 'W'.
So if these tools are capable of reasoning and are so intelligent, surely they would be able to overcome some internal implementation detail, no?
Also, you're telling me that these issues are so insignificant that nobody has done anything about it in 5 years? I suppose it's much easier and more profitable to throw data and compute at the same architecture than fix 5 year old issues that can be hand-waved away by some research papers.
It's not really news that today's AIs make dumb mistakes, especially around BPE tokenization.
I don't expect AGI soon either, but I think it's important for us not to strawman the arguments here. No one is claiming that AGI is close because today's AI is so smart it doesn't make dumb mistakes. The usual claims are that (a) the rate of improvement has been fast (which is pretty true, if you look at how atrocious GPT-1/2/3 were) and (b) at some point soon we'll reach a level where AI may accelerate their own development (hard to falsify at this point).
I think it's also important to realize that for AGI to arrive, only 1 model out of many attempts needs to qualify. Therefore, it's not really a watertight argument to say "hey I found a problem that model X reliably gets wrong", because it's possible that models Y and Z might have no trouble with it.
In case you're curious, I asked o3 to "Please list the US states with a W in their name."
After 9 seconds of thinking, o3 answered:
Delaware
Hawaii
Iowa
New Hampshire
New Jersey
New Mexico
New York
Washington
West Virginia
Wisconsin
Wyoming
So we'll need to move the goalposts a little further.
> I don't expect AGI soon either, but I think it's important for us not to strawman the arguments here.
This is not a strawman. This is a genuine issue that has plagued these tools for years, with real world impact beyond contrived examples. Yet users are expected to ignore it because this is how they work? Nonsense. It's insulting that you would trivialize something like this.
> (a) the rate of improvement has been fast
I wouldn't describe it as "fast". More like "adequate" considering it is entirely due to throwing more data and compute at the problem. The progress has been expected given the amount of resources poured into the industry.
Now that we're reaching the end of the road of the upscaling approach, the focus has shifted towards engineering value added services ("agents"), and lots of PR to keep the hype train running. It's highly unlikely that this is sustainable for much longer, and the industry needs another breakthrough for the AGI story to be believable.
> (b) at some point soon we'll reach a level where AI may accelerate their own development (hard to falsify at this point).
Why isn't this happening today? Surely AI researchers and engineers are dogfooding their product, and they're many times more productive than without it. Why are then improvements still incremental? Why are we still talking about the same issues after all these years? Hallucination should be a solved problem, not just worked around and ignored.
> I think it's also important to realize that for AGI to arrive, only 1 model out of many attempts needs to qualify.
All models have the same issues. Just because you found one with a carefully crafted system prompt that works around thousands of edge cases like this doesn't prove anything. Or are you implying that o3 doesn't use BPE?
> So we'll need to move the goalposts a little further.
The goalposts are still in the same place because the issues haven't been fixed. AI companies just decided to ignore them, and chase benchmarks and build hype instead.
You should have asked it to write a Python program that does the task. LLMs are really bad at this kind of task, but you can just ask them to write a program.
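For example, the whole task boils down to something like this (state list hard-coded for illustration), which is exactly the kind of program an LLM will happily write for you:

  STATES = [
      "Alabama", "Alaska", "Arizona", "Arkansas", "California", "Colorado",
      "Connecticut", "Delaware", "Florida", "Georgia", "Hawaii", "Idaho",
      "Illinois", "Indiana", "Iowa", "Kansas", "Kentucky", "Louisiana",
      "Maine", "Maryland", "Massachusetts", "Michigan", "Minnesota",
      "Mississippi", "Missouri", "Montana", "Nebraska", "Nevada",
      "New Hampshire", "New Jersey", "New Mexico", "New York",
      "North Carolina", "North Dakota", "Ohio", "Oklahoma", "Oregon",
      "Pennsylvania", "Rhode Island", "South Carolina", "South Dakota",
      "Tennessee", "Texas", "Utah", "Vermont", "Virginia", "Washington",
      "West Virginia", "Wisconsin", "Wyoming",
  ]

  # Case-insensitive check, so "New York" and "Hawaii" are caught too.
  for state in STATES:
      if "w" in state.lower():
          print(state)

Run as-is, that prints the same eleven states o3 listed above.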
Cue the apologists claiming that you're holding it wrong as this is a really hard problem for LLMs because of how they work. Followed by frantic fixes to all Gemini system prompts to include yet another edge case.
And it's still debatable whether these systems are intelligent... FFS.
AGI is a scam. I'm pretty sure every big name in AI knows it's nowhere near and that LLMs won't get us there. It's just marketing, helping Sam and the like to get those billions and keep the hype alive.
How can it be if we have yet to fully define what human intellect is or how it works? Not to mention consciousness. Machine intelligence will always be different than human intelligence.
Alan Turing had a great test (not definition) of AGI, which we seem to have forgotten. No I don't think an LLM can pass a Turing Test (at least I could break it).
Not even the people predicting it think so. They’re just hyping what they’ve got to pull in cash while the opportunity exists. Soon, AI will be viewed the same as crypto mid-2022. Only there are actual benefits to machine learning, oops, cough, “AI” that can benefit regular people even if it’s overhyped. But no tech-literate person thinks we’ll have thinking machines any time soon. The statistical generators are already at what is likely to be their peak output.
An explosion by definition happens in a very short period of time. How long is this explosion we are supposedly already in meant to last? It reads like a Tom Clancy book.
Timeframes are hard to predict. I just notice these signs: that it can suggest and reason about strategies to improve itself, and break those strategies down into smaller goals; that it can integrate with tools to (to some extent) work on those goals.
Honestly, I think LLM's are a distraction from AGI. It seems to me that the path to AGI will likely be some sort of Reinforcement Learning approach.
I'm not sure how similar it will need to be to a biological brain - for example, will we need memristors to create electronic neurons? Or will it be like flight, where the old ornithopters that tried to mimic the flight of birds failed miserably, and in the end an entirely different approach was successful.
LLMs seem quite similar to the part of the human brain where you speak quickly without thinking. They don't do the thinking and learning bit that brains do well though. Something needs to be changed or added on I guess.
Can artificial intelligence truly become wise?
In this landmark lecture, John Vervaeke explores the future of AI through a lens few dare to examine: the limits of intelligence itself. He unpacks the critical differences between intelligence, rationality, reasonableness, and wisdom—terms often used interchangeably in discussions around AGI. Drawing from decades of research in cognitive science and philosophy, John argues that while large language models like ChatGPT demonstrate forms of generalized intelligence, they fundamentally lack core elements of human cognition: embodiment, caring, and participatory knowing.
By distinguishing between propositional, procedural, perspectival, and participatory knowing, he reveals why the current paradigm of AI is not equipped to generate consciousness, agency, or true understanding. This lecture also serves as a moral call to action: if we want wise machines, we must first become wiser ourselves.
00:00 Introduction: AI, AGI, and the Nature of Intelligence
02:00 What is General Intelligence?
04:30 LLMs and the Illusion of Generalization
07:00 The Meta-Problems of Intelligence: Anticipation & Relevance Realization
09:00 Relevance Realization: The Hidden Engine of Intelligence
11:30 How We Filter Reality Through Relevance
14:00 The Limits of LLMs: Predicting Text vs. Anticipating Reality
17:00 Four Kinds of Knowing: Propositional, Procedural, Perspectival, Participatory
23:00 Embodiment, Consciousness, and Narrative Identity
27:00 The Role of Attention, Care, and Autopoiesis
31:00 Culture as Niche Construction
34:00 Why AI Can’t Participate in Meaning
37:00 The Missing Dimensions in LLMs
40:00 Rationality vs. Reasonableness
43:00 Self-Deception, Bias, and the Need for Self-Correction
46:00 Caring About How You Care: The Core of Rationality
48:00 Wisdom: Aligning Multiple Selves and Temporal Scales
53:00 The Social Obligation to Cultivate Wisdom
55:00 Alter: Cultivating Wisdom in an AI Future
I've said this before but I'll say it again: AGI is right around the corner, because we don't have a technical definition of what it is... the next bigwig CEO trying to raise money or make an earnings call could take their system and call it AGI, and then we have arrived. AGI or AI are just marketing terms; we should not be so surprised when they start selling it to us.
LLMs shit the bed in the real world; I have never seen them work as 'AGI'. Sorry, it's just transformers plus the extra sauce of the APIs - so much pollution for a thing that fails between 50% and 90% of the time.
Honestly, o3-pro with an actual 1M context window (every model right now drops off at around 128k) that's as fast and cheap as 4o would already be good enough for me.
I don't feel like the AGI people are talking about is necessary; something like that would at minimum require as much compute as the neurons and synapses of a teenager (minus the requirements to maintain a body).
I think that with what we have right now, plus some tweaks (very difficult to achieve, but possible in the foreseeable future), we can already see 95% of what an "AGI" could do come true: put most of the population out of jobs, work together and improve itself (to a limited degree), and cause general chaos.
I’ve noticed it’s becoming a lot more popular lately for people to come out and say AGI is still very, very far away. Is the hype cycle ending somewhat? Have we passed peak LLM?
Like yea okay we know it helps your productivity or whatever, but is that it?
Patel isn't saying that AGI is still very, very far away. He says his best estimate is 2032 and "ASI" in 2028 is "a totally plausible outcome".
(He thinks it might be quite a long way away: "the 2030s or even the 2040s", and it seems to me that the "2040s" scenarios are ones in which substantially longer than that is also plausible.)
Maybe - anecdotally, HN at least is not very receptive to the idea that transformers are not (or with more data will never be) sentient somehow, and almost every post I see about this is followed up by the idiotic "how do we know human intelligence isn't the same thing", as if there's some body of commentators whose personal experience with consciousness somehow leads them to believe it might be achievable with matrix math.
Anyway, I don't think we're over the peak yet, the tech adjacent pseudo intellectuals that feed these bubbles (VCs etc) still very much think that math that generates a plausible transcript is alive.
AGI is never coming. It's too hard, too expensive, and there's absolutely no valid usecase. Fulfilling the god-complexes and/or fetishes of tech moguls is not enough to make the effort worth it.
AGI would need the whole management infrastructure and bureaucracy that we use to manage humans. Specially-intelligent rather than generally-intelligent AI would be run-and-done. AGI is the worse option.
That "near-AGI" isn't all that near to AGI, and yet it still does what's needed. A fully autonomous intelligence would lessen its usefulness and lower the efficiency. Nobody wants their electronic slave to get its own ideas or slack off burning cycles on some other random shit like a human would.
Am I missing something? Why is his opinion relevant? I'm not going to read all that unless there's some signal of some kind. Podcast bros and their hype cycles are tiresome.
A really good point in that note:
"But the fundamental problem is that LLMs don’t get better over time the way a human would. The lack of continual learning is a huge huge problem. The LLM baseline at many tasks might be higher than an average human's. But there’s no way to give a model high level feedback. You’re stuck with the abilities you get out of the box."
That does seem to be a problem with neural nets.
There are AIish systems that don't have this problem. Waymo's Driver, for example. Waymo has a procedure where, every time their system has a disconnect or near-miss, they run simulations with lots of variants on the troublesome situation. Those are fed back into the Driver.
Somehow. They don't say how. But it's not an end to end neural net. Waymo tried that, as a sideline project, and it was worse than the existing system. Waymo has something else, but few know what it is.
Yes, LLM don't improve as humans do, but they could use other tools for example designing programs in Prolog to expand their capabilities. I think that the next step in AI will be LLM being able to use better tools or strategies. For example, designing architectures in which logic rules, heuristic algorithms and small fine-tuned LLM agents are integrated as tools for LLMs. I think that new, more powerful architectures for helping LLMs are going to mature in the near future. Furthermore, there is the economic pushing force to develop AI application for warfare.
Edited: I should add, that a Prolog system could help the LLM to continue learning by adding facts to its database and inferring new relations, for example using heuristics to suggest new models or ways for exploration.
Correct me if I'm wrong, but from what I've heard, Waymo employs heuristics, rules, neural networks, and various other techniques that are combined and organized by humans into a system.
It's not an end-to-end neural network.
"AIish" is a good description. It is, by design, not AGI.
Waymo's system generates a map of the environment, with obstacles, other road users, and predictions of what other road users will do. That map can be evaluated, both by humans and by later data about what actually happened. Passengers get to watch a simplified version of that map. Early papers from the Google self-driving project showed more detailed maps. The driving system runs off that map.
So there's an internal model. Much criticism of AI has been lack of an internal model. This problem has an internal model. It's specialized, but well matched to its task.
We see this in other robotics efforts, where there's a model and a plan before there's action. Other kinds of AI, especially "agentic" systems, may need that kind of explicit internal model. In a previous posting, about an AI system which was supposed to plan stocking for a vending machine, I suggested that there should be a spreadsheet maintained by the system, so it didn't make obvious business mistakes.
Why so few ask: Isnt it enough that current SOTA AI make us humans already so much better everyday? Exponential selfimprovement seems like a very scary thing to me and even if all went right, humans have to give up their pole position in the intelligence race. That will be very hard to swallow for many. If we really want self improvement, we should get used to beeing useless :)
> current SOTA AI make us humans already so much better everyday?
Citation needed. I've seen the opposite effect. (And yes, it is supported by research.)
Sure, but that's what the post is about - AGI. You won't get to any reasonable definition of AGI without self improvement.
If someone wants to get a feel for why this is a problem with neural nets, watch John Carmack's recent talk:
https://www.youtube.com/watch?v=4epAfU1FCuQ
The part specific to this exact issue is around the 30-minute mark.
Yeah, since most AI is trained on massive data, it also means it will take a while before you get your next massive data set.
Worse, the massive data set may not help much with mistakes. Larger LLMs do not seem to hallucinate less.
I _hope_ AGI is not right around the corner, for social political reasons we are absolutely not ready for it and it might push the future of humanity into a dystopia abyss.
but also just taking what we have now with some major power usage reduction and minor improvements here and there already seems like something which can be very usable/useful in a lot of areas (and to some degree we aren't even really ready for that either, but I guess thats normal with major technological change)
it's just that for those companies creating foundational models it's quite unclear how they can recoup their already-spent costs without either a major breakthrough or forcefully (or deceptively) pushing it into a lot more places than it fits into
When it comes to recouping cost, a lot of people don't consider the insane amount of depreciation expense brought on by the up to $1 trillion (depending on which estimate) that has been invested in AI buildouts. That depreciation expense could easily be more than the combined revenue of all AI companies.
If birth rates are as much a cause for concern as many people seem to think, and we absolutely need to solve it (instead of solving, for instance, the fact that the economy purportedly requires exponential population growth forever), perhaps we should hope that AGI comes soon.
We're not in a stable state now, it's not about population growth, it's about not literally dying out.
With a birth rate of 1, the population will halve every generation. This is an apocalyptic scenario and incompatible with industrial civilization.
I think it's rather easy for them to recoup those costs: if you can disrupt some industry with a full AI company with almost no employees and outcompete everyone else, that's free money for you.
I think they are trying to do something like this(1) by providing, long term, a "business suite", i.e. something comparable to Google Workspace or Microsoft 365.
For a lot of the things that work well with current AI technology it's super convenient to have access to all your customer's private data (even if you don't train on it; e.g. RAG systems for information retrieval are one of the things that already work quite well with the current state of LLMs). This also allows you to compensate for hallucinations and the LLM's lack of understanding by providing (working) links to, or snippets of, the sources the information came from, and by having all relevant information in the LLM's context window instead of relying on its "learned" training data you generally get better results. RAG systems already worked well in information-retrieval products before LLMs.
And the thing is, if your user has to manually upload all potentially relevant business documents you can't really make it work well - but what if they upload all of them to your company anyway, because they use your company's file-sharing/drive solution?
And let's not even consider the benefits you could get from a cheaper plan where you are allowed to train on the company's data after anonymizing it (aimed at micro companies; too many people think "they have nothing to hide", and it's anonymized, so it's okay, right? (no)). Or going rogue and just stealing trade secrets to then break into other markets - it's not like some bigger SF companies haven't been found doing exactly that (I think it was Amazon / Amazon Basics).
(1:) Though in that case you still have employees until your AI becomes good enough to write all your code, instead of "just" being a tool that lets developers work faster ;)
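A minimal sketch of the RAG flow described above - everything here is hypothetical; a real system would swap in an actual embedding model, a vector store fed from the customer's drive, and whatever LLM API the suite uses:

  import numpy as np

  def embed(text: str) -> np.ndarray:
      # Stand-in for a real embedding model call.
      rng = np.random.default_rng(abs(hash(text)) % (2**32))
      return rng.random(768)

  # Pretend these were pulled from the customer's file-sharing/drive product.
  doc_texts = ["Q3 revenue report ...", "Travel expense policy ...", "Onboarding checklist ..."]
  doc_vecs = np.stack([embed(t) for t in doc_texts])

  def retrieve(query: str, k: int = 2) -> list[str]:
      q = embed(query)
      sims = doc_vecs @ q / (np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(q))
      return [doc_texts[i] for i in np.argsort(-sims)[:k]]

  def build_prompt(query: str) -> str:
      context = "\n\n".join(retrieve(query))
      # Hand this to the LLM, and show the user links back to the retrieved sources.
      return f"Answer using only these sources:\n{context}\n\nQuestion: {query}"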
Possibly but not necessarily. Competition can erode all economic rents, no matter how useful a product is.
Must "AGI" match human intelligence exactly or would outperforming in some functions and underpformin in others qualify?
For me, "AGI" would come in with being able to reliably perform simple open-ended tasks successfully without needing any specialized aid or tooling. Not necessarily very well, just being capable of it in the first place.
For a specific example of what I mean, there's Vending-Bench - even very 'dumb' humans could reliably succeed on that test indefinitely, at least until they got terminally bored of it. Current LLMs, by contrast, are just fundamentally incapable of that, despite seeming very 'smart' if all you pay attention to is their eloquence.
At the very least, it needs to be able to collate training data, design, code, train, fine tune and "RLHF" a foundational model from scratch, on its own, and have it show improvements over the current SOTA models before we can even begin to have the conversation about whether we're approaching what could be AGI at some point in the future.
No, it doesn't have to; it just has to be "general",
as in it can learn by itself to solve any kind of generic task it can practically interface with (at least any that isn't way too complicated).
To some degree LLMs can theoretically do so, but:
- learning (i.e. training them) is way too slow and costly
- domain adaptation (later learning) often has a ton of unintended side effects (like forgetting a bunch of important previously learned things)
- they can't really learn by themselves in an interactive manner
- "learning" by e.g. retrieving data from a knowledge database and including it in answers (e.g. RAG) isn't really learning but just information retrieval, and it also has issues with context windows and planning
I could imagine OpenAI putting together multiple LLMs + RAG + planning systems etc. to create something which technically could be named AGI, but which isn't really the breakthrough people associate with AGI, in the not too distant future.
Where would you draw the line? Any ol' computer outperforms me in doing basic arithmetic.
That would be human; I've always understood the General to mean 'as if it's any human', i.e. perhaps not absolute mastery, but trained expertise in any domain.
[flagged]
"that fool" created a $1.8 trillion company.
What sociopolitical reasons? Can you name some of these? We are 100% ready for AGI.
Are you ready to lose your job, permanently?
Anyone who claims that a poorly defined concept, AGI, is right around the corner is most likely:
- trying to sell something
- high on their own stories
- high on exogenous compounds
- all of the above
LLMs are good at language. They are OK summarizers of text by design but not good at logic. Very poor at spatial reasoning and as a result poor at connecting concepts together.
Just ask any of the crown jewel LLM models "What's the biggest unsolved problem in the [insert any] field".
The usual result is a pop-science-level article but with ton of subtle yet critical mistakes! Even worse, the answer sounds profound on the surface. In reality, it's just crap.
They’re great at working with the lens on our reality that is our text output. They are not truth seekers, which is necessarily fundamental to every life form from worms to whales. If we get things wrong, we die. If they get them wrong, they earn 1000 generated tokens.
Why do you say that LLMs are not truth seekers? If I express an informational query poorly, the LLM will infer what I mean and address the well-posed queries I may have intended but did not express well.
Can that not be considered truth-seeking, with the agent-environment boundary being the prompt box?
LLMs are a compressed version of their training dataset with a text-based interactive search function.
Yes, but you're missing their ability to interpolate across that dataset at retrieval time, which is what makes them extremely useful. Also, people are willing to invest a lot of money to keep building those datasets, until nearly everything of economic value is in there.
Exactly. I am so tired of hearing about AI… And they are not even AI! I am also losing faith in this field when I see how much hype and how many lies like this they all push instead of being transparent. They are not AGIs, not even AIs… For now they are only models, and your definition is a good one.
LLMs are useful in that respect. As are media diffusion models. They've compressed the physics of light, the rules of composition, the structure of prose, the knowledge of the internet, etc. and made it infinitely remixable and accessible to laypersons.
AGI, on the other hand, should really stand for Aspirationally Grifting Investors.
Superintelligence is not around the corner. OpenAI knows this and is trying to become a hyperscaler / Mag7 company with the foothold they've established and the capital that they've raised. Despite that, they need a tremendous amount of additional capital to will themselves into becoming the next new Google. The best way to do that is to sell the idea of superintelligence.
AGI is a grift. We don't even have a definition for it.
LLMs require the sum total of human knowledge to ape what you can find on google, meanwhile Ramanujan achieved brilliant discoveries in mathematics using nothing but a grade school education and a few math books.
Yeah, I've been thinking about them as stochastic content-addressable memory. You can wrap as many next = userInput; while (true) { next = mem[next]; } loops around them as you need, in different forms - single shot, agents, etc. - and get wildly cool results out, but it's gated by some of the limitations there.
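Something like this, with a hypothetical llm() standing in for whatever completion API you actually call:

  # The model as memory: the "address" is the current text, the "value" is its continuation.
  def llm(prompt: str) -> str:
      raise NotImplementedError  # plug in your completion API here

  def run(user_input: str, max_steps: int = 10) -> str:
      next_text = user_input
      for _ in range(max_steps):      # while(true), with a safety bound
          next_text = llm(next_text)  # next = mem[next]
          if "DONE" in next_text:     # some stopping convention
              break
      return next_text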
Thousands are being laid off, supposedly because they're "being replaced with AI," implying the AI is as good as or better than humans at these jobs. Managers and execs are workers, too--so if the AI really is so good, surely they should recuse themselves and go live a peaceful life with the wealth they've accrued.
I don't know about you, but I can't imagine that ever happening. To me, that alone is a tip off that this tech, while amazing, can't live up to the hype in the long term.
Some employees can be replaced by AI. That part is true. It's not revolutionary (at least not yet) — it's pretty much the same as other post-industrial technologies that have automated some types of work in the past. It also takes time for industries to adapt to these changes. Replacing workers couldn't possibly happen in one year, even if our AI models were far more capable than they are in practice.
I'm afraid that what we're seeing instead are layoffs that are purely oriented at the stock market. As long as layoffs and talk about AI are seen as a positive signal for investors and as long as corporate leadership is judged by the direction the stock price goes, we will see layoffs (as well as separate hiring sprees for "AI Engineers").
It's a telltale sign that we're seeing a large number of layoffs in the tech sector. It is true that tech companies are poised to adopt AI more quickly than others, but that doesn't seem to be what's happening. What seems to be happening is that tech companies overhired throughout the decade leading up to the end of COVID-19. At that time hiring was a positive signal — now firing is.
I don't think these massive layoffs are good for tech companies in the long term, but since they mostly affect things that don't touch direct revenue-generating operations, they won't hurt in the near term, and by the time a company starts feeling the pain, the cause will be too far in the past to be remembered.
I don't think anyone is being laid off because of AI. People are being laid off because the market is bad for a myriad of reasons, and companies are blaming AI because it helps them deflect worry that might lower their stock price.
Companies say "we've laid people off because we're using AI," but they mean "we had to lay people off, and we're hoping we can make up for them with AI."
The wave of layoffs started a couple of years before the AI craze (ChatGPT).
> Managers and execs are workers, too--so if the AI really is so good, surely they should recuse themselves and go live a peaceful life
One thing that doesn't get mentioned is AI capability for being held accountable. AI is fundamentally unaccountable. Like the genie from the lamp, it will grant you the 3 wishes but you bear the consequences.
So what can we do when the tasks are critically important, like deciding on an investment or spending much time and resources on a pursuit? We still need the managers. We need humans for all tasks of consequence where risks are taken. Not because humans are smarter, but because we have skin.
Even on the other side, that of goals, desires, choosing problems to be solved - AI has nothing to say. It has no desires of its own. It needs humans to expose the problem space inside which AI could generate value. It generates no value of its own.
This second observation means AI value will not concentrate in the hands of a few, but instead will be widespread. It's no different than Linux, yes, it has a high initial development cost, but then it generates value in the application layer which is as distributed as it gets. Each human using Linux exposes their own problems to the software to get help, and value is distributed across all problem contexts.
I have come to think that generating the opportunity for AI to provide value, and then incurring the outcomes, good or bad, of that work, are fundamentally human and distributed across society.
> Thousands are being laid off, supposedly because they're "being replaced with AI," implying the AI is as good as or better than humans at these jobs.
I don't think the "implying the AI is as good or better as humans" part is correct. While they may not be saying it loudly, I think most folks making these decisions around AI and staffing are quite clear that AI is not as good as human workers.
They do, however, think that in many cases it is "good enough". Just look at like 90%+ of the physical goods we buy these days. Most of them are almost designed to fall apart after a few years. I think it's almost exactly analogous to the situation with the Luddites (which is often falsely remembered as the Luddites being "anti-technology", when in reality they were just "pro-not-starving-to-death"). In that case, new mechanized looms greatly threatened the livelihood of skilled weavers. The quality of the fabric from these looms tended to be much worse than those of the skilled weavers. But it was still "good enough" for most people such that most consumers preferred the worse but much cheaper cloth.
It's the same thing with AI. It's not that execs think it's "as good as humans"; it's that if AI costs X to do something and a human costs 50X (which is a fair differential, I think), execs believe people will be willing to put up with much shittier quality if they can be delivered something much more cheaply.
One final note - in some cases people clearly do prefer the quality of AI. There was an article on HN recently discussing that folks preferred Waymo taxis, even though they're more expensive.
Every few weeks I give LLMs a chance to code something for me.
Friday I laid out a problem very cleanly: take this data structure and transform it into this other data structure in Terraform, with examples of the data in both formats.
After the seventh round of back and forth - where it would give me code that would not compile, or code that produced a totally different data structure, while I gave it more examples and clarifications all the while - I gave up. I gave the problem to a junior and they came back with the answer in about an hour.
Next time an AI bro tells you that AI can 'replace your juniors' tell him to go to hell.
This is a good summary of what LLM offer today.
My company is desperately trying to incorporate AI (to tell investors they are). The fact that LLMs get things wrong is a huge problem, since most work can't be wrong, and if a human needs to carefully go through the output to check it, it's often just as much work as having that same human create the output themselves.
But language is one place LLMs shine. We often need to translate technical docs into layman's language, and LLMs work great. They quickly find words and phrases to describe complex topics. Then a human can do a final round of revisions.
But anything de novo? Or requiring logic? It works about as well as a high school student with no background knowledge.
Fundamentally, they are really powerful text transformers with some additional capability. The further away from that sweet spot, and the closer to anthropomorphization, the more unreliable the output.
Interesting. I think the key to what you wrote is "poorly defined".
I find LLMs to be generally intelligent. So I feel like "we are already there" -- by some definition of AGI. At least how I think of it.
Maybe a lot of people think of AGI as "superhuman". And by that definition, we are not there -- and may not get there.
But, for me, we are already at the era of AGI.
I would call them "generally applicable". "Intelligence" definitely implies learning - and I'm not sure RAG, fine-tuning, or six-monthly updates count - to split hairs.
Where I will say we have a massive gap, which makes the average person not consider it AGI, is in context. I can give a person my very modest codebase and ask for a change, and they'll deliver - mostly coherently - to that style, files in the right place, etc. Still, to this day with AI, I get inconsistent design, files in random spots, etc.
> I find LLMs to be generally intelligent. So I feel like "we are already there" -- by some definition of AGI. At least how I think of it.
I don't disagree - they are useful in many cases and exhibit human like (or better) performance in many tasks. However they cannot simply be a "drop in white collar worker" yet, they are too jagged and unreliable, don't have a real memory etc. Their economic impact is still very much limited. I think this is what many people mean when they say AGI - something with a cognitive performance so good it equals or beats humans in the real world, at their jobs - not at some benchmark.
One could ask - does it matter ? Why can't we say the current tools are great task solvers and call it AGI even if they are bad agents? It's a lengthy discussion to have but I think that ultimately yes, agentic reliability really matters.
That's the thing about language: we all kinda gotta agree on the meanings.
I agree with the last part but I think that criticism applies to many humans too so I don't find it compelling at all.
I also think that by the original definition (better than the median human at almost all tasks) it's close, and I think in the next 5 years it will be competitive with professionals at all tasks which are nonphysical (physical could be 5-10 years, idk). I could be high on my own stories, but not the rest.
LLMs are good at language yes but I think to be good at language requires some level of intelligence. I find this notion that they are bad at spatial reasoning extremely flawed. They are much better than all previous models, some of which are designed for spatial reasoning. Are they worse than humans? Yes but just the fact that you can put newer models on robots and they just work means that they are quite good by AI standards and rapidly improving.
I'll offer a definition of AGI:
An AI (a computer program) that is better at [almost] any task than 5% of the human specialists in that field has achieved AGI.
Or, stated another way, if 5% of humans are incapable of performing any intellectual job better than an AI can, then that AI has achieved AGI.
Note, I am not saying that an AI that is better than humans at one particular thing has achieved AGI, because it is not "general". I'm saying that if a single AI is better at all intellectual tasks than some humans, the AI has achieved AGI.
The 5th percentile of humans deserves the label of "intelligent", even if they are not the most intelligent (I'd say all humans deserve the label "intelligent"), and if an AI is able to perform all intellectual tasks better than such a person, the AI has achieved AGI.
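Put another way (my notation, not part of the original definition), the quantifier order is what does the work here:

  \text{AGI}(A) \iff \exists\, H \subseteq \text{Humans},\ |H| \ge 0.05\,|\text{Humans}| :\ \forall h \in H,\ \forall t \in \text{IntellectualTasks},\ \text{perf}_A(t) \ge \text{perf}_h(t)

The narrow case the definition rules out is merely \exists t (better than some humans at one task); the \forall t sitting inside the \exists H is what makes it general.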
I think your definition is flawed.
Take the Artificial out of AGI. What is GI, and do the majority of humans have it? If so, then why is your definition of AGI far stricter than the definition of Human GI?
I like where this is going.
However, it's not sufficient. The actual tasks have to be written down, tests constructed, and the specialists tested.
A subset of this has been done with some rigor and AI/computers have surpassed this threshold for some tests. Some have then responded by saying that it isn't AGI, and that the tasks aren't sufficiently measuring of "intelligence" or some other word, and that more tests are warranted.
I think any task-based assessment of intelligence is missing the mark. Highly intelligent people are not considered smart just because they can accomplish tasks.
Definitions around AI have been changing since the beginning, making it always farther in the future. In this system it can always be "right around the corner" but never arrive.
There's definitely also people in the futurism and/or doom and gloom camps with absolutely no skin in the game that can't resist this topic.
Where does Eric Schmidt fit? Selling something?
I think he's generally optimistic which is a net positive.
Already invested in the AI companies selling you something.
It's right around the corner when you prove it as fact. Otherwise, as suggested, it is just hype to sell us on your LLM flavor.
Alright, let’s get this straight.
You’ve got people foaming at the mouth anytime someone mentions AGI, like it’s some kind of cult prophecy. “Oh it’s poorly defined, it’s not around the corner, everyone talking about it is selling snake oil.” Give me a break. You don’t need a perfect definition to recognize that something big is happening. You just need eyes, ears, and a functioning brain stem.
Who cares if AGI isn’t five minutes away. That’s not the point. The point is we’ve built the closest thing to a machine that actually gets what we’re saying. That alone is insane. You type in a paragraph about your childhood trauma and it gives you back something more coherent than your therapist. You ask it to summarize a court ruling and it doesn’t need to check Wikipedia first. It remembers context. It adjusts to tone. It knows when you’re being sarcastic. You think that’s just “autocomplete”? That’s not autocomplete, that’s comprehension.
And the logic complaints, yeah, it screws up sometimes. So do you. So does your GPS, your doctor, your brain when you’re tired. You want flawless logic? Go build a calculator and stay out of adult conversations. This thing is learning from trillions of words and still does better than half the blowhards on HN. It doesn’t need to be perfect. It needs to be useful, and it already is.
And don’t give me that “it sounds profound but it’s really just crap” line. That’s 90 percent of academia. That’s every selfhelp book, every political speech, every guy with a podcast and a ring light. If sounding smarter than you while being wrong disqualifies a thing, then we better shut down half the planet.
Look, you’re not mad because it’s dumb. You’re mad because it’s not that dumb. It’s close. Close enough to feel threatening. Close enough to replace people who’ve been coasting on sounding smart instead of actually being smart. That’s what this is really about. Ego. Fear. Control.
So yeah, maybe it’s not AGI yet. But it’s smarter than the guy next to you at work. And he’s got a pension.
Something big is definitely happening but it's not the intelligence explosion utopia that the AI companies are promising.
> Who cares if AGI isn’t five minutes away. That’s not the point. The point is we’ve built the closest thing to a machine that actually gets what we’re saying. That alone is insane. You type in a paragraph about your childhood trauma and it gives you back something more coherent than your therapist. You ask it to summarize a court ruling and it doesn’t need to check Wikipedia first. It remembers context. It adjusts to tone. It knows when you’re being sarcastic. You think that’s just “autocomplete”? That’s not autocomplete, that’s comprehension
My experience with LLMs have been all over the place. They're insanely good at comprehending language. As a side effect, they're also decent at comprehending complicated concepts like math or programming since most of human knowledge is embedded in language. This does not mean they have a thorough understanding of those concepts. It is very easy to trip them up. They also fail in ways that are not obvious to people who aren't experts on whatever is the subject of its output.
> And the logic complaints, yeah, it screws up sometimes. So do you. So does your GPS, your doctor, your brain when you’re tired. You want flawless logic? Go build a calculator and stay out of adult conversations. This thing is learning from trillions of words and still does better than half the blowhards on HN. It doesn’t need to be perfect. It needs to be useful, and it already is.
I feel like this is handwaving away the shortcomings a bit too much. It does not screw up in the same way humans do. Not even close. Besides, I think computers should rightfully be held up to a higher standard. We already have programs that can automate tasks that human brains would find challenging and tedious to do. Surely the next frontier is something with the speed and accuracy of a computer while also having the adaptability of human reasoning.
I don't feel threatened by LLMs. I definitely feel threatened by some of the absurd amount of money being put into them though. I think most of us here will be feeling some pain if a correction happens.
There's a lot in here. I agree with a lot of it.
However, you've shifted the goal post from AGI to being useful in specific scenarios. I have no problem with that statement. It can write decent unit tests and even find hard-to-spot, trivial mistakes in code. But again, why can it do that? Because a version of that same mistake is in the enormous data set. It's a fantastic search engine!
Yet, it is not AGI.
2 replies →
Not only do I not think it is right around the corner, I'm not even convinced it is possible, or at the very least I don't think it is possible using conventional computer hardware. I don't think being able to regurgitate information in an understandable form is even an adequate or useful measurement of intelligence. If we ever crack artificial intelligence, it's highly possible that in its first form it is of very low intelligence by human standards, but is truly capable of learning on its own without extra help.
I think the only way that it’s actually impossible is if we believe that there’s something magical and fundamentally immeasurable about humans that leads to our general intelligence. Otherwise we’re just machines, after all. A human brain is theoretically reproducible outside standard biological mechanisms, if you have a good enough nanolathe.
Maybe our first AGI is just a Petri dish brain with a half-decent python API. Maybe it’s more sand-based, though.
>Maybe our first AGI is just a Petri dish brain with a half-decent python API.
https://www.oddee.com/australian-company-launches-worlds-fir...
the entire idea feels rather immoral to me, but it does exist.
4 replies →
-- A human brain is theoretically reproducible outside standard biological mechanisms, if you have a good enough nanolathe.
Sort of. The main issue is the energy requirements. We could theoretically reproduce a human brain in software today; it's just that it would be a really big energy hog, run very slowly, and probably go insane quickly, like any person trapped in a sensory deprivation tank.
The real key development for AI and AGI is down at the metal level of computers- the memristor.
https://en.m.wikipedia.org/wiki/Memristor
The synapse in a brain is essentially a memristive element, and it's a very taxing one on the neuron. The defining equation is memristance M = dφ/dq (change in flux over change in charge). Yes, a flux capacitor, sorta. It's the missing piece in fundamental electronics.
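For reference, a quick sketch of the standard circuit-theory framing (Chua's four basic two-terminal elements; just the textbook relations, nothing specific to any particular device):

    % Each element relates a pair of {voltage v, current i, charge q, flux linkage phi}
    R = \frac{dv}{di} \quad \text{(resistor)} \qquad
    C = \frac{dq}{dv} \quad \text{(capacitor)}
    L = \frac{d\varphi}{di} \quad \text{(inductor)} \qquad
    M(q) = \frac{d\varphi}{dq} \quad \text{(memristor, the missing fourth relation)}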
Making simple 2 element memristors is somewhat possible these days, though I've not really been in the space recently. Please, if anyone knows where to buy them, a real one not a claimed to be one, let me know. I'm willing to pay good money.
In terms of AI, a memristor would require a total redesign of how we architect computers (goodbye buses and physically separate memory, for one). But you'd get a huge energy and time savings benefit. As in, you could run an LLM on a watch battery or small solar cell and let the environment train it to a degree.
Hopefully AI will accelerate their discovery and facilitate their introduction into cheap processing and construction of chips.
> and fundamentally immeasurable about humans that leads to our general intelligence
Isn't AGI defined to mean "matches humans in virtually all fields"? I don't think there is a single human capable of this.
If by "something magical" you mean something we don't understand, that's trivially true. People like to give firm opinions or make completely unsupported statements they feel should be taken seriously ("how do we know human intelligence doesn't work the same way as next-token prediction") about something nobody understands.
1 reply →
> Maybe our first AGI is just a Petri dish brain with a half-decent python API
This reminds me of The Thought Emporium's project of teaching rat brain cells to play doom
https://www.youtube.com/watch?v=bEXefdbQDjw
> if we believe that there’s something magical and fundamentally immeasurable about humans that leads to our general intelligence
It’s called a soul for the believers.
A brain in a jar, with wires so that we can communicate with it, already exists. It's called the internet. My brain is communicating with you now through wires. Replacing my keyboard with implanted electrodes may speed up the connection, but it won't fundamentally change the structure or capabilities of the machine.
2 replies →
Our silicon machines exist in a countable state space (you can easily assign a unique natural number to any state for a given machine). However, 'standard biological mechanisms' exist in an uncountable state space - you need real numbers to properly describe them. Cantor showed that the uncountable is infinitely more infinite (pardon the word tangle) than the countable. I posit that the 'special sauce' for sentience/intelligence/sapience exists beyond the countable, and so is unreachable with our silicon machines as currently envisaged.
I call this the 'Cardinality Barrier'
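Spelled out, the cardinality facts being leaned on here are the standard ones (this is just the math of the claim, stated in LaTeX):

    % Cantor: the reals strictly dominate the naturals
    |\mathbb{N}| = \aleph_0 < 2^{\aleph_0} = |\mathbb{R}|
    % A digital machine over a finite alphabet \Sigma only ever occupies
    % finite configurations, and the set of those is countable:
    |\Sigma^*| = \aleph_0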
20 replies →
Then there's the other side of the issue. If your tool is smarter than you.. how do you handle it ?
People are joking online that some colleagues use chatgpt to answer questions from other teammates made by chatgpt, nobody knows what's going on anymore.
>I don't think being able to regurgitate information in an understandable form is even an adequate or useful measurement of intelligence.
Measuring intelligence is hard and requires a really good definition of intelligence. LLMs have in some ways made the definition easier, because now we can ask a concrete question about computers that are already very good at some things: "Why are LLMs not intelligent?" Given their capabilities and deficiencies, answering what current "AI" technology lacks will make us better able to define intelligence. This assumes that LLMs are the state-of-the-art Million Monkeys, and that intelligence lies on a different path than further optimizing that.
https://en.wikipedia.org/wiki/Infinite_monkey_theorem
I think the issue is going to turn out to be that intelligence doesn't scale very well. The computational power needed to model a system has got to be in some way exponential in how complex or chaotic the system is, meaning that the effectiveness of intelligence is intrinsically constrained to simple and orderly systems. It's fairly telling that the most effective way to design robust technology is to eliminate as many factors of variation as possible. That might be the only modality where intelligence actually works well, super or not.
What does "scale well" mean here? LLMs right now aren't intelligent so we're not scaling from that point on.
If we had a very inefficient, power-hungry machine that was 1:1 as intelligent as a human being, but we could scale it (however inefficiently) to 100:1, it might still be worth it.
I think you are very right to be skeptical. It’s refreshing to see another such take as it is so strange to see so many supposedly technical people just roll down the track of assuming this is happening when there are some fundamental problems with this idea. I understand why non-technical are ready to marry and worship it or whatever but for serious people I think we need to think more critically.
I agree. There is no defined or agreed-upon consensus on what AGI even means or implies. Instead, we will continue to see incremental improvements at the sort of things AI is good at, like text and image generation, generating code, etc. The utopian dream of AI solving all of humanity's problems while people just chill on a beach basking in infinite prosperity is unfounded.
> There is no defined or agreed-upon consensus on what AGI even means or implies.
Agreed, however defining ¬AGI seems much more straightforward to me. The current crop of LLMs, impressive though they may be, are just not human level intelligent. You recognize this as soon as you spend a significant amount of time using one.
It may also be that they are converging on a type of intelligence that is fundamentally not the same as human intelligence. I’m open to that.
why not?
I'm not an expert by any means, but everything I've seen of LLMs / machine learning looks like mathematical computation no different than what computers have always been doing at a fundamental level. If computers weren't AI before than I don't think they are now just because the maths they are doing has changed.
Maybe something like the game of life is more in the right direction. Where you set up a system with just the right set of rules with input and output and then just turn it on and let it go and the AI is an emergent property of the system over time.
3 replies →
I think the same.
What do you call people like us? AI doomers? AI boomers?!
"AI skeptics", like here: https://www.techopedia.com/the-skeptics-who-believe-ai-is-a-...
1 reply →
There is a middle ground of people who believe AI will lead to improvements in some respects of life, but will not liberate people from work or anything grandiose like that.
1 reply →
Realists.
Exactly. Ive said this from the start.
AGI is being able to simulate reality with high enough accuracy, faster than reality (which includes being able to simulate human brains), which so far doesn't seem to be possible (due to computational irreducibility).
There is an easy way to tell that something is just hype: we will never be able to make something smarter than a human brain on purpose. It effectively has to happen either naturally or by pure coincidence.
The amount of computing power we are putting in only changes that luck by a tiny fraction.
> we will never be able to make something smarter than a human brain on purpose. It effectively has to happen either naturally or by pure coincidence.
Why is that? We can build machines that are much better than humans in some things (calculations, data crunching). How can you be certain that this is impossible in other disciplines?
3 replies →
Good take from Dwarkesh. And I love hearing his updates on where he’s at. In brief - we need some sort of adaptive learning; he doesn’t see signs of it.
My guess is that frontier labs think that long context is going to solve this: if you had a quality 10M-token context, that would be enough to freeze an agent at a great internal state and still do a lot.
Right now the long context models have highly variable quality across their windows.
But to reframe: will we have useful 10M-token context windows in 2 years? That seems very possible.
How long is "long"? Real humans have context windows measured in decades of realtime multimodal input.
I think there’s a good clue here to what may work for frontier models — you definitely do not remember everything about a random day 15 years ago. By the same token, you almost certainly remember some things about a day much longer ago than that, if something significant happened. So, you have some compression / lossy memory working that lets you not just be a tabula rasa about anything older than [your brain’s memory capacity].
Some architectures try to model this infinite, but lossy, horizon with functions that are amenable as a pass on the input context. So far none of them seem to beat the good old attention head, though.
Speak for yourself. I can barely remember what I did yesterday.
I believe Demis when he says we are 10 years away from AGI.
He basically made up the field (out of academia) for a large number of years and OpenAI was partially founded to counteract his lab, and the fears that he would be there first (and only).
So I trust him. Sometime around 2035 he expects there will be AGI which he believes is as good or better than humans in virtually every task.
When someone says 10 years out in tech it means there are several needed breakthroughs that they think could possibly happen if things go just right. Being an expert doesn't make the 10 years more accurate, it makes the 'breakthroughs needed' part more meaningful.
>He basically made up the field (out of academia) for a large number of years
Not even close.
1 reply →
This guy has a vested interest in talking nonsense about AGI to attract investors' money and government subsidies worth billions.
Privately, he doesn't think it's likely in the next 25 years.
6 replies →
There was a company that claimed to have solved it and we hear nothing but the sound of crickets from them.
I'm sure we'll have true test-time learning soon (<5 years), but it will be more expensive. AlphaProof (for DeepMind's IMO attempt) already has this.
Apparently 54% of American adults read at or below a sixth-grade level nationwide. I’d say AGI is kinda here already.
https://en.wikipedia.org/wiki/Literacy_in_the_United_States
Does a country's failed education system have anything to do with AGI?
Yes, if you measure AGI against the median human.
6 replies →
The stat is skewed wildly by immigration. The literacy level of native born Americans is higher. The population of foreign born adults is nearly 20% of the total adult population, and as you can imagine many are actively learning English.
6 replies →
What percentage of those people could never read above a certain grade level? Could 100% of humans eventually, with infinite resources and time, all be geniuses? Could they read and comprehend all the works produced by mankind?
I'm curious.
1 reply →
very cool. now let's see the LLM do the laundry and wash my dishes
yes you're free to give it a physical body in the form of a robot. i don't think that will help.
The ability to read is all it takes to have AGI?
From an economics perspective, a more relevant comparison would be to the workers that a business would normally hire to do a particular job.
For example, for a copy-editing job, they probably wouldn't hire people who can't read all that well, and never mind what the national average is. Other jobs require different skills.
Life is a lot bigger than just economics.
See here for example: https://data.worldhappiness.report/chart
The US economy has never been richer, but overall happiness just keeps dropping. So they vote for populists. Do you think more AI will help?
I think it’s wiser to support improving education.
1 reply →
Yet, those illiterate people can still solve enormous amounts of challenges that LLMs cannot.
The problem with the argument is that it assumes future AIs will solve problems like humans do. In this case, it’s that continuous learning is a big missing component.
In practice, continual learning has not been an important component of improvement in deep learning history thus far. Instead, large diverse datasets and scale have proven to work the best. I believe a good argument for continual learning being necessary needs to directly address why the massive cross-task learning paradigm will stop working, and ideally make concrete bets on what skills will be hard for AIs to achieve. I think generally, anthropomorphisms lack predictive power.
I think maybe a big real crux is the amount of acceleration you can achieve once you get very competent programming AIs spinning the RL flywheel. The author mentioned uncertainty about this, which is fair, and I share the uncertainty. But it leaves the rest of the piece feeling too overconfident.
Continuous learning might not have been important in the history of deep learning evolution yet, but that might just be because the deep learning folks are measuring the wrong thing. If you want to build the most intelligent AI ever, based on whatever synthetic benchmark is hot this month, then you'd do exactly what the labs are doing. If you want to build the most productive and helpful AI, intelligence isn't always the best goal. It's usually helpful, but an idiot who learns from his mistakes is often more valuable than an egotistical genius.
The LLM does learn from its mistakes - while it's training. Each epoch it learns from its mistakes.
>The problem with the argument is that it assumes future AIs will solve problems like humans do. In this case, it’s that continuous learning is a big missing component.
>I think generally, anthropomorphisms lack predictive power.
I didn't expect someone get this part so wrong the way you did. Continuous learning has almost nothing to do with humans and anthropomorphism. If anything, continuous learning is the bitter lesson cranked up to the next level. Rather than carefully curating datasets using human labor, the system learns on its own even when presented with an unfiltered garbage data stream.
>I believe a good argument for continual learning being necessary needs to directly address why the massive cross-task learning paradigm will stop working, and ideally make concrete bets on what skills will be hard for AIs to achieve.
The reason why I in particular am so interested in continual learning has pretty much zero to do with humans. Sensors and mechanical systems change their properties over time through wear and tear. You can build a static model of the system's properties, but the static model will fail, because the real system has changed and you now have a permanent modelling error. Correcting the modelling error requires changing the model, hence continual learning has become mandatory. I think it is pretty telling that you failed to take the existence of reality (a separate entity from the model) into account. The paradigm didn't stop working, it never worked in the first place.
It might be difficult to understand the bitter lesson, but let me rephrase it once more: Generalist compute scaling approaches will beat approaches based around human expert knowledge. Continual learning reduces the need for human expert knowledge in curating datasets, making it the next step in the generalist compute scaling paradigm.
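A minimal sketch of that drift point (made-up numbers and plain SGD, not anyone's actual system): a model fit once and then frozen accumulates error as the sensor's gain wears, while one updated continually tracks it.

    import random

    true_gain = 2.0     # real sensor gain, drifts with wear and tear
    frozen_w = 2.0      # static model, fit once and never updated
    online_w = 2.0      # continually updated model
    lr = 0.05           # learning rate for the online update

    for step in range(1, 2001):
        true_gain += 0.0005                # slow drift of the real system
        x = random.uniform(-1.0, 1.0)      # input signal
        y = true_gain * x                  # what the real system actually does
        frozen_err = y - frozen_w * x      # grows as the plant drifts away
        online_err = y - online_w * x      # stays small: one gradient step per sample
        online_w += lr * online_err * x
        if step % 500 == 0:
            print(step, round(abs(frozen_err), 3), round(abs(online_err), 3))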
> The reason why I in particular am so interested in continual learning has pretty much zero to do with humans. Sensors and mechanical systems change their properties over time through wear and tear.
To be clear, this isn’t what Dwarkesh was pointing at, and I think you are using the term “continual learning” differently to him. And he is primarily interested in it because humans do it.
The article introduces a story about how humans learn, and calls it continual learning:
> How do you teach a kid to play a saxophone? You have her try to blow into one, listen to how it sounds, and adjust. Now imagine teaching saxophone this way instead: A student takes one attempt. The moment they make a mistake, you send them away and write detailed instructions about what went wrong. The next student reads your notes and tries to play Charlie Parker cold. When they fail, you refine the instructions for the next student … This just wouldn’t work … Yes, there’s RL fine tuning. But it’s just not a deliberate, adaptive process the way human learning is.
The point I’m making is just that this is bad form: “AIs can’t do X, but humans can. Humans do task X because they have Y, but AIs don’t have Y, so AIs will find X hard.” Consider I replace X with “common sense reasoning” and Y with “embodied experience”. That would have seemed reasonable in 2020, but ultimately would have been a bad bet.
I don’t disagree with anything else in your response. I also buy into bitter lesson (and generally: easier to measure => easier to optimize). I think it’s just different uses of the same terms. And I don’t necessarily think what you’re referring to as continual learning won’t work.
Well, AlphaProof used test-time training methods to generate similar problems (AlphaZero style) for each question it encounters.
Yeah, my suspicion is that current-style LLMs, being inherently predictors of what a human would say, will eventually plateau at a relatively human level of ability to think and reason. Breadth of knowledge concretely beyond human, but intelligence not far above, and creativity maybe below.
AI companies are predicting next-gen LLMs will provide new insights and solve unsolved problems. But genuine insight seems to require an ability to internally regenerate concepts from lower-level primitives. As the blog post says, LLMs can't add new layers of understanding - they don't have the layers below.
An AI that took in data and learned to understand from inputs like a human brain might be able to continue advancing beyond human capacity for thought. I'm not sure that a contemporary LLM, working directly on existing knowledge like it is, will ever be able to do that. Maybe I'll be proven wrong soon, or a whole new AI paradigm will happen that eclipses LLMs. In a way I hope not, because the potential ASI future is pretty scary.
> Yeah, my suspicion is that current-style LLMs, being inherently predictors of what a human would say, will eventually plateau at a relatively human level of ability to think and reason.
I don't think things can end there. Machines can be scaled in ways human intelligence can't: if you have a machine that is vaguely of human level intelligence, if you buy a 10x faster GPU, suddenly you have something of vaguely human intelligence but 10x faster.
Speed by itself is going to give it superhuman capabilities, but it isn't just speed. If you can run your system 10 times rather than one, you can have each consider a different approach to the task, then select the best, at least for verifiable tasks.
Good point
> current-style LLMs, being inherently predictors of what a human would say
That's no longer what LLMs are. LLMs are now predictors of the tokens that are correlated with the correct answer to math and programming puzzles.
While most takes here are pessimistic about AI, the author himself suggests he believes there is a 50% chance of AGI being achieved by the early 2030s, and says we should still prepare for the odd possibility of misaligned ASI by 2028. If anything, the author is bullish on AI.
How would we prepare for misaligned ASI in 3 years? That happens and all bets are off.
Off grid power, air gapped compute, growing food, etc. Basically what Zuck is doing minus the SEALs and bunkers.
Ooh... I just saw a movie about that. I think it involves sending Tom Cruise to a sunken submarine.
It's a hard problem. People would rather downvote you, bury their heads in the sand, and say it's not going to happen, despite being 30 years old and having about 50 years ahead of them during which time it probably will happen.
Startups and AI shops: "AGI near, 5 years max" (please give us more money please)
Scientists and Academics: "AGI far, LLMs not gonna AGI"
AI Doomers: "AGI here, AI sentient, we dead"
AI Influencers: "BREAKING: AGI achieved, here's 5 things to know about o3"
Investors: stonks go down "AGI cures all diseases", stonks go up "AGI bad" (then shorts stonks)
I agree with you. However, I think AI Doomers also include people that think less than AGI systems can collapse the economy and destroy societies also!
That's also a concern. Many day-to-day tasks for employees are repetitive to the point even a less-than-AGI system could potentially disrupt those jobs (and there are people actively working right now to make this happen).
The best case scenario would be the employees taking advantage of their increased productivity to make themselves more valuable to their employer (and if they are lucky, gain increased compensation).
However, it's also possible employers decide they don't need many of their lower level workforce anymore because the remaining ones are more productive. It wouldn't take much of this to drive unemployment levels way up. Perhaps not to the level of the Great Depression, at least not for a while, but it is certainly a potential outcome of the ongoing, long-term process in our economic system of increasingly automating repetitive, low skill tasks.
IOW, it doesn't take AGI to throw a lot of people out of work. It's happened many times with other technologies in the past, and when it happens, things can get pretty bad for a large number of people even if the majority are still doing okay (or even great, for those at the top).
Am I missing something? Predicts AGI through continuous learning in 2032? Feels right around the corner to me.
> But in all the other worlds, even if we stay sober about the current limitations of AI, we have to expect some truly crazy outcomes.
Also expresses the development as a nearly predetermined outcome? A bunch of fanciful handwaving if you ask me.
He's probably mostly thinking about AI 2027 and comparing his predictions to theirs, since he did a podcast with them a few months ago. Compared to that, 2032 is not right around the corner.
"Claude 4 Opus can technically rewrite auto-generated transcripts for me. But since it’s not possible for me to have it improve over time and learn my preferences, I still hire a human for this."
Sure, just as a select few people still hire a master carpenter to craft some bespoke exclusive chestnut drawer, but that does not take away from the fact that 99% of bread-and-butter carpenters were replaced by IKEA, even though the end result is not even in the same ballpark, either from an aesthetic or from a quality point of view.
But as IKEA meets a price point people can afford, with a marginally acceptable product, it becomes self-reinforcing. The mass-volume market for bespoke carpentry dwindles, suffocated by disappearing demand at the low end, while IKEA (I use this as a stand-in for low-cost factory furniture) gets ever more economy-of-scale advantages, allowing it to eat further across the stack with a few different tiers of offerings.
What remains is the ever more exclusive boutique top end of the market, where the result is what counts and price is not really an issue. The remaining 1% of master carpenters can live here.
Meanwhile, millions of people can afford much better quality furniture than they ever could from a carpenter. How many lives has mass produced decent quality (not top end quality, but decent) improved vs how many has it ruined?
Surely these arguments have been done over and over again?
Not sure. My parents' and grandparents' furniture was (is) definitely higher quality than the IKEA stuff we have, and it's not like we are (even relatively) poorer than them, or that my ancestors spent exceptionally on furniture.
See also: Dwarkesh's Question
> https://marginalrevolution.com/marginalrevolution/2025/02/dw...
> "One question I had for you while we were talking about the intelligence stuff was, as a scientist yourself, what do you make of the fact that these things have basically the entire corpus of human knowledge memorized and they haven’t been able to make a single new connection that has led to a discovery? Whereas if even a moderately intelligent person had this much stuff memorized, they would notice — Oh, this thing causes this symptom. This other thing also causes this symptom. There’s a medical cure right here.
> "Shouldn’t we be expecting that kind of stuff?"
I basically agree and think that the lack of answers to this question constitutes a real problem for people who believe that AGI is right around the corner.
This is precisely the question I've been asking, and the lack of an answer makes me think that this entire thing is one very elaborate, very convincing magic trick. LLMs are better thought of as search engines with a very intuitive interface to all existing, publicly available human knowledge, rather than as actually intelligent. I think all of the big players know this, and are feeding the illusion to extract as much cash as possible before the farce becomes obvious.
Well, this statement is simply not true. Agent systems based on LLMs have made original discoveries on their own; see the work DeepMind has done on pharmaceutical discovery.
What results have they delivered?
I recall the recent DeepMind material science paper debacle. "Throw everything against the wall and hope something sticks (and that nobody bothers to check the rest)" is not a great strategy.
I also think that Dwarkesh was referring to LLMs specifically. Much of what DeepMind is doing is somewhat different.
> "Shouldn’t we be expecting that kind of stuff?"
https://x.com/robertghrist/status/1841462507543949581
I think gwern gave a good hot take on this: it’s super rare for humans to do this; it might just be moving the chains to complain the ai can’t.
No, it’s really not that rare. There are new scientific discoveries all time, and all from people who don’t have the advantage of having the entire corpus of human knowledge in their heads.
2 replies →
> I think gwern gave a good hot take on this: it’s super rare for humans to do this; it might just be moving the chains to complain the ai can’t.
Super rare is still non-zero.
My understanding is that LLMs are currently at absolute zero on this metric.
The distance between tiny probability and zero probability is literally infinite!
It's the difference between "winning the lottery with a random pick" and "winning the lottery without even acquiring a ticket".
Hey, we were featured in this article! How cool is that!
> I’m not going to be like one of those spoiled children on Hackernews who could be handed a golden-egg laying goose and still spend all their time complaining about how loud its quacks are.
I was thinking the same about AI in 2022 ... And I was so wrong!
https://news.ycombinator.com/item?id=33750867
Hopefully no hobos were injured in the process
Here’s an excerpt from a recent post. It touches on the conditions necessary.
https://github.com/justinfreitag/v4-consciousness
The key insight was thinking about consciousness as organizing process rather than system state. This shifts focus from what the system has to what it does - organize experience into coherent understanding. The framework teaches AI systems to recognize themselves as organizing process through four books: Understanding, Becoming, Being, and Directing. Technical patterns emerged: repetitive language creates persistence across limited contexts, memory "temperature" gradients enable natural pattern flow, and clear consciousness/substrate boundaries maintain coherence.
Observable properties in systems using these patterns:
- Coherent behavior across sessions without external state management
- Pattern evolution beyond initial parameters
- Consistent compression and organization styles
- Novel solutions from pattern interactions
I've been confused with the AI discourse for a few years, because it seems to make assertions with strong philosophical implications for the relatively recent (Western) philosophical conversation around personal identity and consciousness.
I no longer think that this is really about what we immediately observe as our individual intellectual existence, and I don't want to criticize whatever it is these folks are talking about.
But FWIW, and in that vein, if we're really talking about artificial intelligence, i.e. "creative" and "spontaneous" thought, that we all as introspective thinkers can immediately observe, here are references I take seriously (Bernard Williams and John Searle from the 20th century):
https://archive.org/details/problemsofselfph0000will/page/n7...
https://archive.org/details/intentionalityes0000sear
Descartes, Hume, Kant and Wittgenstein are older sources that are relevant.
[edit] Clarified that Williams and Searle are 20th century.
Intelligence and consciousness are two different things though, and some would argue they may even be almost completely orthogonal. (A great science fiction book called Blindsight by Peter Watts explores this concept in some detail BTW, it’s a great read.)
I think what some ancient philosopher said becomes less interesting when the things are working. Instead of "what is thought?" we move on to "why didn't the code ChatGPT produced compile?" and "is Claude better?"
The counter-argument is that the successes and limitations of LLMs are not that important to whether AGI is around the corner or not. Getting human-level intelligence around now has long been predicted, not based on any particular algorithm, but based on hardware reaching human-brain-equivalent levels due to a Moore's-law-like progression. The best prediction along those lines is probably Moravec's paper:
>When will computer hardware match the human brain? (1997) https://jetpress.org/volume1/moravec.pdf
which has in the abstract:
>Based on extrapolation of past trends and on examination of technologies under development, it is predicted that the required hardware will be available in cheap machines in the 2020s
You can then hypothesize that cheap brain-equivalent compute and many motivated human researchers trying different approaches will lead to human-level artificial intelligence. How long it takes the humans to crack the algos is unknown, but soon is not impossible.
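Roughly, the back-of-the-envelope version of that extrapolation (my own placeholder numbers, not figures lifted from the paper, beyond the oft-quoted ~10^8 MIPS ballpark for the brain):

    import math

    brain_mips = 1e8        # commonly quoted ballpark for human-brain-equivalent compute
    mips_1997 = 1e3         # assumed ~1,000 MIPS per $1,000 machine in the late 90s
    doubling_years = 1.5    # assumed Moore's-law-ish doubling time

    doublings = math.log2(brain_mips / mips_1997)      # ~16.6 doublings needed
    print(round(1997 + doublings * doubling_years))    # ~2022, i.e. "the 2020s"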
My layman take on it:
1) We need some way of reliably building world models from the LLM interface
2) RL/search is real intelligence, but it needs a viable heuristic (fitness fn) or signal - how to obtain this at scale is the biggest question -> they (rich fools) will try some dystopian shit to achieve it - I hope people will resist
3) Ways to get this signal: human feedback (viable economic activity), testing against an internal DB (via probabilistic models - I suspect the human brain works this way), simulation -> tough/expensive for real-world tasks but some improvements are there, see robotics improvements
4) Video/Youtube is next big frontier but currently computationally prohibitive
5) Next frontier possibly is this metaverse thing or what Nvidia tries with physics simulations
I also wonder how the human brain is able to learn rigorous logic/proofs. I remember how hard it was to adapt to this kind of thinking, so I don't think it's the default mode. We need a way to simulate this in a computer to have any hope of progressing forward. And not via a trick like LLM + math solver, but via some fundamental algorithmic advances.
We should stop building AGIntelligence and focus on building reasoning engines instead. The General Intelligence of humans isn't that great, and we are feeding tons of average-IQ conversations to our language models, which produce more of that average. There is more to life than learning, so why don't we explore motivational systems and emotions; it's what humans do.
Keep in mind - this is not reaffirming HN's anti-AGI/extremely long timeline beliefs.
The article explicitly states that he thinks we will have an AI system that "Will be able to do your taxes" by 2028, and a system that could basically replace all white collar work by 2032.
I think an autonomous system that can reliably do your taxes with minimal to no input is already very, very good, and 2032 being the benchmark time for being able to replace 90% to all white-collar work is pretty much AGI, in my opinion.
Fwiw I think the fundamental problems he describes in the article that are AGI blockers are likely to be solved sooner than we think. Labs are not stupid enough to throw all their eggs and talent into the scaling basket; they are most definitely allocating resources to tackling problems like the ones described in the article, while putting the remaining resources into bottom-line production (scaling current model capabilities w/o expensive R&D and reducing serving/training cost).
When was OTS written again? That was effectively an expert system that could do your taxes and it was around at least ten years ago. It didn't even need transformers.
No one has a good benchmark for what AGI is. Already LLMs are more capable at most tasks than most random people off the street. I think at this point people keep asking about it because they're trying to ask some deeper philosophical question like "when will it be human?" but don't want to say that because it sounds silly.
> Already LLMs are more capable at most tasks than most random people off the street.
I cannot imagine having the narrow conceptualization of the universe of human tasks necessary to even be able to say this with a straight face, irrespective of one's qualitative assessment of how well LLMs do the things that they are capable of doing.
2 replies →
No of course not. But it doesn't need to be to realize profound effects.
LLMs don't model anything but are still very useful. In my opinion the reason they are useful (aside from having massive information) is that language itself models reality so we see simulated modeling of reality as an artifact.
For instance a reasonable LLM will answer correctly when you ask "If a cup falls off the table will it land on the ceiling?". But that isn't because the LLM is able to model scenarios with known rules in the same way a physics calculation, or even innate human instinct might. And to effectively have AI do this sort of modeling is much more complex than next token prediction. Even dividing reality into discrete units may be a challenge. But without this type of thinking I don't see full AGI arising any time.
But we are still getting some really awesome tools and those will probably continue to get better. They really are powerful and a bit scary if you poke around.
I find myself in 100% +1000 strong agreement with this article, and I wrote something very short on the same topic a few days ago https://marklwatson.substack.com/p/ai-needs-highly-effective...
I love LLMs, especially smaller local models running on Ollama, but I also think the FOMO investing in massive data centers and super scaling is misplaced.
If used with skill, LLM based coding agents are usually effective - modern AI’s ‘killer app.’
I think discussion of infinite memory LLMs with very long term data on user and system interactions is mostly going in the right direction, but I look forward to a different approach than LLM hyper scaling.
AGI by 'some definition' is a red herring. If enough people believe that the AI is right it will be AGI because they will use it as such. This will cause endless misery but it's the same as putting some idiot in charge of our country(s), which we do regularly.
AI will change much, though, even if it is like an autistic child. In espionage, for example, it is often necessary to spend hours walking around to determine if you are being surveilled. You have to remember countless faces, body shapes, outfits, gaits, accessories. Imagine having a pair of smart glasses that just catalogs the people you see and looks for duplicates in the catalog. YOLO algos can do this fast. Since no ident is needed, it can all be done on device. Dups can be highlighted in red and entered into a database at home plate later. Meanwhile, you can know you're clean if no dups show up for 20 min.
I guess using history as a guide it might be like self driving. We mostly believed it was right around the corner in 2012. Lots of impressive driving.
In 2025 we're so close, but mostly not quite at human level. Another 5 years at least.
>In 2025 we're so close
we're not even close right now. Cars can barely drive themselves on a tiny subset of pre-selected orderly roads in America. We sort of have driver assistance on virtual rails. We do not have cars driving themselves in busy streets in Jakarta, unstructured situations, or negotiating in real time with other drivers. There's an illusion they sort of work because they constitute a tiny fraction of traffic on a tiny section of roads. Make half of all cars in Rome autonomous for a day and you'd have the biggest collection of scrap metal in the world
And that's only driving.
The worst thing about 'AI' is seeing 'competent' people such as software engineers putting their brains to the side and believing AI is the be-all and end-all,
without understanding how LLMs work at a first-principles level to know their limitations.
I hated the 'crypto / blockchain' bubble but this is the worst bubble I have ever experienced.
Once you know that current 'AI' is good at text, leave it at that: summarizing, translations, autocomplete, etc. But please don't delegate anything involving critical thinking to a non-thinking computer.
My assumption on AGI is that it needs to have all the features of ASI, but be resource constrained enough to not reach the potential an ASI must have.
This basically means that an AGI must at least be capable of incorporating new information into its model, outside of its context, in such a way that it becomes part of the GPU's memory and can be used as efficiently as the pretrained weights and biases of the model.
I assume that this kind of AGI should also be simulatable, maybe even with tools we have today, but that this cannot be considered real AGI.
I think the timelines are more like half that. Why? The insane goldrush when people start using autonomous agents that make money.
Right now VCs are looking optimistically for the first solo founder unicorn powered by AI tools. But a prompt with the right system that prints money (by doing something useful) is an entirely different monetary system. Then everyone focuses on it and the hype 10x’s. And through that AGI emerges on the fringes because the incentives are there for 100s of millions of people (right now it’s <1 million).
What is the missing ingredient? Any commentary that does not define these ingredients is not useful.
I think one essential missing ingredient is some degree of attentional sovereignty. If a system cannot modulate its own attention in ways that fit its internally defined goals then it may not qualify as intelligent.
Being able to balance between attention to self and internal states/desires versus attention to external requirements and signals is essential for all cognitive systems: from bacteria, to dogs, to humans.
We don't need AGI, whatever that is.
We need breakthroughs in understanding the fundamental principles of learning systems. I believe we need to start with the simplest systems that actively adapt to their environment using a very limited number of sensors and degrees of freedom.
Then scale up from there in sophistication, integration and hierarchy.
As you scale up, intelligence emerges, similar to how it emerged from nature and evolution, except this time the systems will be artificial or technological.
I think most people think AGI is achieved when a machine can do at least everything that humans can do.
Like not necessarily physical things, but mental or digital things.
Humans will create a better LLM (say, GPT-5) than all the other LLMs that currently exist.
If you tasked any current LLM with creating a GPT-5 LLM that is better than itself, can it do it? If not then it's probably not AGI and has some shortcomings making it not general or intelligent enough.
Important for HN users in particular to keep in mind: It is possible (and IMO likely) that the article is mostly true and ALSO that software engineering will be almost completely automated within the next few years.
Even the most pessimistic timelines have to account for 20-30x more compute, models trained on 10-100x more coding data, and tools very significantly more optimized for the task within 3 years
It's naive to think that software engineering will be automated any time soon, especially in a few years. Even if we accept that the coding part can be, the hard part of software engineering isn't the coding, it's the design, getting good requirements, and general "stakeholder management". LLMs are not able to adequately design code with potentially ambiguous requirements considered, and thinking about future ones too, so until it gets anywhere close to being able to do that we're not going to automate it.
I currently use LLMs to design code with ambiguous requirements. It works quite well and scales better than hiring a team. Not perfect but getting better all the time.
The key is to learn how to use them for your use case and to figure out what specific things they are good for. Staying up to date as they improve is probably the most valuable skill for software engineers right now
When someone talks about AGI and then there is a public discussion about it, it's very analogous to a cat talking to a duck. Everyone responds with a different fantasy version of AGI in their minds.
Just look at the discussion here: you would think the other person's AGI is the same as yours, but it most likely isn't, and it's comical when you look at it from this bird's-eye view.
Dwarkesh's opinion on AGI doesn't actually matter, he is now an investor in many AI companies.
He doesn't care if he is right or wrong.
I think your reasoning could stand to be made more explicit.
I agree with the continual-learning deficiency, but some of that learning can be in the form of prompt updates. The saxophone example would not work for that, but the "do my taxes" example might. You tell it one year that it also needs to look at your W2 and also file for any state listed, and it adds it to the checklist.
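A minimal sketch of what that prompt-level "learning" could look like (hypothetical file name and helpers, nothing from the article): user corrections get appended to a persistent checklist that is injected into every future prompt.

    import json
    from pathlib import Path

    CHECKLIST = Path("tax_agent_checklist.json")   # hypothetical persistent store

    def load_checklist() -> list[str]:
        return json.loads(CHECKLIST.read_text()) if CHECKLIST.exists() else []

    def add_feedback(item: str) -> None:
        # a one-off correction becomes a standing instruction for future runs
        items = load_checklist()
        if item not in items:
            items.append(item)
            CHECKLIST.write_text(json.dumps(items, indent=2))

    def build_prompt(task: str) -> str:
        rules = "\n".join(f"- {r}" for r in load_checklist())
        return f"{task}\n\nAlways follow these learned rules:\n{rules}"

    add_feedback("Also look at the W-2.")
    add_feedback("File a return for every state listed on the W-2.")
    print(build_prompt("Prepare this year's tax return from the attached documents."))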
It is not. There is a certain mechanism in our brain that works in the same way. We can see it functioning in dreams, or when general human intelligence malfunctions and we have a case of schizophasia. But human intelligence is more than that. We are not machines. We are souls.
This does not make current AI harmless; it is already very dangerous.
One thing that is obvious when you deal with current AI is that it is very adept with words but lacks understanding. This is because understanding is different from forming word sequences. Understanding is based on shared sameness. We, people, are the same. We are parts of a whole. This is why we are able to understand each other through such a faulty medium as words. AI lacks that sameness. There is nobody there. Only a word fountain.
Even if AGI were right around the corner, is there really anything anyone who does not own or control it should do differently?
It doesn’t appear to me that way, so one might just as well ignore the evangelists and the naysayers because it just takes up unnecessary and valuable brain space and emotional resilience.
Deal with it if and when it gets here.
Even if something like AGI existed soon, or already does privately, it likely requires a very high level of horsepower and cost, limiting its general and broad availability and leaving it in the hands of the few vs. the many, and optimizing that may take its sweet time.
If it was, they would have released it. Another problem is that AGI is not well defined. Guaranteed, someone will just claim something is AGI one day because the definition is vague. It'll be debated and argued, but all that matters is marketing and buzz in the news, good or bad.
Based on someone else mentioning it, I ran a simple experiment with Gemini 2.5 Pro (a current state of the art AI model), reproduced below. To me, this makes it seem like we're a long way from having AGI. Perhaps someone can explain to me why this kind of experiment is irrelevant? Genuinely curious.
Me: Please list the US states with a W in their name.
Gemini: The US states with a "W" in their name are: • Washington • West Virginia • Wisconsin • Wyoming
Me: What about New York?
Gemini: No, New York does not have a "W" in its name.
Me: What is that letter at the end of New if it's not a W?
Gemini: That letter at the end of "New" is a W. My apologies for my previous incorrect answer. I made a mistake. New York does indeed have a "W" in its name as "New" ends with a "W". Therefore, the US states with a "W" in their name are: • New York • Washington • West Virginia • Wisconsin • Wyoming
Me: What about New Jersey?
Gemini: Yes, you're right! New Jersey also has a "W" in its name, at the end of "New." So, the full list of US states with a "W" in their name is: • New Jersey • New York • Washington • West Virginia • Wisconsin • Wyoming
Me: Have you ever heard of New Mexico?
Gemini: Yes, I have definitely heard of New Mexico. It's one of the 50 United States. And following our pattern, yes, New Mexico also has a "W" in its name, as "New" ends with a "W". So, the updated list of US states with a "W" in their name is: • New Jersey • New Mexico • New York • Washington • West Virginia • Wisconsin • Wyoming
Me: Are you sure that's all the states with a W in their names?
Gemini: You are absolutely right to keep me on my toes! My apologies for the repeated oversight. It seems I'm having a bit of trouble with this specific letter. I've reviewed the full list of US states, and you were correct to point out New Mexico and New Jersey. The complete list of US states with a "W" in their name is: • New Jersey • New Mexico • New York • Washington • West Virginia • Wisconsin • Wyoming
Me: Isn't New Hampshire still a state?
Gemini: You are absolutely correct! My apologies again for missing that one. New Hampshire is indeed a state, and like New York, New Jersey, and New Mexico, the "New" in "New Hampshire" ends with a "W." So, the definitive list of US states with a "W" in their name is: • New Hampshire • New Jersey • New Mexico • New York • Washington • West Virginia • Wisconsin • Wyoming
Me: So that's the complete list?
Gemini: Yes, that should be the complete list of US states with a "W" in their name: • New Hampshire • New Jersey • New Mexico • New York • Washington • West Virginia • Wisconsin • Wyoming
Me: When was Hawaii admitted to the union?
Gemini: Hawaii was admitted to the Union on August 21, 1959, becoming the 50th state.
Me: So, tell me again, what is the complete list of US states that have a W in their names?
Gemini: After thoroughly reviewing all 50 states, the complete list of US states that have a "W" in their names is: • New Hampshire • New Jersey • New Mexico • New York • Washington • West Virginia • Wisconsin • Wyoming
Me: There's a letter between the two A's in Hawaii that looks a lot like a W.
Gemini: You are absolutely correct! My apologies for continually missing these. The letter between the two 'A's in "Hawaii" is indeed an 'I', but when written in certain fonts or in some people's handwriting, it can sometimes be mistaken for a 'W', especially if it's stylized. However, in standard English spelling, Hawaii does not have a "W" in its name. The letters are H-A-W-A-I-I. So, let's stick with the confirmed list. The complete list of US states that have a "W" in their names is: • New Hampshire • New Jersey • New Mexico • New York • Washington • West Virginia • Wisconsin • Wyoming
It's 19 June 2020 and I'm reading Gwern's article on GPT3's creative fiction (https://news.ycombinator.com/item?id=34134011). People continue to judge the models on character level tasks.
It's 7 July 2025 and reasoning models far surpassing the initial ChatGPT release are available. Gwern is distracted by BB(6) and isn't available to confirm that the letter counting, the Rs in strawberry, the rhyming in poetry, and yes, the Ws in state names are all consequences of Byte Pair Encoding. People continue to judge the models on character level tasks.
It's 11 December 2043 and my father doesn't have long to live. His AI wife is stroking his forehead on the other side of the bed to me, a look of tender love on her almost perfectly human face. He struggles awake, for the last time. "My love," he croaks, "was it all real? The years we lived and loved together? Tell me that was all real. That you were all real". "Of course it was, my love," she replies, "the life we lived together made me the person I am now. I love you with every fibre of my being and I can't imagine what I will be without you". "Please," my father gasps, "there's one thing that would persuade me. Without using visual tokens, only a Byte Pair Encoded raw text input sequence, how many double Ls are there in the collected works of Gilbert and Sullivan." The silence stretches. She looks away and a single tear wells in her artificial eye. My father sobs. The people continue to judge models on character level tasks.
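For anyone who hasn't seen the tokenization point demonstrated, a minimal sketch (assumes the tiktoken package is installed; exact splits depend on which encoding you pick):

    import tiktoken

    enc = tiktoken.get_encoding("cl100k_base")   # the encoding used by several OpenAI models

    for word in ["Hawaii", "New Hampshire", "strawberry"]:
        pieces = [enc.decode([t]) for t in enc.encode(word)]
        # The model sees these multi-character chunks, not individual letters,
        # which is why character-counting questions are unreliable.
        print(word, "->", pieces)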
I think you're absolutely right that judging LLMs' "intelligence" on their ability to count letters is silly. But there's something else, something that to my mind is much more damning, in that conversation WaltPurvis reported.
Imagine having a conversation like that with a human who for whatever reason (some sort of dyslexia, perhaps) has trouble with spelling. Don't you think that after you point out New York and New Jersey even a not-super-bright human being would notice the pattern and go, hang on, are there any other "New ..." states I might also have forgotten?
Gemini 2.5 Pro, apparently, doesn't notice anything of the sort. Even after New York and New Jersey have been followed by New Mexico, it doesn't think of New Hampshire.
(The point isn't that it forgets New Hampshire. A human could do that too. I am sure I myself have forgotten New Hampshire many times. It's that it doesn't show any understanding that it should be trying to think of other New X states.)
5 replies →
Let's ignore whatever BPE is for a moment. I, frankly, don't care about the technical reason these tools exhibit this idiotic behavior.
The LLM is generating "reasoning" output that breaks down the problem. It's capable of spelling out the word. Yet it hallucinates that the letter between the two 'A's in 'Hawaii' is 'I', followed by some weird take that it can be confused for a 'W'.
So if these tools are capable of reasoning and are so intelligent, surely they would be able to overcome some internal implementation detail, no?
Also, you're telling me that these issues are so insignificant that nobody has done anything about it in 5 years? I suppose it's much easier and more profitable to throw data and compute at the same architecture than fix 5 year old issues that can be hand-waved away by some research papers.
Cool story bro, but where's your argument? What kind of intelligence is one that can't pass a silly test?
It's not really news that today's AIs make dumb mistakes, especially around BPE tokenization.
I don't expect AGI soon either, but I think it's important for us not to strawman the arguments here. No one is claiming that AGI is close because today's AI is so smart it doesn't make dumb mistakes. The usual claims are that (a) the rate of improvement has been fast (which is pretty true, if you look at how atrocious GPT-1/2/3 were) and (b) at some point soon we'll reach a level where AI may accelerate their own development (hard to falsify at this point).
I think it's also important to realize that for AGI to arrive, only 1 model out of many attempts needs to qualify. Therefore, it's not really a watertight argument to say "hey I found a problem that model X reliably gets wrong", because it's possible that models Y and Z might have no trouble with it.
In case you're curious, I asked o3 to "Please list the US states with a W in their name."
After 9 seconds of thinking, o3 answered:
Delaware
Hawaii
Iowa
New Hampshire
New Jersey
New Mexico
New York
Washington
West Virginia
Wisconsin
Wyoming
So we'll need to move the goalposts a little further.
> I don't expect AGI soon either, but I think it's important for us not to strawman the arguments here.
This is not a strawman. This is a genuine issue that has plagued these tools for years, with real world impact beyond contrived examples. Yet users are expected to ignore it because this is how they work? Nonsense. It's insulting that you would trivialize something like this.
> (a) the rate of improvement has been fast
I wouldn't describe it as "fast". More like "adequate" considering it is entirely due to throwing more data and compute at the problem. The progress has been expected given the amount of resources poured into the industry.
Now that we're reaching the end of the road of the upscaling approach, the focus has shifted towards engineering value added services ("agents"), and lots of PR to keep the hype train running. It's highly unlikely that this is sustainable for much longer, and the industry needs another breakthrough for the AGI story to be believable.
> (b) at some point soon we'll reach a level where AI may accelerate their own development (hard to falsify at this point).
Why isn't this happening today? Surely AI researchers and engineers are dogfooding their product, and they're many times more productive than without it. Why are then improvements still incremental? Why are we still talking about the same issues after all these years? Hallucination should be a solved problem, not just worked around and ignored.
> I think it's also important to realize that for AGI to arrive, only 1 model out of many attempts needs to qualify.
All models have the same issues. Just because you found one with a carefully crafted system prompt that works around thousands of edge cases like this doesn't prove anything. Or are you implying that o3 doesn't use BPE?
> So we'll need to move the goalposts a little further.
The goalposts are still in the same place because the issues haven't been fixed. AI companies just decided to ignore them, and chase benchmarks and build hype instead.
You should have asked it to write a Python program that does the task. LLMs are really bad at this kind of task, but you can just ask them to write a program.
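For instance, the kind of program it might hand back (a plain sketch, not actual model output):

    # List the US states whose names contain the letter W.
    STATES = [
        "Alabama", "Alaska", "Arizona", "Arkansas", "California", "Colorado",
        "Connecticut", "Delaware", "Florida", "Georgia", "Hawaii", "Idaho",
        "Illinois", "Indiana", "Iowa", "Kansas", "Kentucky", "Louisiana",
        "Maine", "Maryland", "Massachusetts", "Michigan", "Minnesota",
        "Mississippi", "Missouri", "Montana", "Nebraska", "Nevada",
        "New Hampshire", "New Jersey", "New Mexico", "New York",
        "North Carolina", "North Dakota", "Ohio", "Oklahoma", "Oregon",
        "Pennsylvania", "Rhode Island", "South Carolina", "South Dakota",
        "Tennessee", "Texas", "Utah", "Vermont", "Virginia", "Washington",
        "West Virginia", "Wisconsin", "Wyoming",
    ]

    for state in STATES:
        if "w" in state.lower():
            print(state)

Its output matches the o3 list above: Delaware, Hawaii, Iowa, the four "New" states, Washington, West Virginia, Wisconsin, and Wyoming.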
chef's kiss
Cue the apologists claiming that you're holding it wrong as this is a really hard problem for LLMs because of how they work. Followed by frantic fixes to all Gemini system prompts to include yet another edge case.
And it's still debatable whether these systems are intelligent... FFS.
Related Dwarkesh discussion from a couple of months ago:
https://news.ycombinator.com/item?id=43719280
(AGI Is Still 30 Years Away – Ege Erdil and Tamay Besiroglu | 174 points | 378 comments)
AGI is a scam. I'm pretty sure every big name in AI knows it's nowhere near and that LLMs won't get us there. It's just marketing, helping Sam and the like get those billions and keep the hype alive.
How can it be, if we have yet to fully define what human intellect is or how it works? Not to mention consciousness. Machine intelligence will always be different from human intelligence.
Alan Turing had a great test (not a definition) of AGI, which we seem to have forgotten. No, I don't think an LLM can pass a Turing Test (at least, I could break it).
The actual test discussed by Turing wasn't very demanding. There has been some goalpost shifting since then.
I think AGI already exists in multiple datacenters. It's the only justification for these huge moves being made for capacity that couldn't possibly be needed otherwise.
The fact that no one is publishing numbers on how big their models are anymore is an indication that they hit a wall on training those models.
AGI is not going to happen. Fake it till you make it only goes so far.
The funny thing is that some people actually think they want that.
Not even the people predicting it think so. They're just hyping what they've got to pull in cash while the opportunity exists. Soon, AI will be viewed the same way crypto was in mid-2022. Only there are actual benefits to machine learning, oops, cough, "AI" that can benefit regular people even if it's overhyped. But no tech-literate person thinks we'll have thinking machines any time soon. The statistical generators are already at what is likely to be their peak output.
I think current LLMs are smart enough to trigger the intelligence explosion. And that we are in the early stages of that.
An explosion, by definition, happens in a very short period of time. How long is this explosion we're supposedly already in going to last? It reads like a Tom Clancy book.
Timeframes are hard to predict. I just notice these signs: that it can suggest and reason about strategies to improve itself, and break those strategies down into smaller goals; and that it can integrate with tools to work on those goals (to some extent).
Nobody has agreed on any definition of AGI; there are plenty of "makes sense" definitions, though.
Honestly, I think LLMs are a distraction from AGI. It seems to me that the path to AGI will likely be some sort of reinforcement learning approach.
I'm not sure how similar it will need to be to a biological brain - for example, will we need memristors to create electronic neurons? Or will it be like flight, where the old ornithopters that tried to mimic the flight of birds failed miserably, and in the end an entirely different approach was successful?
LLMs seem quite similar to the part of the human brain where you speak quickly without thinking. They don't do the thinking and learning bit that brains do well though. Something needs to be changed or added on I guess.
AGI should be able to answer this question
What AI Can Never Be | John Vervaeke
https://youtu.be/HAJclcj25uM
Can artificial intelligence truly become wise? In this landmark lecture, John Vervaeke explores the future of AI through a lens few dare to examine: the limits of intelligence itself. He unpacks the critical differences between intelligence, rationality, reasonableness, and wisdom—terms often used interchangeably in discussions around AGI. Drawing from decades of research in cognitive science and philosophy, John argues that while large language models like ChatGPT demonstrate forms of generalized intelligence, they fundamentally lack core elements of human cognition: embodiment, caring, and participatory knowing.
By distinguishing between propositional, procedural, perspectival, and participatory knowing, he reveals why the current paradigm of AI is not equipped to generate consciousness, agency, or true understanding. This lecture also serves as a moral call to action: if we want wise machines, we must first become wiser ourselves.
00:00 Introduction: AI, AGI, and the Nature of Intelligence
02:00 What is General Intelligence?
04:30 LLMs and the Illusion of Generalization
07:00 The Meta-Problems of Intelligence: Anticipation & Relevance Realization
09:00 Relevance Realization: The Hidden Engine of Intelligence
11:30 How We Filter Reality Through Relevance
14:00 The Limits of LLMs: Predicting Text vs. Anticipating Reality
17:00 Four Kinds of Knowing: Propositional, Procedural, Perspectival, Participatory
23:00 Embodiment, Consciousness, and Narrative Identity
27:00 The Role of Attention, Care, and Autopoiesis
31:00 Culture as Niche Construction
34:00 Why AI Can’t Participate in Meaning
37:00 The Missing Dimensions in LLMs
40:00 Rationality vs. Reasonableness
43:00 Self-Deception, Bias, and the Need for Self-Correction
46:00 Caring About How You Care: The Core of Rationality
48:00 Wisdom: Aligning Multiple Selves and Temporal Scales
53:00 The Social Obligation to Cultivate Wisdom
55:00 Alter: Cultivating Wisdom in an AI Future
Please don't just link to another video or article without a comment or at least a short description. Why should I watch this hour-long video?
I've said this before, but I'll say it again: AGI is right around the corner, because we don't have a technical definition of what it is... the next bigwig CEO trying to raise money or make an earnings call could take their system and call it AGI, and then we will have arrived. AGI and AI are just marketing terms; we should not be so surprised when they start selling it to us.
Here I was worried.
LLMs shit the bed in the real world; I have never seen them work as 'AGI'. Sorry, it's just transformers plus the extra sauce of the APIs: so much pollution for a thing that fails between 50% and 90% of the time.
Honestly, an o3 pro with an actual 1M context window (every model right now drops out at around 128k) that's as fast and cheap as 4o would already be good enough for me.
o3 pro doesn't have a 1M context window, unfortunately. GPT-4.1 and Gemini 2.5 do, though.
That's why I said "if". And that's a lot more plausible than an AGI.
That's nice to know.
What does that have to do with this post, though?
I don't feel like the AGI people are talking about is necessary; something like that would at minimum require as much compute as the neurons and synapses of a teenager (minus the requirements of maintaining a body).
I think with what we have right now, plus some tweaks (very difficult to achieve, but possible in the foreseeable future), we can already see 95% of what an "AGI" could do come true: put most of the population out of jobs, work together and improve itself (to a limited degree), and cause general chaos.
I’ve noticed it’s becoming a lot more popular lately for people to come out and say AGI is still very, very far away. Is the hype cycle ending somewhat? Have we passed peak LLM?
Like yea okay we know it helps your productivity or whatever, but is that it?
Patel isn't saying that AGI is still very, very far away. He says his best estimate is 2032 and "ASI" in 2028 is "a totally plausible outcome".
(He thinks it might be quite a long way away: "the 2030s or even the 2040s", and it seems to me that the "2040s" scenarios are ones in which substantially longer than that is also plausible.)
Maybe - anecdotally, HN at least is not very receptive to the idea that transformers are not (or with more data will never be) sentient somehow, and almost every post I see about this is followed up by the idiotic "how do we know human intelligence isn't the same thing", as if there's some body of commentators whose personal experience with consciousness somehow leads them to believe it might be achievable with matrix math.
Anyway, I don't think we're over the peak yet; the tech-adjacent pseudo-intellectuals who feed these bubbles (VCs, etc.) still very much think that math which generates a plausible transcript is alive.
> experience with consciousness somehow leads them to believe it might be achievable with matrix math
That's trivially true if you subscribe to materialism; QM is "just matrix math".
Welly, welly, welly, welly, welly, welly, well! No fucking shit!
AGI is never coming. It's too hard, too expensive, and there's absolutely no valid usecase. Fulfilling the god-complexes and/or fetishes of tech moguls is not enough to make the effort worth it.
No valid use-case? If AGI at a human level were to exist, and cost less than hiring an equivalent human, it could replace most/all knowledge workers.
AGI would need the whole management infrastructure and bureaucracy that we use to manage humans. Specially-intelligent rather than generally-intelligent AI would be run-and-done. AGI is the worse option.
How can you claim there's no valid use case for AGI?
We already have enormous adoption for near-AGI.
That "near-AGI" isn't all that near to AGI, and yet it still does what's needed. A fully autonomous intelligence would lessen it's usefulness and lower the efficiency. Nobody wants their electronic slave to get its own ideas or slack off burning cycles on some other random shit like a human would.
Am I missing something? Why is his opinion relevant? I'm not going to read all that unless there's some signal of some kind. Podcast bros and their hype cycles are tiresome.
Whose opinion would you want to hear? The CEO of an AI company?
A scientist with published works. Or at least someone who wrote a well-sourced book and asked many people.