
Comment by vatsachak

18 hours ago

I've always said this: AI will win a Fields Medal before being able to manage a McDonald's.

Math seems difficult to us because it's like using a hammer (the brain) to drive in a screw (math).

LLMs are discovering a lot of new math because they are great in low-depth, high-breadth situations.

I predict that in the future people will ditch LLMs in favor of AlphaGo-style RL done on Lean syntax trees. These should be able to think on much longer timescales.

Any professional mathematician will tell you that their arsenal is ~10 tricks. If we can codify those tricks as latent vectors, it's GG

Tricks are nothing but patterns in the logical formulae we reduce.

Ergo, these are latent vectors in our brain. We use analogies, like geometry, in order to apply Algebraic Geometry to problems in Number Theory.
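
To make "tricks as patterns" concrete, here's a minimal Lean sketch (assuming Lean 4 with Mathlib; the examples are illustrative, nothing canonical): a single trick, "normalize both sides in a commutative ring," closes superficially different goals with the same move.

```lean
import Mathlib.Tactic.Ring

-- One "trick" as a reusable pattern: normalize both sides in a
-- commutative (semi)ring. The same move closes both goals.
example (a b : Nat) : (a + b) ^ 2 = a ^ 2 + 2 * a * b + b ^ 2 := by ring
example (x y : Int) : (x - y) * (x + y) = x ^ 2 - y ^ 2 := by ring
```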

An AI trained on Lean syntax trees might develop its own weird versions of intuition that might actually properly contain ours.

If this sounds far-fetched, look at chess. I wonder if anyone has dug into Stockfish using mechanistic interpretability.

As a professional mathematician, I would say that a good proof requires a very good representation of the problem, and then pulling out the tricks. The latter part is easy to get working with LLMs; they can do it already. It's the former part that still needs humans, and I'm perfectly fine with that.

  • But are you ok with the trendline of AI improvement? The speed of improvement indicates humans will only get further and further removed from the loop.

    I see posts like yours all the time, comforting themselves that humans still matter, and every time, people like you are describing a human owning an ever-shrinking section of the problem space.

    • I used to be worried, but not so much anymore.

      It used to be the case that the labs were prioritising replacing human creativity, e.g. generative art, video, writing. However, they are coming to realise that just isn't a profitable approach. The most profitable goal is actually the most human-oriented one: the AI becomes an extraordinarily powerful tool that may be able to one-shot particular tasks. But the design of the task itself is still very human, and there is no incentive to replace that part. Researchers talk a bit less about AGI now because it's a pointless goal. Alignment is more lucrative.

      Basically, executives want to replace workers, not themselves.

      1 reply →

    • Humans, needing to ask new questions out of curiosity, push the boundaries further, find new directions, ways, or motivations to explore, maybe invent new spaces to explore. LLMs are just tools that people use. When people are no longer needed, AI serves no purpose at all.

      2 replies →

> Any professional mathematician will tell you that their arsenal is ~10 tricks. If we can codify those tricks as latent vectors, it's GG

And if we can train the systems to discover new tricks, whoa Nelly.
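
For a toy sketch of what "codify those tricks as latent vectors" could even mean, here are bag-of-words counts standing in for learned embeddings; the trick names and descriptions are made up for illustration:

```python
from collections import Counter
import math

# Hand-written stand-ins; a real system would learn these from proof corpora.
TRICKS = {
    "induction":     "prove base case then inductive step over naturals",
    "contradiction": "assume the negation and derive something false",
    "pigeonhole":    "more items than boxes forces a repeated box",
    "telescoping":   "rewrite a sum so adjacent terms cancel",
}

def embed(text: str) -> Counter:
    # Toy "latent vector": raw word counts.
    return Counter(text.lower().split())

def cosine(u: Counter, v: Counter) -> float:
    dot = sum(u[w] * v[w] for w in u)
    nu = math.sqrt(sum(c * c for c in u.values()))
    nv = math.sqrt(sum(c * c for c in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def suggest(goal: str) -> str:
    # Retrieve the trick whose vector lies closest to the goal's.
    vectors = {name: embed(desc) for name, desc in TRICKS.items()}
    g = embed(goal)
    return max(vectors, key=lambda name: cosine(g, vectors[name]))

print(suggest("show the sum telescopes so the terms cancel"))  # telescoping
```

A real system would learn these vectors from proof corpora rather than hand-writing descriptions, but the retrieval idea is the same.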

> I've always said this: AI will win a Fields Medal before being able to manage a McDonald's.

I love this and have a corollary saying: the last job to be automated will be QA.

This wave of technology has triggered more discussion about the types of knowledge work that exist than any other, and I think we will be sharper for it.

  • The ownership class will be sharper. They will know how to exploit capital and turn it into more capital with vastly increased efficiency. Everybody else will be hosed.

    • I'm not sure people will be more hosed than before. Historically, what makes people with capital able to turn things into more capital is capital's ability to buy someone's time and labor. Knowledge labor is becoming cheaper, easier, and more accessible. That changes the calculus for what is valuable, but not the mechanisms.

      1 reply →

    • But what if we succeed in gamifying the latent knowledge in LLMs to upload it to our human brains, by some kind of speed/reaction game?

    • There is a fundamental problem with this thinking: you are making an assumption about scale. There is the apocryphal quote, "I think there is a world market for maybe five computers."

      You have to believe that LLM scaling (down) is impossible or will never happen. I assure you that this is not the case.

> I predict that in the future people will ditch LLMs in favor of AlphaGo-style RL done on Lean syntax trees. These should be able to think on much longer timescales.

This is certainly my hope.

In my spare time, I'm slowly, very slowly, inching towards a prototype of something that could work like that.
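
For a flavor of the shape such a prototype could take, here's a toy sketch: breadth-first search over rewrite rules on Peano-style expression trees, standing in for learned-policy MCTS over real Lean syntax trees. The encoding and the two "tactics" are invented for illustration.

```python
from collections import deque

# Expressions as tuples: ("0",), ("s", e), ("+", l, r)  -- Peano naturals.

def rewrite_add_zero(e):
    # x + 0  ->  x
    if e[0] == "+" and e[2] == ("0",):
        return e[1]
    return None

def rewrite_add_succ(e):
    # x + s(y)  ->  s(x + y)
    if e[0] == "+" and e[2][0] == "s":
        return ("s", ("+", e[1], e[2][1]))
    return None

TACTICS = [rewrite_add_zero, rewrite_add_succ]

def step(e):
    # All states reachable by one rewrite, at the root or in a subtree.
    for tactic in TACTICS:
        out = tactic(e)
        if out is not None:
            yield out
    for i in range(1, len(e)):
        for child in step(e[i]):
            yield e[:i] + (child,) + e[i + 1:]

def prove(goal, target, max_nodes=10_000):
    # Plain BFS stands in here for MCTS guided by a learned policy/value net.
    seen, frontier = {goal}, deque([goal])
    while frontier and len(seen) < max_nodes:
        e = frontier.popleft()
        if e == target:
            return True
        for nxt in step(e):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return False

two = ("s", ("s", ("0",)))
four = ("s", ("s", two))
print(prove(("+", two, two), four))  # True: 2 + 2 reduces to 4
```

The real thing would search over Lean tactic applications checked by the kernel, with the network deciding which branches are worth expanding.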

Are they actually producing new math? In the most recent ACM issue there was an article about testing AI against a math benchmark privately built by mathematicians. What they found is that even though AI can solve some problems, it has never truly come up with something novel in mathematics; it is just good at drawing connections between existing research and putting a spin on it.

  • I'm not accusing you in particular, but I feel like there's a lot of circular reasoning around this point. Something like: AI can't discover "new math" -> AI discovers something -> since it was discovered by AI it must not be "new math" -> AI can't discover "new math"

    For example, there was a recent post here about GPT-5.4 (and later some other models) solving a FrontierMath open problem: https://news.ycombinator.com/item?id=47497757

    That would definitely be considered "new math" if a human did it, but since it was AI people aren't so sure.

    • There is a kind of rubric I use on stuff like this. If LLMs are discovering new math, why have I only read one or two articles where it's happening? Wouldn't it be happening with regularity?

      The most obvious example of this thinking: if LLMs are replacing developers, why is OpenAI still hiring?

      3 replies →

  • It's finding constructions and counterexamples. That's different from finding new proof techniques, but still extremely useful, and still leads to novel findings.
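
    As a toy illustration of the counterexample half (plain brute force, nothing like how the models actually search): Euler's polynomial n^2 + n + 41 is prime for n = 0 to 39, and a search finds the first failure.

    ```python
    def is_prime(n: int) -> bool:
        # Trial division; fine for numbers this small.
        if n < 2:
            return False
        d = 2
        while d * d <= n:
            if n % d == 0:
                return False
            d += 1
        return True

    # Conjecture (false): n^2 + n + 41 is prime for every n >= 0.
    for n in range(100):
        if not is_prime(n * n + n + 41):
            print(f"counterexample: n = {n}")  # n = 40: 40^2 + 40 + 41 = 41^2
            break
    ```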

It will still be heavily reliant on expert human input and interaction. Knuth is an expert, and knows how to guide.

I think this is mostly about existing legislation, not about technology.

In any context other than one where your paycheck depends on it, you would probably not be following orders from a random manager. If your paycheck depended on following the instructions of an AI robot, the world might start to look pretty scary real soon.

  • > If your paycheck depended on following the instructions of an AI robot, the world might start to look pretty scary real soon.

    That's already the case, minus AI, for gig workers. Their only agency is to accept or decline a ride/delivery; the rest is following instructions.

  • There's a lot to being a manager:

    - Coherent customer interaction

    - Common sense judgements

    - Scheduling

    - Quality control

    All of which are baked into humans, but not so much into LLMs.

    Even if it were legal to have an LLM as a GM, I think it would fare poorly.

  • AI actually has to follow all the rules, even the bad rules, like when an autonomous car drives super carefully.

    Imagine if McDonald's management enforced the dog-related rules. No more filthy mutts! If a dog harassed customers, the AI would call the cops and sue for a restraining order! If a dog defecated in the middle of the restaurant, everything would get disinfected, not just smeared with towels!

    Nutters would crucify AI management!

> AI will win a Fields Medal before being able to manage a McDonald's

Of course, because it takes multimodal intelligence to manage a McDonald's, i.e., it requires human intelligence.

> I predict that in the future people will ditch LLMs in favor of AlphaGo-style RL

Same for coding as well. LLMs might be the interface we use with other forms of AI, though.

  • Something like building Linux is more akin to managing a McDonald's than it is to a 10-page technical proof in Algebraic Groups.

    Programming is more multimodal than math.

    Something like performance engineering might be a free lunch, though.

    • > Programming is more multimodal than math

      I have no idea how you come to this conclusion, when the evidence on the ground for those training models suggests it is precisely the opposite.

      We are much further along the path of writing code than writing new maths, since the latter often requires some degree of representational fluency of the world we live in to be relevant. For example, proving something about braid groups can require representation by grid diagrams, and we know from ARC-AGI that LLMs don't do great with this.

      Programming does not have this issue to the same extent; arguably, it involves the subset of maths that is exclusively problem-solving with standard representations. The issues with programming are primarily the difficulty of handling large volumes of text reliably.

      2 replies →

    • Yeah, it's hard to compare management and programming, but they're both multimodal in very different ways. There are gonna be entire domains in which AI dominates, much like Stockfish, but Stockfish isn't managing franchises, and there is no reason to expect that anytime soon.

      I feel like something people miss when they talk about intelligence is that humans have incredible breadth. This is really what differentiates us from artificial forms of intelligence as well as other animals. Plus we have agency, the ability to learn, the ability to think critically from first principles, etc.

      4 replies →

    • But LLMs have proven themselves better at programming than most professional programmers.

      Don't argue. If you think Hacker News is a representative sample of the field, then you haven't been in the field long enough.

      What LLMs have actually done is put the dream of software engineering within reach. Creativity is inimical to software engineering; the goal has long been to provide a universal set of reusable components which can then be adapted and integrated into any system. The hard part was always providing libraries of such components, and then integrating them. LLMs have largely solved these problems. Their training data contains vast amounts of solved programming problems, and they are able to adapt these in vector space to whatever the situation calls for.

      We are already there. Software engineering as it was long envisioned is now possible. And if you're not doing it with LLMs, you're going to be left behind. Multimodal human-level thinking need only be undertaken at the highest levels: deciding what to build and maybe choosing the components to build it. LLMs will take care of the rest.

      8 replies →