Comment by vatsachak

4 hours ago

As I have stated before, AI will win a Fields Medal before it can manage a McDonald's.

A difficult part was constructing a chess board on which to play math (Lean). Now it's just pattern recognition and computation.

LLMs are just the beginning; we'll see more specialized math AI resembling Stockfish soon.

48 comments

trostaft 4 hours ago

> A difficult part was constructing a chess board on which to play math (Lean). Now it's just pattern recognition and computation.

However, this was not verified in Lean. It was purely plain language in and out. I think, in many ways, this is quite an exciting demonstration of exactly the opposite of the point you're making. Verification comes in when you want to offload checking proofs to computers as well. As it stands, this proof was hand-verified by a group of mathematicians in the field.

vatsachak 3 hours ago
Yeah, but I wouldn't be surprised if they train the model on verification assisted by Lean.
- trostaft 3 hours ago
  
  Arguing by analogy with how Stockfish, the chess engine, is trained, I would not be surprised if this becomes more common in the future. I don't know whether they use any proof verification tools during their reinforcement learning procedure right now; as far as I know, they've been focusing more on CoT-based strategies (without Lean). But I'm hardly an LLM expert, so I don't know.
ComplexSystems 2 hours ago
That may be true for now, but it seems clear enough that letting the model use Lean in its internal reasoning process would be a great idea.
- trostaft 2 hours ago
  
  That I'd agree with! I really need to get around to learning Lean myself. It might be interesting to try and formalize some missing theoretical pieces from my field (or likely start smaller).
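For readers who have not seen Lean, here is a minimal Lean 4 sketch of what a machine-checked statement looks like. The theorem name is made up for illustration; `Nat.add_comm` is a lemma from Lean's core library. As noted above, the proof discussed in this thread was not written this way, so this only illustrates what "verified in Lean" would mean.

```lean
-- Commutativity of addition on the natural numbers.
-- If this file compiles, Lean's kernel has checked the proof;
-- no human reviewer is needed for this step.
theorem add_comm_example (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b
```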

Terr_ 4 hours ago

> manage a McDonald's

Dystopia vibes from the fictional "Manna" management system [0] used at a hamburger franchise, which involved a lot of "reverse centaur" automation.

> At any given moment Manna had a list of things that it needed to do. There were orders coming in from the cash registers, so Manna directed employees to prepare those meals. There were also toilets to be scrubbed on a regular basis, floors to mop, tables to wipe, sidewalks to sweep, buns to defrost, inventory to rotate, windows to wash and so on. Manna kept track of the hundreds of tasks that needed to get done, and assigned each task to an employee one at a time. [...]

> At the end of the shift Manna always said the same thing. “You are done for today. Thank you for your help.” Then you took off your headset and put it back on the rack to recharge. The first few minutes off the headset were always disorienting — there had been this voice in your head telling you exactly what to do in minute detail for six or eight hours. You had to turn your brain back on to get out of the restaurant.

[0] https://en.wikipedia.org/wiki/Manna_(novel)

kmeisthax 3 hours ago
Casual reminder that the author's proposed solution to the labor-automation dystopia is to invent a second, identity-verification dystopia. Also casual reminder that the author wanted the death penalty for anyone over the age of 65.
- embedding-shape 1 hour ago
  
  I was curious about this book but now you've absolutely sold me on it, sounds like I'm in for a ride!

Lerc 4 hours ago

I disagree. It will be able to perform work deserving of a Fields Medal before it is capable of running a McDonald's. I think it will be running a McDonald's well before either of those things happens, and will win a Fields Medal long after both have happened.

c7b 3 hours ago

One could hardly ask for a task better suited for LLMs than producing math in Lean. Running a restaurant is so much fuzzier, from the definition of what it even means to the relation of inputs to outputs and evaluating success.
vatsachak 3 hours ago
Not necessarily. Obviously, playing Kasparov on the board requires more planning ability than managing a McDonald's, but look at where chess bots are now.
There's much more to being human than our "cognitive abilities".
- baq 3 hours ago
  
  Conjecture: the first AI to successfully manage a McDonald’s will be a Gemini.
edbaskerville 4 hours ago
I just visited a McDonald's for the first time in a while. The self-order kiosk UI is quite bad. I think this is evidence in favor of the idea that an incompetent AI will soon be incompetently running a McDonald's.
- Silamoth 3 hours ago
  
  Out of curiosity, what issue did you have with the McDonald’s self-order kiosk? I actually think McDonald’s has the best kiosk I’ve ever encountered. The little animation that plays when you add an item to your cart is a little annoying (but I think they’ve sped that up). But otherwise, it’s everything I’d want. It shows you all the items, tells you every ingredient, and lets you add or remove ingredients. I have a better experience ordering through the kiosk than I do talking to a cashier.
  

evenhash 4 hours ago

The proof is not written in Lean, though. It’s written in English and requires validation by human experts to confirm that it’s not gibberish.

vatsachak 3 hours ago

Yeah, but I wouldn't be surprised if they train the model on verification assisted by Lean

auggierose 2 hours ago

> A difficult part was constructing a chess board on which to play math

We have had that chess board for quite a while now, over 40 years. And no, there is nothing special about Lean here; it is just herd mentality. Also, we don't know how much training with Lean helped this particular model.

sigmoid10 4 hours ago

Managing a McDonald's is a question of integration and modalities at this point. I don't think anyone still doubts that these models have the reasoning capability or world knowledge needed for the job. So it's less of a fundamental technical problem and more of a process engineering issue.

andy12_ 3 hours ago

I disagree. Even frontier models still achieve far worse results than the human baseline in VendingBench. As long as models can't optimally manage something as simple as a vending machine, they have no hope of managing a McDonald's.
throw-the-towel 4 hours ago
The capability they lack is being able to be sued.
- pear01 4 hours ago
  
  Police officers are human. In the United States in the vast majority of cases you can't sue the police, only the community responsible for them.
  https://en.wikipedia.org/wiki/Qualified_immunity
  Assuming you can still sue McDonald's, I am not sure this is a problem in the robotic LLM case. I'm also trying to imagine a case where you would want to sue the LLM and not the company. Given that robots and LLMs don't have free will, I'm not sure the problem of qualified immunity making police unaccountable applies.
  There already exist a lot of similar conventions in corporate law. Generally, a main advantage of incorporation is protecting the people making the decisions from personal lawsuits.
  

KalMann 2 hours ago

I think your analogy is good, but I don't believe modern LLMs use Lean or any Lean-like structure in their proofs. At least recent open-source ones like DeepSeek can do advanced math without it (maybe the most cutting-edge ones are doing it; I can't say).

volkercraig 3 hours ago

> we'll see more specialized math AI resembling Stockfish soon

Heuristically weighted directed graphs? Wow amazing I'm sure nobody has done that before.

vatsachak 3 hours ago
My claim is that LLMs waste a lot of time training on all available data.
Math is a sequence of formal rules applied to construct a proof tree. Therefore, an AI trained on these rules could be far more efficient and search far deeper into proof space.
- red75prime 1 hour ago
  
  It has been tried. Lenat's Automated Mathematician, for example. The problem is that the system succumbs to combinatorial explosion, not knowing which directions are interesting/promising/productive. LLMs seem to pick up some kind of intuition from the data they are fed. The generated data might not have the needed "human touch" or whatever it is.
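A toy sketch of the combinatorial explosion described above. It assumes Hofstadter's MIU string-rewriting system as a stand-in for "formal rules" (this is not Lenat's Automated Mathematician or any real prover, and the function names are made up for illustration). A blind breadth-first search applies every rule at every position, with no notion of which derivations are interesting:

```python
def successors(s):
    """Apply every MIU rule at every position; return the set of results."""
    out = set()
    if s.endswith("I"):              # rule 1: xI -> xIU
        out.add(s + "U")
    if s.startswith("M"):            # rule 2: Mx -> Mxx
        out.add(s + s[1:])
    for i in range(len(s) - 2):      # rule 3: replace any III with U
        if s[i:i + 3] == "III":
            out.add(s[:i] + "U" + s[i + 3:])
    for i in range(len(s) - 1):      # rule 4: delete any UU
        if s[i:i + 2] == "UU":
            out.add(s[:i] + s[i + 2:])
    return out


def frontier_sizes(axiom, max_depth):
    """Blind breadth-first search: count newly derived strings per depth."""
    seen = {axiom}
    frontier = {axiom}
    sizes = [1]
    for _ in range(max_depth):
        nxt = set()
        for s in frontier:
            nxt |= successors(s) - seen   # keep only strings not derived before
        seen |= nxt
        frontier = nxt
        sizes.append(len(frontier))
    return sizes


if __name__ == "__main__":
    # Number of new "theorems" at each depth, starting from the axiom "MI".
    print(frontier_sizes("MI", 8))
```

The first few depths give 1, 2, 3, 5 new strings, and the frontier keeps widening from there. Nothing in the search says which branch is worth following, which is the gap that learned intuition (or a heuristic evaluation, as in chess engines) is supposed to fill.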

forinti 4 hours ago

AI is already too old for that.

whimsicalism 4 hours ago

The only thing keeping the McDonald's from happening will be political; likewise with the Fields Medal.

ori_b 3 hours ago

We're automating art and science so that we can flip burgers. This future sucks.

vatsachak 3 hours ago
Math is a very specialized subset of art and science more amenable to automation.
- ori_b 1 hour ago
  
  The first thing we automated passably was art, even before programming. Were you not paying attention?
  This future still sucks. The tech industry is making the world a worse place.

segmondy 3 hours ago

Our local AI models are already capable of running a McDonald's.

soupspaces 4 hours ago

Lee Sedol, Move 37: https://www.reddit.com/r/singularity/comments/1l0z5yk/the_mo...

Edit: I wasn't necessarily disagreeing. But on second thought, the chessboard in this math analogy is being built, not just played on. This Hardy quote comes to mind: https://www.goodreads.com/quotes/902543-it-proof-by-contradi...

vatsachak 3 hours ago

My claim is that we haven't even witnessed the move 37 of math yet. I am claiming that math AI is going to get even better.

dyauspitr 3 hours ago

Nonsense. Have you been watching the Figure livestream? Or the Unitree video from yesterday with real-time novel action generation? We’re less than a year away. If you can cook a burger, assemble a sandwich, and clean up surfaces, you’re all the way there.

vatsachak 3 hours ago
Fair. Let's see in a year. I'm willing to bet that nothing happens.
- dyauspitr 2 hours ago
  
  Yeah, it’s gonna be an exciting year. I still think you’ll need one human in there, but that’s about it.