Comment by tombert

8 hours ago

Claude has certainly been getting better with TLA+. It's not perfect yet but for laughs I got it to model the rules of Monopoly last night [1]. I haven't done any exhaustive checking on it yet, but it certainly looks passable.

It is pretty impressive at how good it's gotten at this, in a relatively short amount of time no less. I still usually write my specs by hand, but who knows how much longer I'll be doing that.

[1] https://pdfhost.io/v/KU2j37YKrP_Monopoly

5 comments

tombert

randusername 2 hours ago

What's the advantage of provable correctness if it's apparently not easy to prove even for people who understand TLA+? I'm not trying to be a party pooper, just curious.

Isn't logical incorrectness less of a problem in software than failures of imagination or conscientiousness in modeling the domain?

ofrzeta 7 hours ago

It looks quite complicated and I have no idea what it is doing. Obviously, since I don't know about TLA+. But what about someone who knows TLA+? It still seems hard to make sure it is valid. And it's just for a relatively simple game.

comex 4 hours ago

Well, for one thing:
> Decline to buy: property stays with bank (auction abstracted out)
Ignoring an entire game mechanic is really stretching the definition of “abstracted out”…
Also, at the bottom it defines a “Liveness: someone eventually wins” property which I believe cannot be proven. Monopoly doesn’t have any rules forcing the game to end eventually. There is only a probabilistic guarantee, and even that only applies if the players are trying to win; if the players are conspiring to prevent the game from ending then they’re unlikely to fail.
_doctor_love 4 hours ago

There is a nice guide to TLA+ from Hillel Wayne here: https://learntla.com/
PlusCal is recommended as the gentler on-ramp to TLA+ for first learning.

NooneAtAll3 3 hours ago

> I haven't done any exhaustive checking on it yet, but it certainly looks passable.

isn't that exactly the kind of fails LLMs do the most? first-glance-passable nonsense?