Comment by alfalfasprout

2 months ago

TBH, as cool as TLA+ is, the biggest issue I generally see with trying to use formal methods in practice is that you need to keep the specification matching the actual implementation. Otherwise whatever you've formally verified doesn't match what actually got built.

So formal methods may be used for extremely critical parts of systems (e.g., safety-critical systems, or embedded systems where later fixes cannot be readily rolled out), but they fail to make inroads in most other development because it's a lot of extra work.

Hillel Wayne wrote a post[0] about this issue recently, but on a practical level I think I want to address it by writing a "how-to" on trace validation and model-based testing. There are a lot of projects out there that have tried this, where you either get your formal model to generate events that push your system around the state space, or you collect traces from your system and validate that they're a correct behavior of your specification. Unfortunately, there isn't a good guide out there on how to do this; everybody kind of rolls their own, presents the conference talk, rinse, repeat.

But yeah, that's basically the answer to the conformance problem for these sorts of lightweight formal methods: trace validation or model-based testing.
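To make the trace-validation idea concrete, here's a minimal sketch in Python. A hand-written state machine for a toy mutex stands in for the formal spec (in practice you'd check traces against the actual TLA+ model, e.g. with TLC); all names and events here are illustrative, not from any real project.

```python
# Toy "spec": allowed transitions of a single mutex.
# State is the current holder of the lock (None = free).
def next_states(state, event):
    """Return the set of spec-allowed successor states for an event."""
    kind, thread = event
    if kind == "acquire" and state is None:
        return {thread}      # a free lock may be taken by any thread
    if kind == "release" and state == thread:
        return {None}        # only the current holder may release
    return set()             # anything else violates the spec

def validate(trace, init=None):
    """Check that a recorded event trace is a behavior of the spec."""
    state = init
    for i, event in enumerate(trace):
        successors = next_states(state, event)
        if not successors:
            return False, f"step {i}: {event} not allowed in state {state!r}"
        state = successors.pop()
    return True, "trace conforms to spec"

# A trace collected from the implementation's log that conforms...
ok, msg = validate([("acquire", "t1"), ("release", "t1"), ("acquire", "t2")])
# ...and one that doesn't: t2 releases a lock held by t1.
bad, msg2 = validate([("acquire", "t1"), ("release", "t2")])
```

Model-based testing is the same machinery run in the other direction: instead of checking recorded traces after the fact, you walk the spec's state space and fire the generated events at the real system.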

[0] https://buttondown.com/hillelwayne/archive/requirements-chan...

  • To be totally fair, my article is about the problem of writing specs when your product features could change week to week, whereas I think u/alfalfasprout is talking about regular updates to an existing system slowly bringing it out of sync with the spec. For the latter problem, yeah, trace validation and model-based testing are the best approaches we have so far.

I've always seen it as a tool for validating a design rather than an implementation.

that is an issue, but then:

it is probably better to try matching an implementation to a spec you know works, and to debug that, than to try matching an implementation to an "in your head" spec which might not be correct in the first place. debugging the spec and then the implementation seems to make more sense than debugging both at the same time.

a more vicious issue could be: does your model... model... enough of reality to be relevant? because you can end up with a correct spec and a correct implementation, but in the end the cow wouldn't be round enough. but i don't know... i have too little experience in the domain.

On the other hand, how many millions of man-hours are spent reinventing the wheel that could instead be spent contributing to a library of extremely well-specified wheels?