Comment by roadside_picnic

4 days ago

PGMs are a great topic to explore (and Koller+Friedman is a great book), but, as a word of caution to anyone interested: implementing any of these more advanced models remains a major challenge. For anyone building production-facing models, even if your problem is a pretty good match for the more interesting PGMs, the engineering requirements alone are a good reason not to go too far down that path.

The PGM book is also structured very clearly for researchers in PGMs. It is laid out in three major sections: the models, inference techniques (the bulk of the book), and learning. This means that, if you follow the logic of the book, you basically have to work through 1000+ pages of content before you can actually start running even toy versions of these models. That said, if you do need to get into the nitty-gritty of particular inference algorithms, I don't believe there is another textbook with anywhere near that level of scope and detail.

Bishop's chapter on PGMs in Pattern Recognition and Machine Learning is probably a better place to start learning about these more advanced models, and if you become very interested then Koller+Friedman will be an invaluable text.

It's worth noting that the PGM course taught by Koller was one of the original Coursera courses, and it's still excellent. I'm not sure if it's still free, but it was a nice way to get a deep dive into the topic in a reasonably short time frame (I do remember those homeworks as brutal, though!) [0].

0. https://www.coursera.org/specializations/probabilistic-graph...

Played with Bayesian nets a bit in grad school (Pearl’s causality stuff is still mind-blowing), but I’ve almost never bumped into a PGM in production. A couple of things kept biting us:

  • Inference pain. Exact inference is NP-hard, and the usual workarounds (loopy BP, variational methods, MCMC) need a ton of hand-tuning before they run fast enough; see the toy sketch after this list.

  • The data never fits the graph. Real-world tables are messy and full of hidden confounders, so you either spend weeks arguing over structure or give up the nice causal story.

  • DL stole the mind-share. A transformer is a one-liner with a mature tooling stack; that’s hard to argue with when deadlines loom.
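
To make the inference point concrete, here's a minimal sketch of exact inference on the classic toy sprinkler network using the pgmpy library (the class names come from pgmpy's documented API, though they have shifted across versions, and the CPD numbers are invented for illustration):

    # Toy sprinkler network: Rain -> WetGrass <- Sprinkler
    from pgmpy.models import BayesianNetwork
    from pgmpy.factors.discrete import TabularCPD
    from pgmpy.inference import VariableElimination

    model = BayesianNetwork([("Rain", "WetGrass"), ("Sprinkler", "WetGrass")])
    model.add_cpds(
        TabularCPD("Rain", 2, [[0.8], [0.2]]),       # P(Rain)
        TabularCPD("Sprinkler", 2, [[0.6], [0.4]]),  # P(Sprinkler)
        # P(WetGrass | Rain, Sprinkler); columns enumerate evidence combos
        TabularCPD("WetGrass", 2,
                   [[1.0, 0.2, 0.1, 0.01],
                    [0.0, 0.8, 0.9, 0.99]],
                   evidence=["Rain", "Sprinkler"], evidence_card=[2, 2]),
    )

    # Exact inference by variable elimination: fine at this scale, but the
    # cost blows up with treewidth, which is exactly where loopy BP /
    # variational / MCMC (and all the hand-tuning) come in.
    infer = VariableElimination(model)
    print(infer.query(["Rain"], evidence={"WetGrass": 1}))

Even at this scale you can feel the friction: getting the CPD column ordering right is fiddly, and that's before structure learning or any real data enters the picture.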

That said, they’re not completely dead: reportedly Microsoft’s TrueSkill (Xbox ranking) is one, along with a bunch of Google ops/diagnosis pipelines and some healthcare diagnosis tools from IBM Watson built on Infer.NET.
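
For the curious, TrueSkill is easy to poke at via the third-party trueskill package on PyPI (a reimplementation of the published model, not Microsoft’s code; Rating and rate_1vs1 are that package’s API):

    # TrueSkill is a factor-graph PGM: each player's skill is a Gaussian,
    # and match outcomes are observations whose evidence message-passing
    # pushes back into the skill posteriors.
    from trueskill import Rating, rate_1vs1

    alice, bob = Rating(), Rating()      # default prior: mu=25, sigma=25/3
    alice, bob = rate_1vs1(alice, bob)   # observe: alice beat bob
    print(alice)                         # mu up, sigma down
    print(bob)                           # mu down, sigma down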

Anyone here actually shipped a PGM that beat a neural baseline? Would really love to hear your war stories.

  • Me neither. I've heard stories of it happening, but I've never personally seen one live. It's really a tooling issue. I think the causal story is super important and will only become more so in the future, but it would be basically impossible to implement and maintain long-term with today's software.

    Kind of like flow-based programming. I don't think there's any fundamental reason it can't work; it just hasn't yet.