Comment by godelski

2 days ago

   > you could probably build better ones today on top of the current tech stack.

In a way, this is being done. If you look around a little you'll see a bunch of jobs that pay like $50+/hr for anyone with a hard science degree to answer questions. This is one of the ways they're collecting data and trying to create new data.

If we're saying expert systems are exclusively decision trees, then yeah, I think that would be a difficult argument to make[0]. But if you're using the general concept of a system with a strong knowledge base but only superficial understanding, well, current LLMs have very similar problems to expert systems[1].

I'm afraid people will read this as "LLMs suck" or "LLMs are useless", but I don't think that at all. Expert systems are pretty useful, as you mention. You get better use out of your tools when you understand what they can and can't do, and what they're better and worse at even when they can do something. LLMs are great, but oversold.

  > Go feed an LLM one of the puzzles from here

These are also good. But mind you, both have been online for a while. All these problems should be assumed to be within the training data.

  https://www.oebp.org/welcome.php

[0] We'd need more interpretability of these systems, and then you'd have to resolve the question of whether superposition is allowed in decision trees. But I don't think LLMs are just fancy decision trees.

[1] https://en.wikipedia.org/wiki/Expert_system#Disadvantages

Generally, this class of constraint satisfaction problems falls under the "zebra puzzle" (or Einstein puzzle) umbrella [1]. They are interesting because they posit a world with some axioms and inference procedures, and ask whether a certain statement follows from them. LLMs as-is (without provers or tool usage) would have a difficult time with these constraint-satisfaction puzzles; a toy encoding is sketched below. 3-SAT is a corner case of these puzzles, and if LLMs could solve them in polynomial time, then we'd have found a constructive proof of P=NP lol!

[1] https://en.wikipedia.org/wiki/Zebra_Puzzle
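
For concreteness, here's a minimal sketch of how a zebra-style puzzle reduces to constraint satisfaction (Python; the three-house puzzle and its clues are invented for illustration, not taken from the linked site). Brute force over permutations is enough at this toy scale; real zebra puzzles are where a proper CSP/SAT solver earns its keep.

  # Toy zebra-style puzzle: 3 houses, each with a color, nationality, and drink.
  # Clues (invented for this example):
  #   1. The Swede lives in the first house.
  #   2. The green house is immediately to the left of the blue house.
  #   3. The Brit lives in the red house.
  #   4. The Brit drinks tea.
  #   5. The middle house's occupant drinks milk.
  from itertools import permutations

  def solve():
      solutions = []
      for colors in permutations(["red", "green", "blue"]):
          if colors.index("green") + 1 != colors.index("blue"):    # clue 2
              continue
          for nats in permutations(["Brit", "Swede", "Dane"]):
              if nats[0] != "Swede":                                # clue 1
                  continue
              if colors[nats.index("Brit")] != "red":               # clue 3
                  continue
              for drinks in permutations(["tea", "coffee", "milk"]):
                  if drinks[1] != "milk":                           # clue 5
                      continue
                  if drinks[nats.index("Brit")] != "tea":           # clue 4
                      continue
                  solutions.append(list(zip(colors, nats, drinks)))
      return solutions

  if __name__ == "__main__":
      for sol in solve():
          print(sol)  # exactly one consistent assignment survives

Even this toy version is search rather than recall, which is the part an LLM without a prover or other tool gets no help with from its training data.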

> In a way, this is being done. If you look around a little you'll see a bunch of jobs that pay like $50+/hr for anyone with a hard science degree to answer questions. This is one of the ways they're collecting data and trying to create new data.

This is what expert systems did, and why they fell apart. The cost of doing this, ongoing and forever, never justified the results. It likely still wouldn't, even at minimum wage, and maybe more so because LLMs require so much more data.

> All these problems should be assumed to be within the training data.

And yet most models are going to fall flat on their face with these. "In the data" isn't enough for a model to make the leaps to a solution.

The reality is that "language" is just a representation of knowledge. The idea that we're going to gather enough examples and jump to intelligence is a mighty large assumption. I don't see an underlying commutative property at work in any of the LLMs we have today. The sooner we come to an understanding that there is no (a)I coming, the sooner we can get down to building out LLMs to their full (if limited) potential.

  • That expert systems were not financially viable in the '80s and '90s does not mean that LLMs-as-modern-expert-systems cannot be now. The addressable market for such solutions has expanded enormously since then, thanks to PCs, servers, and mobile phones becoming ubiquitous. The opportunity is likely 1000x (or more) what it was back in those days - which shifts the equation a lot as to what is viable or not.

    BTW, have you tried out your challenges with an LLM? I expect them to be tricky to answer directly. But the systems are getting quite good at code synthesis, which seems suitable here, and I even see some MCP implementations for using constraint solvers as tools.
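
    To make the code-synthesis route concrete, here's a rough sketch of the kind of program a model would need to emit instead of working through the clues token by token. It assumes the z3-solver Python package, and the three-house clues are an invented fragment, not one of the puzzles linked above.

      # Each variable is the house position (1..3) of a color or nationality.
      from z3 import And, Distinct, Int, Solver, sat

      red, green, blue = Int("red"), Int("green"), Int("blue")
      brit, swede, dane = Int("brit"), Int("swede"), Int("dane")

      s = Solver()
      for group in [(red, green, blue), (brit, swede, dane)]:
          s.add(Distinct(*group))            # one house per color / nationality
          for v in group:
              s.add(And(v >= 1, v <= 3))     # valid positions only
      s.add(green + 1 == blue)               # green is immediately left of blue
      s.add(brit == red)                     # the Brit lives in the red house
      s.add(swede == 1)                      # the Swede lives in the first house

      if s.check() == sat:
          m = s.model()
          print({str(v): m[v] for v in (red, green, blue, brit, swede, dane)})

    Whether a model reliably produces something like this for the harder puzzles is an empirical question, but it seems a more plausible path than solving the constraints in-context.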

    • I agree with both of you and disagree (as my earlier comment implies)

      Expert systems can be quite useful, especially when there's an extended knowledge base. But the major issue with expert systems is that you generally need to be an expert to evaluate them.

      That's the major issue with LLMs today. They're trained on human preference. Unfortunately, we humans prefer incorrect things that sound/look correct over correct things that sound/look incorrect. That means they're optimizing so that errors are hard to detect. They can provide lots of help to very junior people, who are far from expert, but the returns diminish and they can increase the workload if you're concerned with details.

      They can provide a lot of help, but the people most vocal about their utility usually aren't aware of these issues, or don't admit them when talking about how to use them effectively. But then again, that can just be because you can be tricked. Like Feynman said, the first rule is to not fool yourself, and you're the easiest person to fool.

      Personally, I'm wary of tools that mask errors. IMO a good tool makes errors loud and noticeable, to complement the tool user. I'll admit LLM coding feels faster, because it reduces my cognitive load while the code is being written, but if I actually time myself I find it usually takes longer, and I spend more time debugging and am less aware of how the system acts as a whole. So I'll use it for advice but have yet to be able to hand over trust. Even though I can't trust a junior engineer, I can trust that they'll learn and listen.

    • > BTW, have you tried out your challenges with an LLM?

      I have; they whiff pretty hard.

      > getting quite good at code synthesis

      There was a post yesterday about vibe coding BASIC. It pretty much reflects my experience with code for SBCs.

      I run Home Assistant; it has tons of sample code out there, so there's lots for the LLM to have in its data set. Here they thrive.

      It's a great tool with some known hard limits. It's great at spitting back language, but it clearly lacks knowledge, and it lacks the reasoning to understand the transitive properties of things, leaving it lacking at the edges.