I am once again shilling the idea that someone should find a way to glue Prolog and LLMs together for better reasoning agents.

https://news.ycombinator.com/context?id=43948657

Thesis:

1. LLMs are bad at counting the number of r's in strawberry.

2. LLMs are good at writing code that counts letters in a string.

3. LLMs are bad at solving reasoning problems.

4. Prolog is good at solving reasoning problems.

5. ???

6. LLMs are good at writing Prolog that solves reasoning problems.

Common replies:

1. The bitter lesson.

2. There are better solvers, e.g. Z3.

3. Someone smart must have already tried and ruled it out.

Successful experiments:

1. https://quantumprolog.sgml.net/llm-demo/part1.html

> "4. Prolog is good at solving reasoning problems."

Plain Prolog's way of solving reasoning problems is effectively:

    for person in ["martha", "brian", "sarah", "tyrone"]:
        if timmy.parent == person:
            print("solved!")

You hard code some options, write a logical condition with placeholders, and Prolog brute-forces every option in every placeholder. It doesn't do reasoning.
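
Spelled out in actual Prolog, that's just some facts plus a query (a rough sketch using the same made-up names):

    % the hard-coded options, as facts
    person(martha).  person(brian).
    person(sarah).   person(tyrone).
    parent_of(timmy, martha).

    % the "logical condition with placeholders"
    % ?- person(P), parent_of(timmy, P).
    % P = martha.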

Arguably it lets a human express reasoning problems better than other languages, by letting you write high-level code in a declarative way instead of allocating memory, choosing data types, initializing linked lists and so on, so you can focus on the reasoning. But that is no benefit to an LLM, which can output any language as easily as any other. And while that might have been nice compared to Pascal in 1975, it's not so different from modern garbage-collected high-level scripting languages. Arguably Python or JavaScript will benefit an LLM most, because there are so many more training examples of them than of almost any other language.

  • It's a Horn clause resolver... that's exactly the kind of reasoning that LLMs are bad at. I have no idea how to graft Prolog to an LLM, but if you can graft any programming language to it, you can graft Prolog more easily.

    Also, that you push Python and JavaScript makes me think you don't know many languages. Those are terrible languages to try to graft to anything. Just because you only know those 2 languages doesn't make them good choices for something like this. Learn a real language, Physicist.

    • We would begin by having a Prolog server of some kind (I have no idea if Prolog is parallelized, but it could well be, given we're dealing with Horn clauses).

      There would be MCP bindings to said server, accessible upon request. The LLM would provide a message (it could even formulate Prolog statements per a structured prompt), await the result, and then continue.
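
      The Prolog side of that could start as small as an HTTP endpoint that evaluates goals; here's a rough SWI-Prolog sketch (the /solve route and handler names are invented, and evaluating arbitrary goals like this is of course unsandboxed):

          :- use_module(library(http/thread_httpd)).
          :- use_module(library(http/http_dispatch)).
          :- use_module(library(http/http_json)).
          :- use_module(library(time)).

          :- http_handler(root(solve), handle_solve, []).

          server(Port) :- http_server(http_dispatch, [port(Port)]).

          % The client POSTs {"goal": "..."}; we parse it, run it under a
          % time limit, and reply with the first solution or a failure.
          % NOTE: calling arbitrary goals is a sketch, not a safe design.
          handle_solve(Request) :-
              http_read_json_dict(Request, In),
              term_string(Goal, In.goal),
              (   catch(call_with_time_limit(1, Goal), _, fail)
              ->  term_string(Goal, Result),
                  reply_json_dict(_{ok: true, result: Result})
              ;   reply_json_dict(_{ok: false})
              ).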

  • Prolog was introduced to capture natural language - in a logic/symbolic way that certainly didn't prove as powerful as today's LLMs, but this still means there is a large corpus of direct English-to-Prolog mappings available for training, and the mapping rules are much more straightforward by design. You can pretty much translate simple sentences 1:1 into Prolog clauses, as in the classic boring example

        % "the boy eats the apple"
        eats(boy, apple).
    

    This is being taken advantage of in Prolog code generation using LLMs. In the Quantum Prolog example, the LLM is also instructed not to generate search strategies/algorithms, but just the planning domain representation and action clauses for changing those domain state clauses, which is natural enough in vanilla Prolog.
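
    In miniature, that split between domain state and action clauses looks something like this (an invented toy domain, not the demo's actual output):

        % domain state, held as clauses
        :- dynamic at/2.
        at(cargo, hold).

        % an action clause that updates the domain state
        unload(Item, Dock) :-
            retract(at(Item, hold)),
            assertz(at(Item, Dock)).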

    The results are quite a bit more powerful, closer to end-user problems, and further up the food chain than the usual LLM coding tasks for Python and JavaScript, such as boilerplate code generation and similarly idiosyncratic problems.

  • What makes you think your brain isn't also brute forcing potential solutions subconsciously and only surfacing the useful results?

  • Of course it does "reasoning", what do you think reasoning is? From a quick google: "the action of thinking about something in a logical, sensible way". Prolog searches through a space of logical propositions (constraints) and finds conditions that lead to solutions (if any exist).

    (a) Try adding another 100 or 1000 interlocking propositions to your problem: it will find solutions or tell you none exists. (b) You can verify the solutions yourself; you don't get that with imperative descriptions of problems. (c) Good luck sandboxing Python or JavaScript with the threat of prompt injection still unsolved.

    • Of course it doesn't "do reasoning", why do you think "following the instructions you gave it in the stupidest way imaginable" is 'obviously' reasoning? I think one definition of reasoning is being able to come up with any better-than-brute-force thing that you haven't been explicitly told to use on this problem.

      Prolog isn't "thinking". Not about anything, not about your problem, your code, its implementation, or any background knowledge. Prolog cannot reason that your problem is isomorphic to another problem with a known solution. It cannot come up with an expression transform that hasn't been hard-coded into the interpreter which would reduce the amount of work involved in getting to a solution. It cannot look at your code, reason about it, and make a logical leap over some of the code without executing it (in a way that hasn't been hard-coded into it by the programmer/implementer). It cannot reason that your problem would be better solved with SLG resolution (tabling) instead of SLD resolution (depth first search). The point of my example being pseudo-Python was to make it clear that plain Prolog (meaning no constraint solver, no metaprogramming), is not reasoning. It's no more reasoning than that Python loop is reasoning.
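
      To make that last point concrete: left recursion loops forever under SLD resolution but terminates under tabling, and Prolog will never make that switch on its own (SWI-Prolog syntax):

          :- table path/2.   % delete this line and the query below recurses forever

          edge(a, b).  edge(b, c).

          path(X, Y) :- path(X, Z), edge(Z, Y).   % left recursive
          path(X, Y) :- edge(X, Y).

          % ?- path(a, c).   % true under SLG (tabling); loops under SLD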

      If you ask me to find the largest prime number between 1 and 1000, I might think to skip even numbers, or to search down from 1000 instead of up from 1. I might not come up with a good strategy, but I will reason about the problem. Prolog will not. You code what it will do, and it will slavishly do what you coded. If you code counting from 1 to 1000, it will do that. If you code the Sieve of Eratosthenes, it will do that instead.

  • Everything you've written here is an invalid over-reduction, I presume because you aren't terribly well versed in Prolog. Your simplification is not only outright erroneous in a few places, but essentially excludes every single facet of Prolog that makes it a Turing-complete logic language. What you are presenting Prolog as would be like presenting C as a language where all you can do is perform operations on constants, without even being able to define functions or preprocessor macros. To assert that's what C is would be completely and obviously ludicrous, but not so many people are familiar enough with Prolog or its underlying formalisms to call you out on this.

    Firstly, we must set one thing straight: Prolog definitionally does reasoning. Formal reasoning. This isn't debatable, it's a simple fact. It implements resolution (a computationally friendly inference rule over computationally friendly logical clauses) that is sound and refutation-complete, made practical through unification. Your example is not even remotely close to how Prolog actually works, and excludes much of the extra-logical machinery that Prolog implements. Stripping it of any of this effectively changes the language beyond recognition.

    > Plain Prolog's way of solving reasoning problems is effectively:

    No. There is no cognate to what you wrote anywhere in how Prolog works. What you have here doesn't even qualify as a forward chaining system, though that's what it's closest to, since iterating over a ruleset is loosely how such systems operate. For it to even approach a weaker forward chaining system like CLIPS, it would have to be a list of rules permitting arbitrary computation, able to mutate the very list of rules it is operating on. A simple iteration over a list testing for conditions doesn't remotely cut it, and again, that's still not Prolog even if we change the evaluation strategy by enabling tabling.

    > You hard code some options

    A Prolog knowledge base is not hard-coded; clauses can be asserted and retracted at runtime.

    > write a logical condition with placeholders

    A Horn clause is not a "logical condition", and those "placeholders" are just normal logic variables.

    > and Prolog brute-forces every option in every placeholder.

    Absolutely not. It traverses a graph proving things, and when it cannot prove something it backtracks and tries a different route, or otherwise fails. This is of course without getting into impure Prolog, or the extra-logical aspects it implements. It's a fundamentally different foundation of computation which is entirely geared towards formal reasoning.
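
    A two-clause sketch of that traversal:

        likes(alice, prolog).
        likes(alice, logic).
        likes(bob, X) :- likes(alice, X).

        % ?- likes(bob, What).
        % Resolution selects the bob clause, unification reduces the goal
        % to likes(alice, What), and backtracking enumerates What = prolog
        % and What = logic -- a proof search, not a loop over a list.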

    > And while that might have been nice compared to Pascal in 1975, it's not so different from modern garbage-collected high-level scripting languages.

    It is extremely different, and the only reason you believe this is that you don't understand Prolog in the slightest, as indicated by the unsoundness of essentially everything you wrote. Prolog is as different from something like JavaScript as a neural network with memory is.

  • Even in your example (which is obviously not a correct representation of Prolog), that code will run orders of magnitude faster, and with 100% reliability, compared to an LLM's far inferior reasoning capabilities.

As someone who did deep learning research 2017-2023, I agree. "Neurosymbolic AI" seems very obvious, but funding has just been getting tighter and more narrowly restricted to figuring out what can be done with LLMs alone. It's like we collectively forgot that there's more than just txt2txt in the world.

We've done this, and it works. Our setup is to have some agents that synthesize Prolog and other types of symbolic and/or probabilistic models. We then use these models to increase our confidence in LLM reasoning and iterate if there is some mismatch. Making synthesis work reliably on a massive set of queries is tricky, though.

Imagine a medical doctor or a lawyer. At the end of the day, their entire reasoning process can be abstracted into some probabilistic logic program which they synthesize on-the-fly using prior knowledge, access to their domain-specific literature, and observed case evidence.
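
A toy version of that kind of check (all predicates invented):

    % model synthesized from domain knowledge plus case evidence
    symptom(patient1, fever).
    symptom(patient1, cough).
    diagnosis(P, flu) :- symptom(P, fever), symptom(P, cough).

    % the LLM's claimed answer is then checked as a goal:
    % ?- diagnosis(patient1, flu).    succeeds -> confidence rises
    % ?- diagnosis(patient1, cold).   fails    -> mismatch, iterate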

There is a growing body of publications exploring various aspects of synthesis, e.g. references included in [1] are a good starting point.

[1] https://proceedings.neurips.cc/paper_files/paper/2024/file/8...

> I am once again shilling the idea that someone should find a way to glue Prolog and LLMs together for better reasoning agents.

There are definitely people researching ideas here. For my own part, I've been doing a lot of work with Jason[1], a very Prolog-like logic language / agent environment, with an eye towards how to integrate it with LLMs (and "other").

Nothing specific / exciting to share yet, but just thought I'd point out that there are people out there who see potential value in this sort of thing and are investigating it.

[1]: https://github.com/jason-lang/jason

IIRC IBM’s Watson (the one that played Jeopardy) used primitive NLP (imagine!) to form a tree of factual relations, then used this tree to construct Prolog queries that would produce an answer to a question. One could imagine that by swapping the NLP part out for an LLM, the model would have 1. a more thorough factual basis against which to write Prolog queries and 2. a better understanding of the queries it should write to get at answers (for instance, it might exploit more tenuous relations between facts than primitive NLP could).
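
For instance (clue and relations invented here), the extracted tree might compile down to facts and a query like:

    % relations extracted from the clue text
    authored(melville, moby_dick).
    published(moby_dick, 1851).

    % "the author of this novel published in 1851" becomes:
    % ?- published(Work, 1851), authored(Who, Work).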

  • Please tell me that's approximately what Palantir Ontology is, because if it isn't, I've no idea what it could be.

Prolog doesn't look like JavaScript or Python, so:

1. web devs are scared of it.

2. not enough training data?

I do remember having to wrestle to get Prolog to do what I wanted, but I haven't written any in ~10 years.

  • It's been a while since I have done web dev, but web devs back then were certainly not scared of any language. Web devs are like the ultimate polyglots. Or at least they were. I was regularly bouncing around between a half dozen languages when I was doing pro web dev. It was web devs who popularized numerous different languages to begin with simply because delivering apps through a browser allowed us a wide variety of options.

    • No web dev I have ever met could use Prolog well. I think your statement about web devs being polyglots is based upon the fact that web devs chase every industry fad. I think that has a lot to do with the nature and economics of web dev work (I'm not blaming the web devs for this). I mean the best way to succeed as a webdev is to write your own version of a framework that does the same thing as the last 10 frameworks but with better buzzword marketing.

      Generally speaking, all the languages they know are pretty similar to each other. Bolting on lambdas isn't the same as doing pure FP. Also, anytime a problem comes up where you would actually need a weird language based upon different math, those problems will be assigned to some other kind of developer (probably one with a really strong CS background).

If you are looking for AGI and you understand what is going on inside of it, then it is obviously not AGI.

Wouldn’t that be like a special case of neuro-symbolic programming?! There is plenty of research going on.

Can't find the links right now, but there were some papers on LLMs generating Prolog facts and queries to ground the reasoning part. Somebody else might have them around.

  • There's a lot of work in this area. See, e.g., the LoRP paper by Di et al. There's also a decent amount of work on the other side too, i.e., using LLMs to convert Prolog reasoning chains back into natural language.

> LLMs are bad at counting the number of r's in strawberry.

This is a tokenization issue, not an LLM issue: the model sees subword tokens rather than individual letters, so "strawberry" may arrive as a few multi-letter chunks in which the letter count simply isn't visible.

YES! I've run a few experiments on classical logic problems, and an LLM can spit out Prolog programs that solve the puzzle. Try it yourself: ask an LLM to write some Prolog to solve a problem, then copy-paste it into https://swish.swi-prolog.org/ and see if it runs.
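
For example, here's the shape of a tiny ordering puzzle an LLM will usually get right (a sketch; it runs on swish):

    % "Alice finished ahead of Bob, and Bob ahead of Carol."
    solution(Order) :-
        permutation([alice, bob, carol], Order),
        before(alice, bob, Order),
        before(bob, carol, Order).

    before(X, Y, Order) :-
        nth0(IX, Order, X),
        nth0(IY, Order, Y),
        IX < IY.

    % ?- solution(O).   % O = [alice, bob, carol]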