Comment by jodrellblank
3 months ago
> "4. Prolog is good at solving reasoning problems."
Plain Prolog's way of solving reasoning problems is effectively:
for person in [martha, brian, sarah, tyrone]:
    if timmy.parent == person:
        print("solved!")
You hard code some options, write a logical condition with placeholders, and Prolog brute-forces every option in every placeholder. It doesn't do reasoning.
Arguably it lets a human express reasoning problems better than other languages, by letting you write high-level code in a declarative way instead of allocating memory and choosing data types and initializing linked lists and so on, so you can focus on the reasoning. But that is no benefit to an LLM, which can output any language as easily as any other. And while that might have been nice compared to Pascal in 1975, it's not so different from modern garbage-collected high-level scripting languages. Arguably Python or JavaScript will benefit an LLM most, because there are so many training examples of them, compared to almost any other language.
>> You hard code some options, write a logical condition with placeholders, and Prolog brute-forces every option in every placeholder. It doesn't do reasoning.
SLD-Resolution with unification (Prolog's automated theorem proving algorithm) is the polar opposite of brute force: as the proof proceeds, the cardinality of the set of possible answers [1] decreases monotonically. Unification itself is nothing but a dirty hack to avoid having to ground the Herbrand base of a predicate before completing a proof; which is basically going from an NP-complete problem to a linear-time one (on average).
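To make the contrast with brute force concrete, here is a toy sketch of syntactic unification in Python (the term representation is invented for illustration): rather than enumerating candidates, unification binds a variable directly to the one term that can match, or fails immediately.

```python
# Toy syntactic unification, the core operation of Resolution.
# Representation (an assumption of this sketch): a variable is a string
# starting with an uppercase letter; a compound term is a tuple
# (functor, arg1, arg2, ...); anything else is a constant.

def is_var(t):
    return isinstance(t, str) and t[:1].isupper()

def walk(t, subst):
    # Follow variable bindings until a non-variable or an unbound variable.
    while is_var(t) and t in subst:
        t = subst[t]
    return t

def unify(a, b, subst=None):
    """Return a substitution making a and b equal, or None on failure."""
    if subst is None:
        subst = {}
    a, b = walk(a, subst), walk(b, subst)
    if a == b:
        return subst
    if is_var(a):
        return {**subst, a: b}
    if is_var(b):
        return {**subst, b: a}
    if isinstance(a, tuple) and isinstance(b, tuple) and len(a) == len(b):
        for x, y in zip(a, b):
            subst = unify(x, y, subst)
            if subst is None:
                return None
        return subst
    return None

# parent(X, timmy) unifies with parent(martha, timmy), binding X in one step,
# instead of testing every candidate for X blindly:
print(unify(("parent", "X", "timmy"), ("parent", "martha", "timmy")))
# parent(X, timmy) vs parent(martha, bob) fails at once on the mismatch:
print(unify(("parent", "X", "timmy"), ("parent", "martha", "bob")))
```

The point of the sketch: a failed match prunes that entire line of inquiry without any enumeration over candidate bindings, which is why each Resolution step can only shrink the remaining search.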
Besides which I find it very difficult to see how a language with an automated theorem prover for an interpreter "doesn't do reasoning". If automated theorem proving is not reasoning, what is?
___________________
[1] More precisely, the resolution closure.
> "as the proof proceeds, the cardinality of the set of possible answers [1] decreases"
In the sense that it cuts off part of the search tree where answers cannot be found?
A goal guarded by a condition that can never succeed (like the `X > 5` test in your example) will never do the slow_computation - but if it did, it would come up with the same result. How is that the polar opposite of brute force, rather than an optimization of brute force?
If a language has tail call optimization then it can handle deeper recursive calls with less memory. Without TCO it would do the same thing and get the same result but using more memory, assuming it had enough memory. TCO and non-TCO aren't polar opposites, they are almost the same.
Rather, in the sense that during a Resolution-refutation proof, every time a new Resolution step is taken, the number of possible subsequent Resolution steps either gets smaller or stays the same (i.e. "decreases monotonically"). That's how we know for sure that if the proof is decidable there comes a point at which no more Resolution steps are left, and either the empty clause is all that remains, or some non-empty clause remains that cannot be reduced further by Resolution.
So basically Resolution gets rid of more and more irrelevant ...stuff as it goes. That's what I mean when I say it's "the polar opposite of brute force": it's actually pretty smart, and it avoids doing the dumb thing of having to process all the things all the time before it can reach a conclusion.
Note that this is the case for Resolution, in the general sense, not just SLD-Resolution, so it does not depend on any particular search strategy.
I believe SLD-Resolution specifically (which is the kind of Resolution used in Prolog) goes much faster, first because it's "[L]inear" (i.e. in any Resolution step one clause must be one of the resolvents of the last step) and second because it's restricted to [D]efinite clauses and, as a result, there is only one resolvent at each new step and it's a single Horn goal so the search (of the SLD-Tree) branches in constant time.
Refs:
J. Alan Robinson, "A computer-oriented logic based on the Resolution principle" [1965 paper that introduced Resolution]
https://dl.acm.org/doi/10.1145/321250.321253
Robert Kowalski, "Predicate Logic as a Programming Language"
https://www.researchgate.net/publication/221330242_Predicate... [1974 paper that introduced SLD-Resolution]
2 replies →
I don't understand (the point of) your example. In all branches of the search `X > 5` will never be `true`, so yeah, `slow_computation` will not be reached. How does that relate to your point of it being "brute force"?
>> but if it did, it would come up with the same result
Meaning either changing the condition or the order of the clauses. How do you expect Prolog to proceed to `slow_computation` when you have declared a statement (X > 5) that is always false before it?
1 reply →
Prolog was introduced to capture natural language - in a logic/symbolic way that didn't prove as powerful as today's LLMs, for sure, but this still means there is a large corpus of direct English-to-Prolog mappings available for training, and the mapping rules are also much more straightforward by design. You can pretty much translate simple sentences 1:1 into Prolog clauses, as in the classic boring examples.
This is being taken advantage of in Prolog code generation using LLMs. In the Quantum Prolog example, the LLM is also instructed not to generate search strategies/algorithms but just planning domain representation and action clauses for changing those domain state clauses which is natural enough in vanilla Prolog.
The results are quite a bit more powerful, close to end user problems, and upward in the food chain compared to the usual LLM coding tasks for Python and JavaScript such as boilerplate code generation and similarly idiosyncratic problems.
"large corpus" - large compared to the amount of Python on Github or the amount of JavaScript on all the webpages Google has ever indexed? Quantum Prolog doesn't have any relevant looking DuckDuckGo results, I found it in an old comment of yours here[1] but the link goes to a redirect which is blocked by uBlock rules and on to several more redirects beyond which I didn't get to a page. In your linked comment you write:
> "has convenient built-in recursive-decent parsing with backtracking built-in into the language semantics, but also has bottom-up parsing facilities for defining operator precedence parsers. That's why it's very convenient for building DSLs"
which I agree with, for humans. What I am arguing is that LLMs don't have the same notion of "convenient". Them dumping hundreds of lines of convoluted 'unreadable' Python (or C or Go or anything) to implement "half of Common Lisp" or "half of a Prolog engine" for a single task is fine, they don't have to read it, and it gets the same result. What would be different is if it got a significantly better result, which I would find interesting but haven't seen a good reason why it would.
[1] https://news.ycombinator.com/item?id=40523633
It's a Horn clause resolver... that's exactly the kind of reasoning that LLMs are bad at. I have no idea how to graft Prolog to an LLM, but if you can graft any programming language to one, you can graft Prolog more easily.
Also, that you push Python and JavaScript makes me think you don't know many languages. Those are terrible languages to try to graft to anything. Just because you only know those 2 languages doesn't make them good choices for something like this. Learn a real language, Physicist.
> Also, that you push Python and JavaScript
I didn't push them.
> Those are terrible languages to try to graft to anything.
Web browsers, Blender, LibreOffice and Excel all use those languages for embedded scripting. They're fine.
> Just because you only know those 2 languages doesn't make them good choices for something like this.
You misunderstood my claim and are refuting something different. I said there is more training data for LLMs to use to generate Python and JavaScript, than Prolog.
I'm not. Python and JS are scripting languages. And in this case, we want something that models formal logic. We are hammering in a nail, you picked up a screwdriver and I am telling you to use a claw hammer.
5 replies →
No call for talking down at people. No one has ever been convinced by being belittled.
> I have no idea how to graft Prolog to an LLM
Wrapping either the SWI-Prolog MQI, or even simpler an existing Python interface like janus_swi, in a simple MCP is probably an easy weekend project. Tuning the prompting to get an LLM to reliably and effectively choose to use it when it would benefit from symbolic reasoning may be harder, though.
We would begin by having a Prolog server of some kind (I have no idea if Prolog is parallelized but it should very well be if we're dealing with Horn Clauses).
There would be MCP bindings to said server, which would be accessible upon request. The LLM would provide a message, it could even formulate Prolog statements per a structured prompt, and then await the result, and then continue.
> Its a Horn clause resolver...that's exactly the kind of reasoning that LLMs are bad at. I have no idea how to graft Prolog to an LLM but if you can graft any programming language to it, you can graft Prolog more easily.
By grafting the LLM into Prolog and not the other way around?
This sparked a really fascinating discussion, I don't know if anyone will see this but thanks everyone for sharing your thoughts :)
I understand your point - to an LLM there's no meaningful difference between one Turing-complete language and another. I'll concede that I don't have a counterargument, and perhaps it doesn't need to be Prolog - though my hunch is that LLMs tend to give better results when using purpose-built tools for a given type of problem.
The only loose end I want to address is the idea of "doing reasoning."
This isn't an AGI proposal (I was careful to say "good at writing prolog") just an augmentation that (as a user) I haven't yet seen applied in practice. But neither have I seen it convincingly dismissed.
The idea is the LLM would act like an NLP parser that gradually populates a prolog ontology, like building a logic jail one brick at a time.
The result would be a living breathing knowledge base which constrains and informs the LLM's outputs.
The punchline is that I don't even know any prolog myself, I just think it's a neat idea.
Of course it does "reasoning"; what do you think reasoning is? From a quick Google: "the action of thinking about something in a logical, sensible way". Prolog searches through a space of logical propositions (constraints) and finds conditions that lead to solutions (if any exist).
(a) Try adding another 100 or 1000 interlocking propositions to your problem. It will find solutions or tell you none exists. (b) You can verify the solutions yourself. You don't get that with imperative descriptions of problems. (c) Good luck sandboxing Python or JavaScript with the threat of prompt injection still unsolved.
Of course it doesn't "do reasoning", why do you think "following the instructions you gave it in the stupidest way imaginable" is 'obviously' reasoning? I think one definition of reasoning is being able to come up with any better-than-brute-force thing that you haven't been explicitly told to use on this problem.
Prolog isn't "thinking". Not about anything, not about your problem, your code, its implementation, or any background knowledge. Prolog cannot reason that your problem is isomorphic to another problem with a known solution. It cannot come up with an expression transform that hasn't been hard-coded into the interpreter which would reduce the amount of work involved in getting to a solution. It cannot look at your code, reason about it, and make a logical leap over some of the code without executing it (in a way that hasn't been hard-coded into it by the programmer/implementer). It cannot reason that your problem would be better solved with SLG resolution (tabling) instead of SLD resolution (depth first search). The point of my example being pseudo-Python was to make it clear that plain Prolog (meaning no constraint solver, no metaprogramming), is not reasoning. It's no more reasoning than that Python loop is reasoning.
If you ask me to find the largest prime number between 1 and 1000, I might think to skip even numbers; I might think to search down from 1000 instead of up from 1. I might not come up with a good strategy, but I will reason about the problem. Prolog will not. You code what it will do, and it will slavishly do what you coded. If you code counting 1-1000 it will do that. If you code the Sieve of Eratosthenes it will do that instead.
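The distinction between the two coded strategies can be written out directly. A small Python sketch (the function names are made up for illustration): the program executes whichever strategy was coded, exactly as described above.

```python
# Two coded strategies for "largest prime in 1..1000": the program
# slavishly runs whichever one was written; it never invents a better one.

def is_prime(n):
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    f = 3
    while f * f <= n:
        if n % f == 0:
            return False
        f += 2
    return True

def brute_force():
    # count up from 1, remembering the last prime seen
    best = None
    for n in range(1, 1001):
        if is_prime(n):
            best = n
    return best

def smarter():
    # the "human" strategy: search down from 999, odd numbers only
    for n in range(999, 1, -2):
        if is_prime(n):
            return n

print(brute_force(), smarter())  # both print 997
```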
The disagreement you have with the person you are replying to just boils down to a difference in the definition of "reasoning."
It's a Horn clause interpreter. Maybe look up what that is before commenting on it. Clearly you don't have a good grasp of computer science concepts or math, based upon your comments here. You also don't seem to understand the AI/ML definition of reasoning (which is based in formal logic, much like Prolog itself).
Python and Prolog are based upon completely different kinds of math. The only thing they share is that they are both Turing complete. But being Turing complete isn't a strong or complete mathematical definition of a programming language. This is especially true for Prolog which is very different from other languages, especially Python. You shouldn't even think of Prolog as a programming language, think of it as a type of logic system (or solver).
2 replies →
Contrary to what everyone else is saying, I think you're completely correct. Using it for AI or "reasoning" is a hopeless dead end, even if people wish otherwise. However I've found that Prolog is an excellent language for expressing certain types of problems in a very concise way, like parsers, compilers, and assemblers (and many more). The whole concept of using a predicate in different modes is actually very useful in a pragmatic way for a lot of problems.
When you add in the constraint solving extensions (CLP(Z) and CLP(B) and so on) it becomes even more powerful, since you can essentially mix vanilla Prolog code with solver tools.
The reason why you can write parsers with Prolog is because you can cast the problem of determining whether a string belongs to a language or not as a proof, and, in Prolog, express it as a set of Definite Clauses, particularly with the syntactic sugar of Definite Clause Grammars that give you an executable grammar that acts as both acceptor and generator and is equivalent to a left-corner parser.
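As a crude illustration of the acceptor/generator duality for the toy language aⁿbⁿ - in Python rather than a DCG, so the two directions have to be written as separate functions, where a DCG gives you both from one definition:

```python
# A rough Python analogue of what a DCG for a^n b^n provides: a
# definition usable both to accept strings and to enumerate members.

def accepts(s):
    # acceptor: n 'a's followed by exactly n 'b's, for some n >= 0
    n = len(s) // 2
    return len(s) % 2 == 0 and s == "a" * n + "b" * n

def generate(limit):
    # run "the other way": enumerate members of the language
    return ["a" * n + "b" * n for n in range(limit)]

print(accepts("aaabbb"), accepts("aab"))  # True False
print(generate(4))  # ['', 'ab', 'aabb', 'aaabbb']
```

In Prolog the single DCG rule plays both roles depending on which arguments are bound; here that duality had to be coded twice, which is the convenience being claimed.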
Now, with that in mind, I'd like to understand how you and the OP reconcile the ability to carry out a formal proof with the inability to do reasoning. How is it not reasoning, if you're doing a proof? If a proof is not reasoning, then what is?
Clearly people write parsers in C and C++ and Pascal and OCaml, etc. What does it mean to come in with "the reason you can write parsers with Prolog..."? I'm not claiming that reason is incorrect, I'm handwaving it away as irrelevant and academic. Like saying that Lisp map() is better than Python map() because Lisp map is based on formal lambda calculus and Python map is an inferior imitation for blub programmers. When a programmer maps a function over a list and gets a result, it's a distinction without a difference. When a programmer writes a getchar() peek() and goto state-machine parser with no formalism, and it works, what difference does the formalism behind the implementation practically make?
Yes, maybe the Prolog way means concise code that makes it easier for a human to tell whether the code is a correct expression of the intent, but an LLM won't look at it like that. Whatever the formalism brings, it hasn't been enough to make every parser of the last 50 years be written in Prolog. Therefore it isn't of any particular interest or benefit, except academic.
> both acceptor and generator
Also academically interesting but practically useless due to the combinatorial explosion of "all possible valid grammars" after the utterly basic "aaaaabbbbbbbbbbbb" examples.
> "how you and the OP reconcile the ability to carry out a formal proof with the inability to do reasoning. How is it not reasoning, if you're doing a proof? If a proof is not reasoning, then what is?"
If drawing a painting is art, is it art if a computer pulls up a picture of a painting and shows it on screen? No. If a human coded the proof into a computer, the human is reasoning, the computer isn't. If the computer comes up with the proof, the computer is reasoning. Otherwise you're in a situation where dominos falling over is "doing reasoning" because it can be expressed formally as a chain of connected events where the last one only falls if the whole chain is built properly, and that's absurdum.
13 replies →
Even in your example (which is obviously not a correct representation of Prolog), that code will work X orders of magnitude faster, and with 100% reliability, compared to the far less reliable reasoning capabilities of LLMs.
This is not the point though
Algorithmically there's nothing wrong with using BFS/DFS to do reasoning as long as the logic is correct and the search space is constrained sufficiently. The hard part has always been doing the constraining, which LLMs seem to be rather good at.
> This is not the point though
Could you expand on what the point is? That the author's opinion, without much justification, is that this is not reasoning?
What makes you think your brain isn't also brute forcing potential solutions subconsciously and only surfacing the useful results?
Because I can solve problems that would take the age of the universe to brute force, without waiting the age of the universe. So can you: start counting at 1, increment the counter up to 10^8000, then print the counter value.
Prolog: 1, 2, 3, 4, 5 ...
You and me instantly: 10^8000
The brain can still use other means of working in addition to brute forcing solutions. For example, how would you go about solving the chess puzzle of eight queens that doesn't involve going through the potential positions and then filtering out the options that don't match the criteria for the solution?
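The generate-and-test approach described above can be written out directly. A plain Python sketch (not Prolog) of eight queens by enumerate-and-filter:

```python
# Eight queens by explicit generate-and-test: enumerate candidate
# placements, filter out the ones that violate the constraints.
from itertools import permutations

def eight_queens():
    solutions = []
    # One queen per row; a permutation assigns each row a distinct column,
    # so rows and columns can never clash - only diagonals need testing.
    for cols in permutations(range(8)):
        if all(abs(cols[i] - cols[j]) != j - i
               for i in range(8) for j in range(i + 1, 8)):
            solutions.append(cols)
    return solutions

print(len(eight_queens()))  # 92 solutions
```

Even here the representation already prunes most of the space (8! permutations instead of 64^8 raw placements), which is the kind of constraining the surrounding comments are arguing about.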
Prolog can also evaluate mathematical expressions directly as well.
There are a whole lot of undecidable (or effectively undecidable) edge cases that can be adequately covered. As a matter of fact, Decidability Logic is compatible with Prolog.
Can you try calculating 101 * 70 in your head?
I think therefore I am calculator?
Very easy to solve, just like it is easy to solve many other ones once you know the tricks.
I recommend this book: https://www.amazon.com/Secrets-Mental-Math-Mathemagicians-Ca...
10 replies →
I can absolutely try this. Doesn't mean I'll solve it. If I solve it there's no guarantee I'll be correct. Math gets way harder when I don't have a legitimate need to do it. This falls in the "no legit need" bucket, so my mind went right to "100 * 70, good enough."
1 reply →
Um, that's really easy to do in your head, there's no carrying or anything? 7,070
7 * 101 = 707 * 10 = 7,070
And computers don't brute-force multiplication either, so I'm not sure how this is relevant to the comment above?
4 replies →
Human brains are insanely powerful pattern-matching and shortcut-taking machines. There's very little brute forcing going on.
Your second sentence contradicts your first.
3 replies →
Just intuition ;)
Everything you've written here is an invalid over-reduction, I presume because you aren't terribly well versed in Prolog. Your simplification is not only outright erroneous in a few places, but essentially excludes every single facet of Prolog that makes it a Turing-complete logic language. What you are presenting Prolog as would be like presenting C as a language where all you can do is perform operations on constants, without even being able to define functions or preprocessor macros. To assert that's what C is would be completely and obviously ludicrous, but not so many people are familiar enough with Prolog or its underlying formalisms to call you out on this.
Firstly, we must set one thing straight: Prolog definitionally does reasoning. Formal reasoning. This isn't debatable, it's a simple fact. It implements resolution (a computationally friendly inference rule over computationally-friendly logical clauses) that's sound and refutation complete, and made practical through unification. Your example is not even remotely close to how Prolog actually works, and excludes much of the extra-logical aspects that Prolog implements. Stripping it of any of this effectively changes the language beyond recognition.
> Plain Prolog's way of solving reasoning problems is effectively:
No. There is no cognate to what you wrote anywhere in how Prolog works. What you have here doesn't even qualify as a forward chaining system, though that's what it's closest to given it's somewhat how top-down systems work with their ruleset. For it to even approach a weaker forward chaining system like CLIPS, that would have to be a list of rules which require arbitrary computation and may mutate the list of rules it's operating on. A simple iteration over a list testing for conditions doesn't even remotely cut it, and again that's still not Prolog even if we switch to a top-down approach by enabling tabling.
> You hard code some options
A Prolog knowledgebase is not hardcoded.
> write a logical condition with placeholders
A Horn clause is not a "logical condition", and those "placeholders" are just normal variables.
> and Prolog brute-forces every option in every placeholder.
Absolutely not. It traverses a graph proving things, and when it cannot prove something it backtracks and tries a different route, or otherwise fails. This is of course without getting into impure Prolog, or the extra-logical aspects it implements. It's a fundamentally different foundation of computation which is entirely geared towards formal reasoning.
> And that might have been nice compared to Pascal in 1975, it's not so different to modern garbage collected high level scripting languages.
It is extremely different, and the only reason you believe this is because you don't understand Prolog in the slightest, as indicated by the unsoundness of essentially everything you wrote. Prolog is as different from something like Javascript as a neural network with memory is.
The original suggestion was that LLMs should emit Prolog code to test their ideas. My reply was that there is nothing magic in Prolog which would help them over any other language, but there is something in other languages which would help them over Prolog - namely more training data. My example was to illustrate that, not to say Prolog literally is Python. Of course it's simplified to the point of being inaccurate, it's three lines, how could it not be.
> "A Prolog knowledgebase is not hardcoded."
No, it can be asserted and retracted, or consult a SQL database or something, but it's only going to search the knowledge the LLM told it to - in that sense there is no benefit to an LLM to emit Prolog over Python since it could emit the facts/rules/test cases/test conditions in any format it likes, it doesn't have any attraction to concise, clean, clear, expressive, output.
> "those "placeholders" are just normal variables"
Yes, just normal variables - and not something magical or special that Prolog has that other languages don't have.
> "Absolutely not. It traverses a graph proving things,"
Yes, though: it traverses the code tree by depth-first walk. If the tree has no infinite left-recursion coded in it, that is a brute-force walk. It proves things by ordinary programmatic tests that exist in other languages - value equality, structure equality, membership, expression evaluation, expression comparison, user code execution - not by intuition, logical leaps, analogy, or flashes of insight. That is, not particularly more useful than other languages which an LLM could emit.
> "Your example is not even remotely close to how Prolog actually works"
> "There is no cognate to what you wrote anywhere in how Prolog works"
> "It is extremely different"
Well:
That's a loop over the people, filling in the variable X. Prolog is not looking at Ancestry.com to find who Timmy's parents are. It's not saying "ooh you have a SQLite database called family_tree I can look at". That it's doing it by a different computational foundation doesn't seem relevant when that's used to give it the same abilities.
My point is that Prolog is "just" a programming language, and not the magic that a lot of people feel like it is, and therefore is not going to add great new abilities to LLMs that haven't been discovered because of Prolog's obscurity. If adding code to an LLM would help, adding Python to it would help. If that's not true, that would be interesting - someone should make that case with details.
> "and the only reason you believe this is because you don't understand Prolog in the slightest"
This thread would be more interesting to everybody if you and hunterpayne would stop fantasizing about me, and instead explain why Prolog's fundamentally different foundation makes it a particularly good language for LLMs to emit to test their other output - given that they can emit virtually endless quantities of any language, custom writing any amount of task-specific code on the fly.
The discussion has become contentious and that's very unfortunate because there's clearly some confusion about Prolog and that's always a great opportunity to learn.
You say:
>> Yes, though, it traverses the code tree by depth first walk.
Here's what I suggest: try to think what, exactly, is the data structure searched by Depth First Search during Prolog's execution.
You'll find that this structure is what we call an SLD-Tree. That's a tree where the root is a Horn goal that begins the proof (i.e. the thing we want to dis-prove, since we're doing a proof by refutation); every other node is a new goal derived during the proof; every branch is a Resolution step between one goal and one definite program clause from a Prolog program; and every leaf of a finite branch is either the empty clause, signalling the success of the proof by refutation, or a non-empty goal that cannot be further reduced, which signals the failure of the proof. So that's basically a proof tree, and the search is... a proof.
So Prolog is not just searching a list to find an element, say. It's searching a proof tree to find a proof. It just so happens that searching a proof tree to find a proof corresponds to the execution of a program. But while you can use a search to carry out a proof, not every search is a proof. You have to get your ducks in a row the right way around otherwise, yeah, all you have is a search. This is not magick, it's just ... computer science.
It should go without saying that you can do the same thing with Python, or with JavaScript, or with any other Turing-complete language, but then you'd basically have to re-invent Prolog and implement it in that other language; an ad-hoc, informally specified, bug-ridden and slow implementation of half of Prolog, most likely.
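To illustrate what that "half of Prolog" would look like, here is a deliberately ad-hoc Python sketch of a unifier plus a depth-first SLD-style solver over definite clauses (the database, predicate names, and term representation are all made up for the example):

```python
# An ad-hoc, informally specified sketch of "half of Prolog" in Python:
# unification plus depth-first SLD-style resolution over definite clauses.
# Variables are uppercase strings; terms are (functor, args...) tuples.

def is_var(t):
    return isinstance(t, str) and t[:1].isupper()

def walk(t, s):
    while is_var(t) and t in s:
        t = s[t]
    return t

def unify(a, b, s):
    a, b = walk(a, s), walk(b, s)
    if a == b:
        return s
    if is_var(a):
        return {**s, a: b}
    if is_var(b):
        return {**s, b: a}
    if isinstance(a, tuple) and isinstance(b, tuple) and len(a) == len(b):
        for x, y in zip(a, b):
            s = unify(x, y, s)
            if s is None:
                return None
        return s
    return None

def rename(term, n):
    # give clause variables fresh names for each use
    if is_var(term):
        return f"{term}_{n}"
    if isinstance(term, tuple):
        return tuple(rename(t, n) for t in term)
    return term

def solve(goals, clauses, s, depth=0):
    # depth-first search of the SLD-tree: no goals left means success
    if not goals:
        yield s
        return
    goal, rest = goals[0], goals[1:]
    for head, body in clauses:
        head, body = rename(head, depth), [rename(b, depth) for b in body]
        s2 = unify(goal, head, s)
        if s2 is not None:  # on failure, the loop backtracks to the next clause
            yield from solve(body + rest, clauses, s2, depth + 1)

# A hypothetical family database: facts are clauses with empty bodies.
db = [
    (("parent", "martha", "brian"), []),
    (("parent", "brian", "timmy"), []),
    (("grandparent", "X", "Z"), [("parent", "X", "Y"), ("parent", "Y", "Z")]),
]

for s in solve([("grandparent", "G", "timmy")], db, {}):
    print(walk("G", s))  # martha
```

It "works" for this tiny example, but it has none of Prolog's indexing, occurs-check options, cuts, tabling, or constraint solvers - which is rather the point of the quip above.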
This is all without examining whether you can fix LLMs' lack of reasoning by funneling their output through a Prolog interpreter. I personally don't think that's a great idea. Let's see, what was that soundbite... "intelligence is shifting the test part of generate-test into the generate part" [1]. That's clearly not what pushing LLM output into a Prolog interpreter achieves. Clearly, if good, old-fashioned symbolic AI has to be combined with statistical language modelling, that has to happen much earlier in the statistical language modelling process. Not when it's already done and dusted and we have a language model; which is only statistical. Like putting the bubbles in the soda before you serve the drink, not after, the logic has to go into the language modelling before the modelling is done, not after. Otherwise there's no way I can see that the logic can control the modelling. Then all you have is generate-and-test, and it's meh as usual. Although note that much recent work on carrying out mathematical proofs with LLMs does exactly that, e.g. DeepMind's AlphaProof. Generate-and-test works, it's just dumb and inefficient, and you can only really make it work if you have the same resources as DeepMind or equivalent.
_____________
[1] Marvin Minsky via Subbarao Kambhampati and students: https://arxiv.org/html/2504.09762v1
This is a philosophical argument.
The way to look at this is first to pin down what we mean by Human Commonsense Reasoning (https://en.wikipedia.org/wiki/Commonsense_reasoning). Obviously this is quite nebulous and cannot be defined precisely, but OG AI researchers have done a lot to identify and formalize subsets of human reasoning so that it can be automated by languages/machines.
See the section "Successes in automated commonsense reasoning" on the above Wikipedia page - https://en.wikipedia.org/wiki/Commonsense_reasoning#Successe...
Prolog implements a language for logical interpretation only within the formalized subset of human reasoning mentioned above. Now note that all our scientific advances have come from our ability to formalize, and thus automate, what was previously only heuristics. Thus if I were to move more real-world heuristics (which is what a lot of human reasoning consists of) into some formal model, then Prolog (or, say, LLMs) could be made to reason about it better.
See the paper Commonsense Reasoning in Prolog for some approaches - https://dl.acm.org/doi/10.1145/322917.322939
Note however the paper beautifully states at the end;
Prolog itself is all form and no content and contains no knowledge. All the tasks, such as choosing a vocabulary of symbols to represent concepts and formulating appropriate sentences to represent knowledge, are left to the users and are obviously domain-dependent. ... For each particular application, it will be necessary to provide some domain-dependent information to guide the program writing. This is true for any formal languages. Knowledge is power. Any formalism provides us with no help in identifying the right concepts and knowledge in the first place.
So real-world knowledge encoded into a formalism can be reasoned about by Prolog. LLMs claim to do the same on unstructured/non-formalized data, which is untenable. A machine cannot do "magic" but can only interpret formalized/structured data according to some rules. Note that the set of rules can be dynamically increased by ML, but ultimately they are just rules which interact with one another in unpredictable ways. Now you can see where Prolog might be useful with LLMs. You can impose structure on the LLM's view of the world, and also force it to confine itself only to the reasoning it can do within this world-view, by asking it to do predominantly Prolog-like reasoning - but you don't turn the LLM into just a Prolog interpreter. We don't know how this interacts with the other heuristic/formal-reasoning parts of LLMs (e.g. reinforcement learning), but it does seem to give more predictable and more correct output. This can then be iterated upon to get a final acceptable result.
PS: You might find the book Thinking and Deciding by Jonathan Baron useful for background knowledge - https://www.cambridge.org/highereducation/books/thinking-and...