Comment by breck

2 years ago

I think before 2022 it was still an open question whether it was a good approach.

Now it's clear that knowledge graphs are far inferior to deep neural nets, but even so, few people can explain the _root_ reason why.

I don't think Lenat's bet was a waste. I think it was sensible based on the information at the time.

The decision to research it largely in secret, closed source, was, I think, a mistake.

I assume the problem with symbolic inference is that from a single inconsistent premise, classical logic can derive any statement whatsoever (the principle of explosion).

If that is so, then symbolic AI does not easily scale, because you cannot safely feed inconsistent information into it. Compare this to how humans and LLMs learn: both have no problem with inconsistent information, and yet, statistically speaking, humans still manage to produce "useful" information.
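
To make the explosion point concrete, here is a minimal sketch using a naive propositional resolution prover written just for this comment (this is not Cyc's machinery; the clause names, including "PigsFly", are purely illustrative):

```python
# Principle of explosion: from {P, ~P}, resolution refutation "proves" anything.
from itertools import combinations

def complement(lit):
    return lit[1:] if lit.startswith("~") else "~" + lit

def resolve(c1, c2):
    """Return all resolvents of two clauses (frozensets of string literals)."""
    out = []
    for lit in c1:
        if complement(lit) in c2:
            out.append((c1 - {lit}) | (c2 - {complement(lit)}))
    return out

def entails(kb, query):
    """Resolution refutation: KB |= query iff KB plus ~query yields the empty clause."""
    clauses = {frozenset(c) for c in kb} | {frozenset({complement(query)})}
    while True:
        new = set()
        for c1, c2 in combinations(clauses, 2):
            for r in resolve(c1, c2):
                if not r:            # derived the empty clause: contradiction
                    return True
                new.add(r)
        if new <= clauses:           # nothing new: query is not entailed
            return False
        clauses |= new

kb = [{"P"}, {"~P"}]                 # a single inconsistent pair of premises
print(entails(kb, "Q"))              # True
print(entails(kb, "PigsFly"))        # True -- literally anything follows
```

Once the knowledge base contains one contradiction, the prover will "prove" every query, which is exactly the scaling worry above.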

  • Based on the article, it seems like Cyc had ways to deal with inconsistency. I don't know the details of how they did it, but paraconsistent logics [0] provide a general way to prevent every statement from being provable from an inconsistency.

    [0] https://en.wikipedia.org/wiki/Paraconsistent_logic
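
    As a rough illustration of how such a logic blocks explosion, here is a toy evaluator for Priest's three-valued "Logic of Paradox" (one of the systems the article covers); the atoms P and Q and the code itself are my own sketch, not anything from Cyc:

    ```python
    # An argument is valid iff every valuation making all premises
    # designated (T or B) also makes the conclusion designated.
    from itertools import product

    T, B, F = 2, 1, 0                 # true, both-true-and-false, false
    DESIGNATED = {T, B}

    def neg(v):                       # negation: ~T=F, ~B=B, ~F=T
        return 2 - v

    def valid(premises, conclusion, atoms, values):
        """Check semantic validity over the given truth values."""
        for assignment in product(values, repeat=len(atoms)):
            env = dict(zip(atoms, assignment))
            if all(p(env) in DESIGNATED for p in premises):
                if conclusion(env) not in DESIGNATED:
                    return False      # counterexample found
        return True

    P = lambda env: env["P"]
    notP = lambda env: neg(env["P"])
    Q = lambda env: env["Q"]

    # Classical logic (only T and F): {P, ~P} |= Q holds vacuously -- explosion.
    print(valid([P, notP], Q, ["P", "Q"], [T, F]))    # True
    # Paraconsistent LP (T, B, F): P=B, Q=F is a counterexample.
    print(valid([P, notP], Q, ["P", "Q"], [T, B, F])) # False
    ```

    Assigning P the value both-true-and-false satisfies the contradictory premises without forcing Q, so explosion is simply not a valid inference in LP.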

    • Interesting, from the article:

      "Mathematical framework and rules of paraconsistent logic have been proposed as the activation function of an artificial neuron in order to build a neural network"

  • > Compare this to how humans and LLMs learn: both have no problem with inconsistent information.

    I don't have time to fully refute this claim, but it is very problematic.

    1. Even a very narrow framing of how neural networks deal with inconsistent training data would perhaps warrant a paper, if not a Ph.D. thesis. Maybe this has already been done? Here is the problem statement: given a DNN with a given topology, trained with SGD and a given error function, what happens when you present flatly contradictory training examples? What happens when the contradiction doesn't emerge until deeper levels of a network? Can we detect this? How? (A toy version of the flattest case is sketched after this list.)

    2. Do we really _want_ systems that passively tolerate inconsistent information? When I think of an ideal learning agent, I want one that would engage in the learning process and seek to resolve any apparent contradictions. I haven't actively researched this area, but I'm confident that some have, if only because Tom Mitchell at CMU emphasizes different learning paradigms in his well-known ML book. So hopefully enough people reading that book think "yeah, the usual training methods for NNs aren't really that interesting ... we can do better."

    3. Just because humans 'tolerate' inconsistent information in some cases doesn't mean they do so well, as compared to ideal Bayesian agents.

    4. There are "GOFAI" algorithms for probabilistic reasoning that are in many cases better than DNNs (a small example is sketched below).
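
    For point 1, the flattest possible version of the question: a single logistic unit trained with SGD on the same input labeled both 0 and 1. Everything here is illustrative, not a general answer.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    w, b = rng.normal(), rng.normal()           # one weight, one bias
    lr = 0.1
    x = 1.0                                      # a single fixed input
    labels = [0.0, 1.0]                          # flatly contradictory targets

    for step in range(5000):
        y = labels[step % 2]                     # alternate the contradiction
        p = 1.0 / (1.0 + np.exp(-(w * x + b)))   # sigmoid prediction
        grad = p - y                             # d(cross-entropy)/d(logit)
        w -= lr * grad * x
        b -= lr * grad

    print(1.0 / (1.0 + np.exp(-(w * x + b))))    # hovers near 0.5
    ```

    It neither crashes nor objects; it just settles near p = 0.5, averaging the contradiction away. Whether that is detectable, and whether it is what we want, is the interesting part.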
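
    And for point 4, a minimal sketch of what I mean by "GOFAI" probabilistic reasoning: exact inference by enumeration in a tiny Bayesian network. The numbers follow the familiar rain/sprinkler toy example; nothing here is tuned or learned.

    ```python
    # Three-node network: Rain -> Sprinkler, and (Sprinkler, Rain) -> GrassWet.
    from itertools import product

    P_rain = {True: 0.2, False: 0.8}
    P_sprinkler = {True: {True: 0.01, False: 0.99},    # P(Sprinkler | Rain)
                   False: {True: 0.4, False: 0.6}}
    P_wet = {(True, True): 0.99, (True, False): 0.9,   # P(Wet=T | Sprinkler, Rain)
             (False, True): 0.8, (False, False): 0.0}

    def joint(rain, sprinkler, wet):
        p = P_rain[rain] * P_sprinkler[rain][sprinkler]
        p_wet = P_wet[(sprinkler, rain)]
        return p * (p_wet if wet else 1.0 - p_wet)

    # P(Rain = T | GrassWet = T), summing out Sprinkler.
    num = sum(joint(True, s, True) for s in (True, False))
    den = sum(joint(r, s, True) for r, s in product((True, False), repeat=2))
    print(round(num / den, 3))                          # ~0.358
    ```

    The answer is exact and auditable, from three small conditional probability tables and no training run.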

> Now it's clear that knowledge graphs are far inferior to deep neural nets

No. It depends. In general, two technologies can’t be assessed independently of the application.

  • Give GOFAI anything other than clear definitions and unambiguous axioms (which describes most of the real world) and it falls apart; it can't even get started. There's a reason it was abandoned in NLP long before the likes of GPT.

    There is no class of problems deep nets can't handle. Will they always be the most efficient or best-performing solution? No, but a solution will be possible.

    • > Give GOFAI anything other than clear definitions and unambiguous axioms (which describes most of the real world) and it falls apart.

      You've overstated the claim. A narrower version would be more interesting and more informative. History is almost never as simple as you imply.

    • > There is no class of problems deep nets can't handle. Will they always be the most efficient or best-performing solution? No, but a solution will be possible.

      This assumes that all classes of problems reduce to functions which can be approximated, right, per the universal approximation theorems?

      Even for cases where the UAT applies (which is not everywhere, as I show next), your caveat understates the case. There are dramatically better and worse algorithms for differing problems.

      But I think a lot of people (including the commenter above) misunderstand or misapply the UATs. Think about the assumptions! UATs assume a fixed-length input, do they not? This breaks a correspondence with many classes of algorithms.*
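
      For concreteness, the classical (Cybenko/Hornik-style) statements have roughly this shape -- paraphrased from memory, so treat the exact form with care: for any continuous $f : [0,1]^d \to \mathbb{R}$ and any $\varepsilon > 0$ there exists a one-hidden-layer network $g_\theta$ with

      $$ \sup_{x \in [0,1]^d} \lvert f(x) - g_\theta(x) \rvert < \varepsilon. $$

      The input dimension $d$ and the compact domain are fixed throughout; nothing is said about variable-length inputs, and nothing about how to find $\theta$.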

      ## Example

      Let's make a DNN that sorts a list of numbers, shall we? But we can't cheat and only have it do pairwise comparisons -- that is not the full sorting problem. We have to input the list of numbers and output the list of sorted numbers. At run-time. With a variable-length list of inputs.

      So no single DNN will do! For every input length, we would need a different DNN, would we not? Training this collection of DNNs will be a whole lot of fun! It will make Bitcoin mining look like a poster-child of energy conservation. /s
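
      To make the structural point concrete, here is a sketch only: the MLP is untrained, and the helper make_mlp and its sizes are placeholders I made up.

      ```python
      # A plain MLP has its input dimension baked into the weight shapes,
      # so "a DNN that sorts lists" means one DNN per list length.
      import numpy as np

      def make_mlp(n, hidden=64, rng=np.random.default_rng(0)):
          """Untrained one-hidden-layer MLP mapping R^n -> R^n."""
          W1 = rng.normal(scale=0.1, size=(n, hidden))
          W2 = rng.normal(scale=0.1, size=(hidden, n))
          def forward(x):
              return np.maximum(x @ W1, 0.0) @ W2     # ReLU hidden layer
          return forward

      # One model per input length -- the "collection of DNNs" above.
      sorters = {n: make_mlp(n) for n in (3, 5, 10)}

      x3 = np.array([3.0, 1.0, 2.0])
      print(sorters[3](x3).shape)      # (3,) -- fine, though it still needs training

      x5 = np.array([5.0, 4.0, 3.0, 2.0, 1.0])
      try:
          sorters[3](x5)               # wrong length for this network
      except ValueError as e:
          print("shape mismatch:", e)  # a new network, and a new training run, is needed
      ```

      Every list length needs its own network and its own training run, which is exactly the objection.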

      * Or am I wrong? Is there a theoretical result I don't know about?

> but even still few people can explain the _root_ reason why.

_The_ (one) root reason? Ok, I’ll bite.

But you need to define your claim. What application?

  • > _The_ (one) root reason? Ok, I’ll bite.

    A "secret" hiding in plain sight.

    • Does writing riddles help anyone? Maybe it helps you, by giving you a smirk or a dopamine hit. But think about others, please.

      What is obvious to you is not obvious to others. I recommend explaining and clarifying if you care about persuasion.
