Comment by mepian
2 years ago
I wonder what is the closest thing to Cyc we have in the open source realm right now. I know that we have some pretty large knowledge bases, like Wikidata, but what about expert system shells or inference engines?
OWL and SPARQL inference engines that use RDF and DSMs - there are LISPy variants like datadog still kicking around, but there are some great, high performance reasoner FOSS projects, like StarDog or Neo4j
https://github.com/orgs/stardog-union/
Looks like "knowledge graph" and "semantic reasoner" are the search terms du jour. I haven't tracked these things since OpenCyc stopped being active.
Humans may not be able to effectively trudge through the creation of trillions of little rules and facts needed for an explicit and coherent expert world model, but LLMs definitely can be used for this.
You can actually do "inference" or "deduction" over large amounts of data using any old-fashioned RDBMS, and get broadly equal or better performance than the newfangled "graph" based systems. Graph databases may be a clear win for very specialized network analysis, but that is rare in practice.
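To make the RDBMS point concrete, here is a minimal sketch of deduction as a recursive SQL query, using Python's built-in sqlite3 module. The `is_a` table, the `infer_ancestors` helper, and the sample facts are all hypothetical names made up for illustration; it only shows transitive closure over one relation and says nothing about performance.

```python
import sqlite3

# Hypothetical sample facts: (child, parent) pairs of an "is a" relation.
FACTS = [
    ("dog", "mammal"),
    ("cat", "mammal"),
    ("mammal", "animal"),
    ("animal", "living_thing"),
]


def infer_ancestors(edges, start):
    """Return, sorted, every class reachable from `start` via is_a edges."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE is_a (child TEXT, parent TEXT)")
    conn.executemany("INSERT INTO is_a VALUES (?, ?)", edges)
    rows = conn.execute(
        """
        WITH RECURSIVE ancestors(name) AS (
            -- base case: direct parents of the starting class
            SELECT parent FROM is_a WHERE child = ?
            UNION
            -- recursive step: parents of anything already derived
            SELECT is_a.parent
            FROM is_a JOIN ancestors ON is_a.child = ancestors.name
        )
        SELECT name FROM ancestors ORDER BY name
        """,
        (start,),
    ).fetchall()
    conn.close()
    return [row[0] for row in rows]


if __name__ == "__main__":
    print(infer_ancestors(FACTS, "dog"))
```

Using `UNION` rather than `UNION ALL` drops duplicate rows, which also keeps the recursion from looping forever if the data contains a cycle.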
Graph databases win in flexibility and ease of writing the queries over graphs, honestly.
Of course the underlying storage can be (and often is) a bunch of specially prepared relational tables.
But the strength of graph databases comes from restating the problem in a different way, with query languages targeting the specific problem space.
Similarly there are tasks where SQL will be plainly better.
> high performance reasoner FOSS projects, like StarDog or Neo4j
StarDog is not FOSS. As I understand it, that GitHub repo holds various utilities around their proprietary package; the actual engine code is not open source.
Did you mean Datalog here?
There are some pretty huge ontology DBs in molecular biology, like GO or Reactome.
But they have never truly exploited logic-based inference, except for some small academic efforts.
I think that GO with GO-CAM is definitely going that way. Basic GO is rather simple and can't infer that much (as in, GO by itself has little classification or inference logic built in). Uberon, for anatomy, does use a lot of OWL power and shows that logic-based inference can help a lot.
Reactome is a graph, because that is the domain. But technically it does little with that fact (in my disappointed opinion).
Given that GO and Reactome are also relatively small academic efforts in general...
There are a few symbolic logic entailment engines that run atop OWL, the Web Ontology Language, some flavors of which are rough equivalents of Cyc's KBs. The challenge, though, is that the underlying approaches are computationally hard, so nobody really uses them in practice. Plus, the retrieval language associated with OWL is SPARQL, which also has little traction.
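For a sense of what "entailment" means here, a toy sketch: naive forward chaining over triples with just two RDFS-style rules (subclass transitivity and type inheritance). This is nowhere near a real OWL reasoner, and the `entail` function and the plain-string predicate names are made up for illustration.

```python
def entail(triples):
    """Naively apply two RDFS-style rules until no new triples appear.

    Rule 1: (A subClassOf B) and (B subClassOf C) => (A subClassOf C)
    Rule 2: (x type A)       and (A subClassOf B) => (x type B)
    """
    closure = set(triples)
    while True:
        derived = set()
        for (s, p, o) in closure:
            if p not in ("subClassOf", "type"):
                continue
            for (s2, p2, o2) in closure:
                # Join on the shared class: object of one, subject of the other.
                if p2 == "subClassOf" and s2 == o:
                    derived.add((s, p, o2))
        if derived <= closure:
            return closure  # fixed point reached, nothing new was derived
        closure |= derived


if __name__ == "__main__":
    facts = {
        ("Dog", "subClassOf", "Mammal"),
        ("Mammal", "subClassOf", "Animal"),
        ("rex", "type", "Dog"),
    }
    for triple in sorted(entail(facts) - facts):
        print(triple)
```

Each pass is quadratic in the number of triples, and that is with only two trivial rules. The more expressive OWL flavors are far harder than this, which is the computational problem described above.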
At a much lower level, I've been having fun hacking away at my Concludia side project over time. It's purely proposition level and will eventually support people being able to create their own arguments and contest others. http://concludia.org/
Nice! I've wanted to build something like this for a long time. It requires good faith argument construction from both parties, but it's useful to make the possibility available when you do find the small segment of people who can do that.
Very cool! I've had this idea for 20 years. I'm glad I didn't get around to making it.
I tried to make something along these lines (https://truebase.treenotation.org/).
My approach, Cyc's, and others are fundamentally flawed for the same reason. There's a low level reason why deep nets work and symbolic engines are very bad.
And what is that reason?
The language before language.
Err…OpenCyc?
Yep. And it may be just a subset, but it's pretty much the answer to
"I wonder what is the closest thing to Cyc we have in the open source realm right now?".
See:
https://github.com/therohk/opencyc-kb
https://github.com/bovlb/opencyc
https://github.com/asanchez75/opencyc
Outside of that, you have the entire world of Semantic Web projects, especially things like UMBEL[1], SUMO[2], YAMATO[3], and other "upper ontologies"[4] etc.
[1]: https://en.wikipedia.org/wiki/UMBEL
[2]: https://en.wikipedia.org/wiki/Suggested_Upper_Merged_Ontolog...
[3]: https://ceur-ws.org/Vol-2050/FOUST_paper_4.pdf
[4]: https://en.wikipedia.org/wiki/Upper_ontology
Unfortunately no longer (officially) available; and only a small subset anyway.