Comment by lolive

11 hours ago

The concept will re-emerge somehow. Webpages are, 99.99% of the time, a data structure formatted for humans. An LLM can barely infer that data structure from the webpage and connect it with the data structures of other pages. [The truth is that the LLM algorithm does not do that AT ALL internally, but from our user experience it really looks like it does.]

But when webpages die and data is accessed only through machine-to-machine APIs, we will no longer have this formatting for humans. Then we will need API-literate LLMs, meaning LLMs that can connect the dots between shitloads of unconnected JSONs. And if we don’t hint at which connections exist within that chaos of APIs, the LLM will not be able to apply its magic. In short: we need to be able to bring JSON to vector space. And JSON is absolutely not meant for that by default.

I agree that something like it will re-emerge. But I also think the semantic web has always been misunderstood and misapplied even by its proponents.

In my view, semantic web technologies should have been used to make databases interoperable, not to turn the hypertext web into an incredibly incomplete distributed database without any data quality process.

  • Are you referring to ActivityPub traffic (Mastodon, etc.)? Yes, they're nominally using JSON-LD, but most devs seem not to have understood that ActivityStreams is just a projection of RDF triples into JSON. Instead they go with the part they did understand (because JSON is better than markup, right?), and end up tunneling markdown or HTML through JSON strings and unnecessarily hardcoding their payloads in ORM layers in dynamic languages. If I were mean, I'd compare the situation to insects incapable of comprehending a 3D universe, clinging to syntactic surfaces that seem familiar.

    But what can you do? At this point, keeping federated alternatives, protocol-first designs, and multiple interworking implementations is more important than purity; it might well be the last successful initiative of its kind.
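To make the "projection of RDF triples into JSON" point concrete, here is a rough Python sketch that unrolls an ActivityStreams-style object into (subject, predicate, object) triples. It is not a real JSON-LD processor: the `to_triples` helper, the hand-picked `IRI_PROPS` set, the example.org URLs, and the requirement that every node carries an `id` are all assumptions made for illustration. Only the vocabulary namespace and the `rdf:type` IRI come from the actual specs.

```python
# Simplified illustration only; a real implementation would use a JSON-LD library.
AS = "https://www.w3.org/ns/activitystreams#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Assumption: a small hand-picked set of properties whose string values are IRIs.
IRI_PROPS = {"actor", "object", "attributedTo", "to"}


def to_triples(node, triples=None):
    """Flatten one JSON object (which must have an "id") into RDF-like triples."""
    if triples is None:
        triples = []
    subject = node["id"]
    for key, value in node.items():
        if key in ("@context", "id"):
            continue
        values = value if isinstance(value, list) else [value]
        for v in values:
            if key == "type":
                triples.append((subject, RDF_TYPE, AS + v))
            elif isinstance(v, dict):
                # Nested object: emit its triples first, then link to it by id.
                child, _ = to_triples(v, triples)
                triples.append((subject, AS + key, child))
            elif key in IRI_PROPS:
                triples.append((subject, AS + key, v))
            else:
                # Everything else is treated as a plain literal (quoted string).
                triples.append((subject, AS + key, f'"{v}"'))
    return subject, triples


activity = {
    "@context": "https://www.w3.org/ns/activitystreams",
    "id": "https://example.org/activities/1",
    "type": "Create",
    "actor": "https://example.org/users/alice",
    "object": {
        "id": "https://example.org/notes/1",
        "type": "Note",
        "content": "<p>Hello</p>",
    },
}

_, triples = to_triples(activity)
for triple in triples:
    print(triple)
```

Note that the HTML in `content` ends up as one opaque literal, which is the "tunneling markup through JSON strings" complaint: the graph structure stops at the string boundary.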

    • >Are you referring to ActivityPub traffic (Mastodon, etc.)?

      No, I wasn't even aware that they use anything RDF related.

  • I work with the Palantir Foundry stack, and I really think that this is the best implementation of semantic web principles I could ever imagine.

    And the current trend is really to connect the AI layer of Foundry with the ontology layer.

    Note: after rereading your comment, I must admit that Foundry enforces data co-locality and model co-locality (==a unified centrally managed ontology). Which are NOT what the semantic web wanted.