
Comment by mananaysiempre

1 day ago

A bit disappointing that (IIUC) for the common parsers you have to say everything twice, in HTML and in the accompanying JSON-LD form even though RDFa exists for the exact purpose of letting you point at the values already present in your markup. (Admittedly RDFa is perhaps too flexible for its own good when you just want to mark up some stuff, but if you’re writing a full parser anyway dealing with a bit of excessive cleverness in the format should not be too bad.)

And then there is https://schema.org/. It's the item* attributes, e.g.: https://developer.mozilla.org/en-US/docs/Web/HTML/Reference/... Also Dublin Core in <meta> tags. Why do they keep adding conflicting metadata formats to HTML!?!

  • They don't conflict; they were designed to work together. You can have schema.org (in JSON-LD, RDFa, or microdata) on the same page as Dublin Core, etc.

    For example, schema.org's Person type [1] has no explicit property for a nickname, but the FOAF vocabulary does [2].

    Just add FOAF to the JSON-LD context:

        {
          "@context": {
            "@vocab": "https://schema.org/",
            "foaf": "http://xmlns.com/foaf/0.1/",
            "pronouns": "https://schema.org/pronouns"
          },

    You can now use the FOAF nick property:

          "@type": "Person",
          "givenName": "Timothy",
          "familyName": "Berners-Lee",
          "foaf:nick": "TBL"
        }

    You can do the same thing with Dublin Core, DBpedia, etc.

    [1]: https://schema.org/Person

    [2]: https://xmlns.com/foaf/spec/#term_nick
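To make the prefix mechanism above concrete, here is a minimal Python sketch that joins the two fragments into one document and expands each key to a full IRI. `expand_key` is a hypothetical helper written for this illustration; it covers only compact IRIs (`prefix:name`) and the `@vocab` fallback, not the full JSON-LD expansion algorithm.

```python
import json

# The two fragments from the comment above, joined into one document.
DOC = json.loads("""
{
  "@context": {
    "@vocab": "https://schema.org/",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "pronouns": "https://schema.org/pronouns"
  },
  "@type": "Person",
  "givenName": "Timothy",
  "familyName": "Berners-Lee",
  "foaf:nick": "TBL"
}
""")


def expand_key(key: str, context: dict) -> str:
    """Expand a property name to a full IRI (simplified, hypothetical helper)."""
    if key.startswith("@"):
        return key  # keywords such as @type are left alone
    prefix, sep, suffix = key.partition(":")
    if sep and isinstance(context.get(prefix), str):
        return context[prefix] + suffix  # compact IRI, e.g. foaf:nick
    return context.get("@vocab", "") + key  # bare term: default vocabulary


expanded = {
    expand_key(key, DOC["@context"]): value
    for key, value in DOC.items()
    if key != "@context"
}
print(json.dumps(expanded, indent=2))
```

The point of the sketch is only that `foaf:nick` and `givenName` end up in different namespaces even though they sit side by side in the same object. A real consumer would use a JSON-LD processor rather than this helper.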

  • I think if you are using Dublin Core, it’s because you’re a library. Maybe I am off the mark, but that is the sense I get from this—not all these standards should be used for all pages on the web.

    I think you should just think about what metadata you actually care about, and the main metadata I care about (choose your own list) is authorship, publish date, last update, subject keywords, thumbnail (OpenGraph 1200x630), and summary.

    There’s a long list of additional metadata that I could put in my webpages because there are standardized ways to do it, but, why bother?

    • Dublin Core covers much the same ground as schema.org's CreativeWork. If you have a creative work (audiobook, short story, news article, etc.) then Dublin Core is applicable, in addition to the corresponding CreativeWork subtype.

      And yes, you should use whatever metadata is applicable to your site and test it against the search engines/etc. you want to support to make sure that they are reading the metadata correctly.

  • To be fair, schema.org and Dublin Core say “when a property is named ‘title’ it means …” and “you can expect to find the following properties …”

    JSON-LD says: if you want to know whether the “title” property means the schema.org or the Dublin Core variant, you can find out which it is with <json-ld algorithm>.

    So you’d always use JSON-LD _with_ schema.org or some other vocabulary.
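As a rough illustration of the disambiguation described above, the sketch below resolves the same bare term, `title`, under two different contexts. `resolve_term` and both context dictionaries are assumptions made for this example; the real JSON-LD algorithm handles many more cases (scoped contexts, `@base`, remote contexts, and so on).

```python
DC_TERMS = "http://purl.org/dc/terms/"
SCHEMA = "https://schema.org/"


def resolve_term(term: str, context: dict) -> str:
    """Return the IRI a bare term maps to under a simplified context."""
    definition = context.get(term)
    if isinstance(definition, dict):  # expanded term definition: {"@id": ...}
        definition = definition.get("@id")
    if isinstance(definition, str):
        return definition  # the context defines this term explicitly
    return context.get("@vocab", "") + term  # fall back to the default vocabulary


# Same key, two contexts: one relies on @vocab, one maps "title" to Dublin Core.
schema_ctx = {"@vocab": SCHEMA}
dc_ctx = {"@vocab": SCHEMA, "title": DC_TERMS + "title"}

print(resolve_term("title", schema_ctx))  # https://schema.org/title
print(resolve_term("title", dc_ctx))      # http://purl.org/dc/terms/title
```

The lookup is purely mechanical: it says which IRI the key stands for, and the vocabulary behind that IRI says what the property means.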

IMO this is going overboard. Any time you are duplicating data from HTML into JSON-LD, consider just omitting that data from JSON-LD, unless the data isn’t consistently present in HTML (because it is a bitch to be consistent about this stuff).

I tried using RDFa and liked the property that it was theoretically less redundant, but switched to JSON-LD because JSON-LD is just easier to get working. And this is speaking as somebody who uses a hand-rolled static site generator. The issue here is that whether information is present in the raw HTML is something contextual, and if something isn’t present in the HTML then you need to put it somewhere else or it’s not mechanically parseable from the page. Like, to a human reader, a post on “Alice’s Blog” is assumed to be authored by Alice, so I may omit the “by Alice” text from the document, and then I would want to put that metadata in the page some other way.

Putting the metadata in JSON-LD lets me just be dumb about it. The metadata is always in JSON-LD, and the HTML may or may not contain an explicit representation of that same metadata. Easy.

But the JSON-LD does not need to contain the URL of the page (which is <link rel=canonical>) or the title (which is in <title>), for example.

  • > I tried using RDFa and liked the property that it was theoretically less redundant, but switched to JSON-LD because JSON-LD is just easier to get working.

    For me, it depends on the project. For personal projects, I tend to use RDFa; otherwise, JSON-LD.
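For the static-site-generator approach described in this sub-thread (metadata always in JSON-LD, minus what `<link rel=canonical>` and `<title>` already carry), a generator step might look roughly like the sketch below. `render_json_ld`, the `page` dictionary, and the choice of `BlogPosting` are all assumptions for illustration, not anyone's actual generator; the property names are ordinary schema.org ones.

```python
import json


def render_json_ld(meta: dict) -> str:
    """Render page metadata as a JSON-LD <script> element (hypothetical helper)."""
    doc = {"@context": "https://schema.org", "@type": "BlogPosting"}
    # Skip what the HTML already states elsewhere: canonical URL and title.
    doc.update(
        {k: v for k, v in meta.items() if k not in ("url", "title") and v is not None}
    )
    payload = json.dumps(doc, ensure_ascii=False, indent=2)
    # Escape "<" so a value containing "</script>" cannot end the element early.
    payload = payload.replace("<", "\\u003c")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


page = {
    "url": "https://example.com/posts/hello",  # already in <link rel=canonical>
    "title": "Hello",                          # already in <title>
    "author": {"@type": "Person", "name": "Alice"},
    "datePublished": "2024-01-01",
    "dateModified": "2024-01-02",
    "keywords": ["metadata", "json-ld"],
    "image": "https://example.com/thumb-1200x630.png",
    "description": "A short summary.",
}
print(render_json_ld(page))
```

The author is emitted on every page whether or not a "by Alice" line appears in the visible HTML, which is the "just be dumb about it" property the comment is after.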

I solved this by building Web Components out of them. Basically the HTML needs just a custom template tag, which includes a script with the JSON-LD payload. The component corresponding to the template initializes itself based on that data. See here for an example: https://releases.bruta.link/releases/2026/June/21

Granted, all of this is not for SEO purposes, but part of the ActivityPub ecosystem, which also uses JSON-LD for data encoding.
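Because the payload in that setup lives in a `<script>` inside the markup, any consumer can read it back without running the component. The sketch below shows one way to pull it out with Python's standard `html.parser`; the `PAGE` markup, the `release-card` id, and the payload are made up for illustration and are not taken from the linked site.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect the parsed contents of every JSON-LD <script> element."""

    def __init__(self) -> None:
        super().__init__()
        self.payloads: list = []
        self._chunks = None  # text chunks while inside a JSON-LD script

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._chunks = []

    def handle_data(self, data):
        if self._chunks is not None:
            self._chunks.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._chunks is not None:
            self.payloads.append(json.loads("".join(self._chunks)))
            self._chunks = None


# Hypothetical markup; element ids and payload are invented for this example.
PAGE = """
<template id="release-card">
  <script type="application/ld+json">
    {"@type": "Note", "name": "Example release"}
  </script>
</template>
"""

parser = JsonLdExtractor()
parser.feed(PAGE)
parser.close()
print(parser.payloads)
```

A browser-side component would do the equivalent with `JSON.parse` on the script's text content; the Python version is only here to show that the data is mechanically recoverable from the page.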