Things Markdown got wrong

6 years ago (swyx.io)

I wish that markdown formatted bold and italics like this:

  _italic_

  *bold*

instead of:

  _italic_

  *italic*

  __bold__

  **bold**

because stars around a word make it look like it's glowing, sort of like bold does, and underlining a word means to make it italic. Does no one remember this?

In English class, before computers, our teacher told us that certain words are always underlined: book titles, ship names, etc.

In Typing class, our teacher told us that if you submitted a typewritten manuscript to a publisher, what you underlined with your typewriter would be converted to italics in a typeset book. Underlining meant italics. And sure enough, nowadays when everyone has a computer instead of a typewriter, and can make real italics, they tell us that book titles, ship names, etc., ought to be in italics (the same things that my teacher used to tell us should be underlined) https://www.purchase.edu/editorial-style-guide/general-style...

I get kind of why Gruber did it his way. He was caught up in the Semantic Web, where you don't use <i> and <b>, you use <em> and <strong>, and <strong> means <em> only more so. So by that logic

  *this*

is emphatic, and

  **this**

is strongly emphatic.
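
As a rough illustration of that logic, here is a minimal Python sketch that maps single delimiters to <em> and double delimiters to <strong>. The name `render_emphasis` is made up for this example, and this is not how Gruber's Markdown.pl or any real parser works: it ignores nesting, escapes, and underscores inside words.

```python
import re


def render_emphasis(text: str) -> str:
    """Naive, hypothetical converter for the two cases discussed above."""
    # Handle double delimiters first so that **x** is not consumed as *x*.
    text = re.sub(r"(\*\*|__)(.+?)\1", r"<strong>\2</strong>", text)
    # Whatever single delimiters remain become plain emphasis.
    text = re.sub(r"(\*|_)(.+?)\1", r"<em>\2</em>", text)
    return text


print(render_emphasis("*this* and **this**"))
# prints: <em>this</em> and <strong>this</strong>
```

Note that the same function treats underscores and asterisks identically, which is exactly the redundancy being complained about.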

  • I'm enamored of the org-mode textual formatting convention: •bold︎•, /italic/, _underline_, =literal=, and ~strikethrough~. Although `literal` a la Markdown is arguably better, I prefer the other choices.

    Especially because org-mode lets you italicize //Face/Off// by adding more slashes, and so on for the others, like ==e = mc^2==.

    Incidentally, I have no idea how you're typing literal asterisks, I gave up and am kinda jealous.

  • Moreover, the convention that you advocate is more or less an established practice in Usenet and mailing lists, long preceding markdown. This design decision in markdown is uninformed.

  • Two minor notes:

    1. "Semantic web" refers to data being linked and having meaning. You're referring to semantic HTML.

    2. I don't think <strong> meant <em> but more so - as I understood it, <strong> meant text that had to stand out from the rest, whereas <em> meant that text was emphasised. Though I might be confusing it now with the retro-actively applied "semantics" of <b> and <i>.

    Totally agree with the overall point btw, and I do tend to use them in the way you described.

  • It's actually surprisingly difficult to get this kind of thing right. It seems perfectly obvious in hindsight, but while you're designing, little details like this get missed.

    When I started writing a spec for Concise Text Encoding [1], I figured it would take about 6 months to nail it down (yeah, right). Even now, 2 years later, I'm still making amendments to it because of various things I missed or got wrong. It's either something I happen to notice in one of my many, many re-reads, or something that another person notices after a few minutes reading it over, or something that comes up while I'm writing the reference implementation. Getting a spec right is a marathon affair.

    [1] https://github.com/kstenerud/concise-encoding/blob/master/ct...

    • It seems to be an instance of Wadler's Law.

      > In any language design, the total time spent discussing
      > a feature in this list is proportional to two raised to
      > the power of its position.
      > 0. Semantics
      > 1. Syntax
      > 2. Lexical syntax
      > 3. Lexical syntax of comments

      https://wiki.haskell.org/Wadler's_Law

  • Agree in general, but don't like the underline/italic thing. Most word processors have underline as separate from italic. Hence the alternative old email/Usenet convention: underscores for underline, slashes for italic.

    • Professional typesetters have an aversion to underlining text. It's ugly. The line interferes with descending letters: f, g, j, p, q, y. The underline is as thick as the stroke of a character, so decorative marks compete with signal marks. It's just too strong. That's why you see text underlined in homemade flyers but almost never in books, magazines, newspapers, or other things designed by pros. The recent addition to CSS to specify that underlines go behind the descenders is a welcome change.

      Professional web designers have the same aversion that print designers have, they are birds of a feather. But now they have another reason to hate it. Underlining has come to mean hyperlink. Because of their original aversion, they will often use CSS to remove that default underline, maybe bringing it back only when you hover over a link. But because it is still a widespread convention throughout the web, and probably always will be, they would never, ever emphasize something by underlining (they use border-bottom ;).

      Since you should underline something only if it is a link, which has its own markdown, and since underlining is ugly anyway, and since bold and italics are enough ammunition for all your emphasizing needs, and since it is an established convention from the old days that underlining meant eventual italics when it went to press, it's okay in my book to repurpose _underscores_ for italics.

      Slashes instead on both sides like /this/ --- hmm, it's not bad. I'm undecided.

  • Because calling it italic and bold is WYSIWYG all over again, something Markdown tried to cure.

    The proper way to think about it is as emphasis and strong emphasis. As a user you shouldn't be concerned with how it looks. That's the job of the overall design, and that might change depending on where you publish / who designed it. You shouldn't make that decision because you don't have the required context.

    Italic will probably be the emphasis (although other designs are possible, like a change of color). Underline can be used for strong emphasis if it doesn't collide with links. Bold might be too jarring or not possible to use, so the designer might decide not to use it.

    There are also other ways to emphasise, like small caps or inline background colors.

    • > As a user you shouldn't be concerned with how it looks. That's the job of the overall design, and that might change depending on where you publish / who designed it. You shouldn't make that decision because you don't have the required context.

      As the writer of Markdown content, you're often both the user of the content and the publisher of the content, so you do care about how it looks. Even if you're just the user, your intention when writing is to communicate something, and the eventual appearance of your content impacts how it is interpreted by the reader. So you still care about how your writing will look, which forces you to care about how the publishing process interprets the markdown syntax.


    • You have to choose one or the other, semantics or formatting. There's a standard way to format a book title: italics. You have to be able to say "format this with italics" or "this is a book title."

      A system that prefers semantic tags such as "emphasis" and bans direct formatting such as "italics" only works if you can import or define a semantic tag for "book title" that you know will be formatted correctly. It doesn't make sense to tag a book title with "emphasis," because there are a lot of different ways to express emphasis, and only one of them works (quite coincidentally) for a book title.


    • Everything you say is true. I started making web pages 20 years ago. When I got a real job doing it in the mid-2000s, it was at the height of the push for semantic markup. Over the years I've come to think it is overwrought.

      First, I think they could have repurposed the old tags. <i> can stand for "important," with a default style of italic, and <b> could mean "bold" in the sense of "strong" or "outstanding", https://www.etymonline.com/word/bold, with a default style of boldface type. You can still restyle them to your heart's content.

      Second, italic doesn't always mean emphatic. Like I said, it is also a formal convention for: titles (books, movies, magazines), ships, foreign words, legal cases, etc. (Only of large works. For articles and other small works, you quote them. So for example, The New York Times should be in italics, but any particular article, like "Chiefs Win Super Bowl", should be in quotes.)

      Third, the tags are longer, especially <strong> instead of <b>. It's noisy. A quibble, you might say. But so is this whole topic. And there is a line where coding goes from fun to tedious.

      Fourth, an asterisk looks stronger than an underscore anyway:

        *this*
      

      calls out more than

        _this_
      

      So you could say underscores are emphasis and asterisks are stronger emphasis, even with just one on each side.


> I'm not sure who invented [fenced code blocks], but I definitely know that it is widely used because GitHub Flavored Markdown made it so and I am grateful for that.

I’m pretty sure Vicent Martí (@vmg) was the originator of them. I remember we spent a number of weeks internally at GitHub debating the syntax, since no one really fell in love with any of the proposals for various reasons. Eventually the triple tick won out and that’s worked out fairly well, I think.

  • Man I hate those ticks. I have no idea what arcane key combination on my keyboard I need to do to get them. Maybe it's easier on a US keyboard.

    • It’s very easy on the US keyboard: the key for ticks is under the escape key, no shift needed, just hit this single key.

      I’m German myself, but I have used the US keyboard layout for more than 15 years now, because it is so much easier to type things like this... especially in tech. (And the keys for [, ], { and } are also much better placed in the US layout.)


    • The single biggest one-time leveling up I did as a programmer was switching to a US keyboard. All the various symbols are just so much easier to type (except, of course, numbers should probably be shift-typed). You can still use your local keyboard for text...


    • I always use triple tilde instead, i.e. ~~~language code ~~~ is the same as ```language code ``` (at least on GitHub)

      Depending on your keyboard it might be easier to type.

    • I'm not in the US and there's a single key that corresponds to a back-tick. In fact, it hasn't yet been an issue for me on keyboard layouts for four different languages (from three separate language families, no less)!

      Where is your keyboard from?


    • Wiki Creole in 2006 standardized many wiki engines on {{{ }}} code blocks. {{{#!python ...}}} for language or other extensions. All of OP's other points were also solved, at least in the most common implementation. Unfortunately people forgot about this IMO huge improvement.

    • My biggest problem with the ticks is that they're dead keys if you want to be able to write accented characters easily. I write à code sample` way too often.


    • Very easy on a US keyboard. It's just the top-left key, directly adjacent to '1' and below Escape. No modifiers (shift, whatever) required.

There are always these "Markdown got X wrong" posts and they seem to be very browser-centric.

You know how I use Markdown 99% of the time? Vim. I LOVE that it is simple and easy to read in a text editor. I spend a lot of my time in the terminal and that's WHY I use Markdown. If I wanted fancy things that display better in a browser, I'll use something else. I look at Markdown as great because they kept it simple. I can read it easily in vim and I can have some nice features in a browser (i.e. my GitHub repos don't look like they're from 1995. But they also don't look like a Geocities nightmare).

I use Markdown to document code. I use it to keep notes. I use it to track tasks. Etc. It is my digital pen and paper BECAUSE it is simple, because I can read it in the terminal just as easily as I can read it on the web.

Keep Markdown simple.

  • I think your mental accounting of saying you use Markdown 99% of the time in Vim is primarily when you alone are concerned.

    But consider also that you are "using" Markdown when you read Markdown written by others, in documentation, in a browser. A holistic accounting would probably move Markdown-in-a-browser higher than 1% of your time!

    • If it wasn't clear, let me be. Absurdly high numbers typically indicate exaggerations.

  • I also mostly edit Markdown in vim.

    I hate that I have no idea how it will render on Github or wherever without actually pushing a test commit out, or using Github's interactive UI + copypaste to preview.

    There's a bunch of command line tools and libraries implementing a variety of incompatible supersets of markdown.

    Anyway, you can write Markdown-style plaintext documents without trying to conform to Markdown. The point of markdown is to be able to render those straightforward documents as pretty web pages without the degree of uglifying and boilerplate markup that HTML requires.

  • I also use markdown in vim a lot, and some of these "things it got wrong" make it worse, because you end up having to use html to do things like tables, or numbered lists aren't actually numbered in order, etc.

    I'm actually surprised the absence of tables didn't make this list.

What this fails to capture is the original motivation behind the design of Markdown. It was based on things people were already doing to add "markup" to plaintext things like emails. It was designed to just werk to convert your normal email-esque writing style into HTML. This article considers Markdown from the opposite perspective to the one for which it was designed.

In other words, Markdown is not designed to be a simplified markup language for HTML. It's designed to make HTML a frontend for plaintext. The numbered lists with *., for example, would be totally unreadable in plaintext without context.

Org is a way better naturalistic markup format, with way better tooling, but it's hampered by being emacs-only and having weird devotees who like to sell it as some kind of task planning tool when really it's a naturalistic markup format.

I've never liked emacs, but I keep it around for Org, which I use most often as a LaTeX generator without all the ceremony of actual LaTeX.

  • I also love org but it is no replacement for Markdown exactly for the reason you mentioned. Emacs as a way of life is a no go for most developers.

    I may love Emacs but I want to be able to edit my project with something else if need be (IntelliJ IDEA?) and I definitely don't want to force my users to any particular IDE. Making my project only really accessible to Emacs users just to be able to work with document files seems like a very good idea if I want to discourage them from ever looking at it.

    • There's nothing at all tying the org-mode format to emacs. It's a plain text format which any editor supports and syntax highlighting would be just as simple as for markdown.


My firm belief is that if Markdown had never been adopted, we would have a standard way of formatting text by now supporting these 10 basic formatting types: headings, paragraphs, bold, italic, strike-through, monospace, bullets, links, images and horizontal rules. That's it. Just 10 basic formatting options.

There's no reason why, in 2020, these aren't available everywhere. Just like every text editor can understand emojis (with colors even!), these simple formatting rules should be baked into every platform by now, and easily used on everything from PCs to phones to TVs, in every editor from Vim in a terminal to email to iMessage to Wikipedia's text entry box. Just like emojis are. It's truly insane we haven't standardized this yet. How old is RTF or PDF? Doing this basic formatting in ascii text is just mind-numbingly stupid.

  • At the time when Markdown was created, there were a hundred or two wiki and CMS systems each accepting a different syntax.

    It seems to me that md won out because SO and reddit were using it.

    I remember that I was partial to Creole when I was looking at lightweight markup languages.

    If you're interested in a md-like language with more features have a look at pandoc's markdown.

  • > Doing this basic formatting in ascii text is just mind-numbingly stupid.

    Nope, ascii more or less represents your keyboard. Using ascii as user input has zero dependencies. Are you really suggesting some binary encoding for formatting, that is only possible with the respective tooling? I see only disadvantages in that.

  • I tend to agree with your line of reasoning and wanted to write a blogpost about it eventually.

    I implemented a "markdown-like" parser, and I noticed the biggest problem is that the edge cases are a mess, the notation is a mess, and doing efficient single-pass parsing is unnecessarily complicated (urls are also an unnecessarily complicated element that hits a parser of this kind, but that's a story for another day).

    I generally agree that making special unicode characters to represent these things could go a long way into making everything easier to understand, parse, disambiguate.

    To be fair, this could be tested without insane effort: unicode has a range of characters reserved for private use. Adapt a font and make a simple online editor where holding some key will make javascript generate the appropriate unicode character. We should just try it and see if it's really better.

    If the concept works out well, same as we have modifier keys for alt, shift, ctrl, etc., we could use that and integrate in keyboards without much trouble. Of course, it's a big change, but I think it's being proved that there's a common set of markup needs that most platforms should handle.

    And many might say: well, it's not like everyone needs to implement markdown, we can have a single implementation and re-use it. It's not such a big deal. Well, I'd argue that the right approach is make things as simple as they can be. This kind of markup is necessary, but we can make it much simpler (now someone might say that pushing more things into unicode is not a good idea, as unicode is far from as simple as possible and perfect, but that shall be discussed another day too).

  • > Doing this basic formatting in ascii text is just mind-numbingly stupid.

    People have been typesetting research papers in LaTeX for a while, and that's not much different. The beauty of plaintext is that it is easy to both read and author, and almost any tool can do the job.

    > Doing this basic formatting in ascii text is just mind-numbingly stupid.

    We write programs in text, and they're capable of exhibiting an unlimited assortment of behaviors.

    Using a wysiwyg tool is burdensome, inaccessible, hard to automate, and doesn't integrate with the myriad of other tools we have available.

  • Windows and macOS do carry italics, strike-throughs, and the like through copy and paste. I think it's the tyranny of the teletype emulator that keeps us doing this. After all, why shouldn't we issue textual commands to a computer in a more sane environment?

    • > Windows and macOS do carry italics, strike-throughs, and the like through copy and paste.

      There is no magic, they do it by literally exchanging HTML on the clipboard. Applications use it as a lingua franca and convert the text into whatever form they use internally.


  • Markdown supports all of those things and is the standard. So I don't understand what you mean by "if Markdown had never been adopted, we would have a standard..."

  • > Doing this basic formatting in ascii text is just mind-numbingly stupid.

    It might be today, but many of us who are older used Markdown-like formatting even before Markdown was even a thing, because on BBSes and other applications, everything in those days was plain text.

    I don't know Gruber's intentions when he created Markdown, but my guess is he created it for himself first, and then other people adopted it because they think similarly.

  • Emojis are fundamentally different, as they're essentially just codepoints being displayed. That's fairly easy to implement, as every system has been doing that for years. What you're proposing would require massive rewrites of a lot of software.

    I'm also not sure it would even be better; I've never encountered a graphical editor that didn't make me want to throw my computer out the window; stuff like text becoming bold when I don't want it, adding to a list (especially with copy/paste), etc. all tends to be quite annoying and stuff I need to think about. With Markdown, I don't really need to think about the formatting, not more than regular typographic formatting anyway (paragraphs, punctuation, etc.)

  • Engineers like code and dislike fuzzy things with hidden edge cases. When Slack tried to switch from markdown to a UI engineers complained loudly. There's no reason to believe it'd have been any different had one of the fifty other formats that existed won out.

    From a practical point of view, markdown you can diff, you can copy it around without edge cases, etc. There's a reason people use Latex rather than Word which is evident the first time you have to spend 30 minutes fighting Word's formatting edge cases.

Prefer Asciidoc over Markdown. Easier to remember syntax, richer set of controls and, more importantly, better tables. Tables in Markdown are a nightmare. Asciidoc is a lifesaver for tables. For big tables, columns can be specified as rows, leading to sane reading.

  • I like asciidoc because latex math is part of the standard and not something you may or may not get like with markdown. There are a couple of other nice things (like tables) too.

    I think the complexity of asciidoc is probably why there are only a couple implementations though, which is a real bummer.

  • IMO Asciidoc tables can get messy quite soon, to the point that I'll just use HTML.

  • Multimarkdown tables are fairly straightforward, no? From a blog post about the linear typewriters last year, a simple table (unformatted because I'm lazy.)

        input file | my score | article score | ratio
        -----------|----------|---------------|------
        stripped.txt | 5262321 | 5499341 | 95.7%
        s2.txt | 5510008 | 5499341 | 100.2%
    

    vs

        .Table Scores
        |===
        | input file | my score | article score | ratio
    
        | stripped.txt
        | 5262321
        | 5499341
        | 95.7%
    
        | s2.txt
        | 5510008
        | 5499341
        | 100.2%
        |===
    

    And that's just two lines of four columns - I've got blog posts with 20 lines of 5 columns. It would be heartbreaking to type that in.

  • prefer asciidoc because it's a well defined standard and is just as simple to use as markdown.

    and yes - tables are much nicer in asciidoc

Honestly, the single biggest problem with Markdown is that it's been implemented in so many different ways by so many different people. This is largely because a lot of corner cases were simply not defined.

When a website uses Markdown for input, it's extremely tough to be sure of how to format certain things, especially when multiple different syntaxes combine. (I'm thinking of things like list+code block; or bold+italics, or list+line break, etc.)

  • Markdown has similarities with CSV which is also underspecified and has varying implementations. Even simple things like quoting don't work consistently across applications.

    That has not stopped countless applications from using CSV because it's so useful to have a simple transfer format. CSV corner cases usually don't hurt too badly because you can usually rule them out based on the source or sink of the data.

And one thing that this article gets very wrong about Markdown: It was not designed to be a structured markup syntax of any sort. The syntax, in fact, was never actually designed as such.

Markdown originated as, basically, one guy's attempt to convert instances of unofficial text-formatting conventions, commonly used in text-only communication such as emails and Usenet, into HTML that can be presented on the Web. In order to do that he had to enforce some standards, while still leaving some flexibility (which is why the top levels of headings have more than one form; both were commonly used to designate the same thing).

It was only after companies like Stack Exchange and GitHub started using Markdown as, ironically, a markup language, that there were any attempts to enforce a standard. I clearly remember a whole online "war", for lack of a better term, arguing the pros and cons of standardisation.

About the header slugs: I don't think the content of the header should be the anchor ID. Readers will link to it, the author will fix a spelling issue or whatever, and the link will break and other readers will arrive at the top of the page when the browser doesn't find the old ID.

My solution that works with most Markdown variants is to use inline HTML to define an anchor directly before (or after) the header.

  <a id="stuff"></a>
  ### Things about stuff

But of course this isn't much cleaner than just using inline HTML for the header itself.

  <h3 id="stuff">Things about stuff</h3>
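
To make the breakage concrete, here is a small Python sketch of content-derived anchor IDs. The slug rule in `slugify` is an assumption for illustration only; each Markdown renderer has its own rules, and none of them is being reproduced here.

```python
import re


def slugify(heading: str) -> str:
    """Hypothetical slug rule: lowercase, drop punctuation, hyphenate whitespace."""
    slug = heading.strip().lower()
    slug = re.sub(r"[^\w\s-]", "", slug)  # drop punctuation
    return re.sub(r"\s+", "-", slug)      # turn runs of whitespace into hyphens


old_id = slugify("Things about stuf")    # heading as first published, with a typo
new_id = slugify("Things about stuff")   # heading after the author fixes the typo

print(old_id, new_id, old_id == new_id)
# prints: things-about-stuf things-about-stuff False
```

Any link that was shared with the old ID stops resolving after the fix, whereas an explicit id="stuff" anchor survives the edit.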

  • as an author i'd say it's rare to change headers, while it's easy for the user to guess the right link if the id is broken.

    careful not to penalize the 99% usecase for a slightly better 1% usecase.

Shameless plug: I do 99.98% of my work in a terminal, and I'd much rather read markdown like a man page. I created a little utility that takes Markdown (actually, any kind of file that `pandoc` can read) and displays it nicely formatted. You can see it here:

https://github.com/ashton314/marked-man

Hope someone might find it useful. :)

  • I use a variation of the shell-pipeline you use in your `mm` script. I don't know how to do code blocks here, so I'm pasting the un-commented one-liner

    mdv () { pandoc -s -t man ${1:-"-"} |groff -T utf8 -man | sed 1,4d | head -n -4 |${PAGER:-$(DN=/dev/null; which less &>$DN && { echo "less -FRSEX"; }|| which more 2>$DN || echo cat)} ; }

    from my answer here: https://stackoverflow.com/a/61029131/5208540

    This is a shell function to add to your environment

> Code Blocks probably aren't necessary

When you're reading raw markdown they're extremely useful for short snippets since they save you two wasted lines per block.

  • But these short snippets are useless because you can't easily copy paste them, especially with some programming languages that are whitespace-significant (looking at you, Python). And I don't understand how triple ticks are wasted lines, because you need to surround a code block with empty lines anyway while triple ticks work straight after line break.

    • If you're using your clipboard, it's invariably going to be at the wrong level of indentation regardless. What's the correct level of indentation? 0? 1? more? There's no generally correct answer, so you're going to have to use the block indent feature on your editor anyway.

Why do these articles always pretend to be so know-it-all? I really don't like the explicit statement that "Markdown got it wrong", leaving no room for debate. For everything that Gruber supposedly "got wrong", a lot of arguments can be made for why Gruber got it "right"; the author often explains that himself. That means that a lot of this is subjective. Let's keep the facts and opinion separate.

  • > Markdown got it wrong

    You're right. The problem with the statement is that there's a presumption Markdown was created for the world at large, and I think it was originally designed for one "customer", who also happened to be the creator. Which means Markdown got it right.

  • It's also weird to say "this is wrong" about missing features that there are widely-used extensions for.

Hmmm,

* Markdown in HTML. NO!!! That would drive me crazy. I use HTML all the time for diagrams. If I had to be aware of every tiny place where my content in the diagram was going to get munged by the markdown parser I'd tear my hair out.

* * vs - I've used * my entire life for lists, decades before markdown existed. Never heard of using - for lists until yaml

* auto numbered lists. NO!

The problem is it's nearly impossible to keep your formatting right if you have long list items. I often write answers on s.o. where it's like "You have 3 issues. 1. several paragraphs 2. several more paragraphs and code samples." It would be absolute hell to have to figure out the formatting to make sure auto numbered item 2 stayed 2 and didn't become 1 in a new list.

* code block indentation - agreed that code fences are generally better. Worse, most places I enter markdown (stackoverflow) don't have any indenting/outdenting editor controls, which makes it really painful. Either I had to manually indent 5 to 50 lines or else edit outside stack overflow and paste in. I know stackoverflow has Ctrl-K but it doesn't work once you take into account the list indentation issues.

* no syntax for adding classes

I don't mind that. It kind of feels like part of the point. If I need a class I use embedded html, although that's another place to rant. Inline HTML uses markdown whereas self-contained HTML does not. I'd prefer they both didn't. In other words, if you put

    The big <span style="color: red">*bear*</span>

you'll get an italic red 'bear' but if you put

    <div>The big <span style="color: red">*bear*</span></div>

You'll get red '<asterisk>bear</asterisk>'

Ok, I give up. no idea how to put an asterisk on HN. >:(

As mentioned above I'd have preferred no markdown in HTML as it's bitten me quite often trying to color code variables in a math description that uses * and having the * get eaten by being parsed.

* Ids - The id thing does bug me too. Markdown generates ids based on headlines but that's way too brittle. Edit the headline and your ids break. I also wish auto-numbered footnotes were a standard feature that worked by id so I could do something like [-fn-](#someid) and later a #fn-someid paragraph, and it would insert [<num>] at the top and link to #fn-someid at the bottom.

  • > * auto numbered lists. NO!

    > The problem is it's nearly impossible to keep your formatting right if you have long list items. I often write answers on s.o. where it's like "You have 3 issues. 1. several paragraphs 2. several more paragraphs and code samples." It would be absolute hell to have to figure out the formatting to make sure auto numbered item 2 stayed 2 and didn't become 1 in a new list.

    I think you and the author agree on this point: Markdown does have auto-numbered lists, and the author of the blog post believes that was a design mistake. If you write a Stack Overflow post with a list numbered 3, 2, 1, it will be rendered as 3, 4, 5.
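
A minimal Python sketch of the renumbering behaviour described in the reply above, assuming the renderer keeps only the first item's number and counts up from there (other Markdown implementations may instead always start at 1). `rendered_numbers` is a hypothetical helper, not part of any Markdown library.

```python
import re

# Matches an ordered-list item such as "3. some text".
ITEM = re.compile(r"^(\d+)\.\s+(.*)$")


def rendered_numbers(lines):
    """Return the numbers shown for consecutive list items, assuming only
    the first item's source number is honoured."""
    numbers = []
    for line in lines:
        match = ITEM.match(line)
        if not match:
            continue  # not a list item; ignored in this sketch
        if not numbers:
            numbers.append(int(match.group(1)))  # first item sets the start
        else:
            numbers.append(numbers[-1] + 1)      # later source numbers are ignored
    return numbers


print(rendered_numbers(["3. first", "2. second", "1. third"]))
# prints: [3, 4, 5]
```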

Markdown drew its influences from predecessor structured text languages whose goal was minimum formatting of ASCII text in a structured way such that it could be parsed and re-rendered into any other format.

The goal was simplicity and readability in plain-text email and Usenet posts so I'm not surprised that fifteen years from its inception, and thirty or more years on from some of its antecedents, we're using text and thinking about how we use text differently. One of Markdown's pivots from setext was the addition of code blocks. In the environment that setext was created, a code block would simply be rendered much like everything else: in the plain and probably fixed-width display that your email was also displayed in. By 2004, email had rich formatting, Usenet was on the wane, and people still needed a lightweight text format to share. Markdown filled that need, as did a number of other similar alternatives, though I forget their names.

My first exposure to such lightly-formatted ASCII was setext [1] which was the format chosen for the Mac- and Apple-oriented TidBITS email newsletter [2]. As a contributor to that newsletter, Gruber would have been intimately aware of the format.

Twenty years ago, I used setext in code projects basically the same way as nearly everyone uses Markdown now. I like OP's suggestions though I've always liked the 'lazy' approach to ordered lists.

[1] setext: https://en.wikipedia.org/wiki/Setext

[2] TidBITS: https://tidbits.com

i actually disagreed with a lot of these. markdown is opinionated. the feature set is limited. the output looks almost all about the same. this is the goal.

no one wants the myspace outcome of simplifying messages. you usually don’t need much styling but the tools that markdown includes are good enough. if you need more you probably want to ask why you’re writing in a place that only accepts markdown

  • > markdown is opinionated.

    This. If I were to guess, Gruber made Markdown to make writing for his blog easier. Then he made it available for whoever wanted to use it too, which was very nice of him.

    Given how there are already multiple flavors of Markdown available that address various shortcomings, I don't know if there's a point to critiquing Gruber's original version of Markdown since it was likely created to satisfy his writing requirements first and foremost.

  • So are you saying that you disagree with the current “html as superset” version of markdown? Or are you saying that it’s good that he didn’t go all the way and support markdown-in-html-in-markdown because the halfway point makes it perfectly half usable for the point of having html in the first place?

    Markdown is great but there are some minor points where it fails spectacularly with only minor tweaks needed to improve it tremendously. Also, code ticks are at this point considered by most to be part of “standard” markdown.

You create line breaks by adding two spaces at the end of lines. MD has invisible syntax. To me this without doubt is the worst thing about MD.
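A rough Python sketch of why this is so easy to miss: the only difference between a soft wrap and a hard break is whitespace you cannot see. `hard_breaks` and `reveal` are hypothetical helpers, and the rule is simplified (it ignores code blocks and the backslash form of a hard break).

```python
def hard_breaks(text):
    """Return 1-based line numbers that end in a Markdown hard line break.

    A line counts when it ends with two or more spaces and is followed by
    another non-blank line, i.e. it is not the last line of its paragraph.
    """
    lines = text.split("\n")
    found = []
    for i, line in enumerate(lines[:-1]):
        nxt = lines[i + 1]
        if line.endswith("  ") and line.strip() and nxt.strip():
            found.append(i + 1)
    return found

def reveal(text):
    """Make trailing spaces visible by replacing each with a middle dot."""
    out = []
    for line in text.split("\n"):
        body = line.rstrip(" ")
        out.append(body + "·" * (len(line) - len(body)))
    return "\n".join(out)

sample = "Roses are red,  \nviolets are blue.\n\nNo break here \nat all."
print(hard_breaks(sample))  # [1]
print(reveal(sample))
```

Line 1 ends in two spaces and gets a break; line 4 ends in one space and does not. In most editors the two look identical.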

Things it did right: It was the right tool at the right time. Everyone knows it and can use it. It's like Javascript for documentation.

> There's nothing I hate more than reading a good article with headers, wanting to link someone directly to the section relevant to them, and then not having an id to use to send them straight there.

Lots of other things I hate more than this, but dagnabit this drives me nuts too!

The second you want to do code and images on a page I just spend all my time fighting whatever markdown workflow is in place. Which would be fine if it added anything but they never just work, you still need to do your own image optimisation and aligning images or custom layout sections is terrible or not possible.

Markdown is harder to work with in the long term than just writing HTML with Emmet. That was my opinion a few years ago and these days I'd double down on it as editors are so good.

Tables suck in everything so we should standardise a json and/or csv import language to drop tables in at build time anyway.

Markdown is good for notes. Even mermaid diagrams are kind of unpleasant to write.

  • Yes. At times I would like to be able to use italics in a code block to emphasize some portion of the code, and that seems very difficult, if not impossible, to do.

    I realize this would conflict with automatic syntax highlighting, which is probably why it is not allowed.

Some of those things are addressed in my favourite markdown variant - multimarkdown https://fletcher.github.io/MultiMarkdown-6/MMD_Users_Guide.h...

The author has found a sweet spot between being very conservative with the original declarations and adding the missing functionality. It also comes with an amazing lightweight implementation written in C that behaves correctly: gets input from stdin or a file and passes it to stdout, without the -o flags that (e.g.) pandoc is using.

I'm currently working on a markdown system where I wanted to extend codeblocks to be able to incorporate more information. The `js:title=example.js` just doesn't cut it. What I landed on was some sort of yaml-ish format:

    ```
    ***
    property1: ...
    property2: ...
    ***
    code
    ```

This way multiline properties are supported and I can also incorporate markdown inside the properties themselves.
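For anyone curious what consuming such a block could look like, here is a small Python sketch. It is only a guess at the shape of the format shown above, not the actual implementation: `parse_block` is a made-up name, and it assumes one `key: value` pair per line between two `***` lines, so the multiline properties mentioned above are deliberately left out.

```python
def parse_block(body):
    """Split a code-block body into (properties, code).

    Assumed format (hypothetical): an optional header delimited by two
    lines containing only "***", holding one "key: value" pair per line,
    followed by the code itself.
    """
    lines = body.split("\n")
    if lines[0].strip() != "***":
        return {}, body  # no header: everything is code
    try:
        end = next(i for i in range(1, len(lines)) if lines[i].strip() == "***")
    except StopIteration:
        return {}, body  # unterminated header: treat it all as code
    props = {}
    for line in lines[1:end]:
        key, sep, value = line.partition(":")
        if sep:  # silently skip header lines without a colon
            props[key.strip()] = value.strip()
    return props, "\n".join(lines[end + 1:])

props, code = parse_block("***\ntitle: example.js\nlang: js\n***\nconsole.log(1)")
print(props)  # {'title': 'example.js', 'lang': 'js'}
print(code)   # console.log(1)
```

Falling back to "it's all code" when the header is missing or unterminated keeps ordinary code blocks working unchanged.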

I was turned on to ReST by a coworker and prefer it over Markdown. Aesthetically ReST code seems easier to read in raw form; and it’s somewhat more predictable in what works and what doesn’t because bad reST code breaks.

reStructuredText on these, for anyone that’s curious or thinking about it:

1. If you use “5.”, it’ll give you a five. If you want auto-numbering, use “#.” as your prefix. (While I’m thinking of it: I hate that using actual Unicode bullets like “•” doesn’t work in Markdown, and that in https://talk.commonmark.org/t/unicode-character-bullet-u-202... it’s been actively rejected for CommonMark.)

2. Code blocks are by indentation only, no fencing, but avoids confusion about indentation levels by defining all of that stuff properly and having it match visual usage, rather than Markdown’s crazy “four spaces, but in certain situations we’ll let you get by with 3, 2, 1 or even 0—but if you then seek to nest, remember to bump it up to eight rather than just N+4”. Code blocks are indicated by `::` at the end of or as the entire previous paragraph, which is esoteric, but also fits in very well with its directive syntax, which is how you can attach extra metadata to code blocks. It’s definitely not as simple as the newer Markdown fencing approach, though it fits into the rest of reStructuredText in a sane and principled way so that it’s not at all hard. Still, I think that a new version of reStructuredText would quite possibly include fencing in some form—or maybe directives would be resyntaxed to be able to be fenced rather than indented, though achieving that while still allowing nesting would take care.

3. Markdown in HTML in Markdown? That Markdown extends HTML is the real problem here, and is both its greatest strength and help in adoption, and its catastrophic weakness. reStructuredText is a language of its own that doesn’t extend anything, so that style of HTML output is typically done by defining new directives, which can then contain reStructuredText. It’s strictly less powerful, but it composes way better. As a workaround, the `raw` directive lets you put in arbitrary HTML (or LaTeX, &c.).

4. By deliberately not being HTML, reStructuredText handles this sort of thing much better. Each node in the document can have classes (which will end up as class="…" in HTML, something else in LaTeX, &c.). You can do things like `.. class:: …` to apply a class to the next block (heading, paragraph, list, &c.). Most directives support a :class: option. You can define new roles so that :classname:`…` will do what you desire. And none of this is then tied to HTML.

5. reStructuredText lets you define anchors at any place in the document (which can also be used for index entries when writing books or things like that). `.. _anchor-name:` above the heading will do it thus. This protects you against the anchor changing (and thus breaking links) if you change the text of the heading. Any form of automatic heading ID generation is left to the software to decide, e.g. Sphinx generates IDs by default.

6. Field lists. A field list at the start of the document is treated as document bibliographic data, and you can put anything in there you like.
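To illustrate point 5, a minimal Python sketch of why IDs derived from heading text are brittle and explicit anchors are not. The slug rules here are an assumption for illustration (every tool has its own), and `slugify` and `heading_id` are hypothetical names.

```python
import re

def slugify(heading):
    """Derive an anchor ID from heading text (hypothetical rule set)."""
    slug = heading.strip().lower()
    slug = re.sub(r"[^a-z0-9\s-]", "", slug)  # drop punctuation
    return re.sub(r"[\s-]+", "-", slug).strip("-")

def heading_id(text, explicit_id=None):
    """Prefer an author-supplied anchor; fall back to the generated slug."""
    return explicit_id or slugify(text)

# Generated IDs follow the text, so editing the heading breaks old links.
print(heading_id("Auto numbered lists"))                # auto-numbered-lists
print(heading_id("Auto-numbered lists are a mistake"))  # auto-numbered-lists-are-a-mistake

# An explicit anchor survives the same edit.
print(heading_id("Auto numbered lists", "lists"))                # lists
print(heading_id("Auto-numbered lists are a mistake", "lists"))  # lists
```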

I myself prefer reStructuredText in almost all regards, but I scarcely use it any more other than for a couple of types of personal documents, because of the ubiquity of Markdown. It was similar with Mercurial and Git.

  • Agree. reStructuredText has much stronger concepts. Composable. A standard way to extend semantics without requiring new syntax. Not tied to HTML. It has some slightly more superficial weaknesses that made it the loser I guess: ugly / awkward / inconvenient syntax; no atx headers; not every document is "valid"; HTML is less integrated... worse-is-better strikes again.

  • It's weird how both ReST and Markdown stumbled on that indentation thing. And ReST has the awkward header underlines, too. Both of those are annoying at the best of times and basically unusable with proportional fonts (and we're dealing with prose here, where there should be no reason to force people to use monospaced fonts if they don't want to.)

    If I'm ever involved in designing a syntax like that, the first hard and fast rule will be that the user should never have to count anything, especially in relation to anything else.

    • reStructuredText is very committed to the plain text visually matching the formatted. For headings, visual underlining makes sense in that context. And code is conventionally indented, so it also makes a lot of sense. (Blockquotes are indented as well, differing from code blocks by the absence of `::` at the end of the preceding paragraph; indentation again matches conventional appearance. But that’s a place where I get where Markdown’s coming from, using > indentation like in emails.)

      In more recent times, convenience of writing has become a more important concern to people, because these formats have shifted from niche use by dedicated people in real text editors that would like what they see to match the end result fairly well, to mainstream use in textareas and similar, and sometimes even WYSIWYG editors. That’s what’s driven people to prefer the convenience of code fencing, because indenting each line in a textarea is a pain. Ditto on headings. If reStructuredText were being redone now, I think it’s fair to say that prefix rather than underlined headers would be at least an option. It would just remain to be seen whether they went with `###` meaning level 3 even if there was no `##` or `#`, or whether they’d boost it up to level 1 or title, as appropriate.

I can't stand the invisible syntax like a double trailing space is some valid syntax.

TL;DR - "Markdown is too simple for me, I'd like it to be more nuanced"

I'm not saying that to be cheeky. The article can be boiled down to that because the author is looking for HTML features in Markdown, when there's very little need. e.g. we don't need classes in markdown and we don't need IDs or `name` attribute specifiers because that can be accomplished already by mixing in the HTML that one needs.

I agree with some of this (i.e. the numbers), but I think it's coming from a place of not understanding what Markdown is for. For example, your complaint about indentation for code blocks makes no sense if you're using Markdown appropriately.

The original idea was to take emergent syntax used by text file authors as cues for humans reading text files, and use that as a syntax for generating HTML, so you could use text files as the source of truth for generating HTML.

Backticks aren't an emergent syntax used by text file authors as cues for humans reading text files--that's a syntax directly intended for generating code blocks in HTML, with little semantic value if the markdown is going to be consumed as text.

If your goal is to have something consumable as both text and HTML, then some sacrifices have to be made, because text simply can't do everything that HTML can. The error-prone indentation syntax is just a compromise, but there were features that were left out completely. The original markdown didn't contain underline or strikethrough, for example, because text can't do those things.

The problem is that for a few years, this idea turned out to be useful as a way of allowing users rich text editing capability on websites. So people wanted Markdown to do everything that HTML could do, hence we get underline, strikethrough, code fences, etc. This hurts use cases where markdown is used for its original purpose, because modern markdown isn't as nicely consumable as text files, but users didn't care because they weren't using Markdown to be consumable as text files. And this made sense for a while.

And where this becomes stupid is that if you don't care about it looking nice as text, then you shouldn't be using Markdown. There are a wealth of tools for rich text editing on the web (TinyMCE being the most obvious choice, but not the only one available) and they're FAR superior in user-friendliness to Markdown. Many actually support Markdown, but there's less and less reason to even expose that functionality to your users. If you don't care about how your Markdown looks as text, then just use HTML and edit it with a rich text editor.

There's another use case where you want to compose HTML content without the overhead of actually writing HTML, but you want to "do it like a coder", so using a rich text editor isn't appropriate because you want to be able to diff and whatnot. But in that case, there's still no reason to be using markdown, because if you're "doing it like a coder" then you can just write code, which makes what you're doing much clearer and gives you a lot more power. Textile, Org-Mode, LaTeX are all better options.

GitHub has persisted in using Markdown well past the advent of mature web text editors, because they still seem to be operating under the illusion that nontechnical users will become a significant part of their user base. Slack and Reddit are both moving toward fully rich text, and the fact that they are still trying to expose their Markdown roots just makes their rich text painful to use. MediaWiki is still using its markup language, probably because they are blocked by the enormously difficult task of migrating an enormous amount of content, but if you're starting a new project, you don't have any of the tradeoffs that these organizations do, so you shouldn't make the mistake of following in their footsteps.

Kinda on-topic (50/50): I'm looking for a flavor of something-like-markdown, but it would have semantic tagging and some abilities to add my own attributes to an element/tag (like for links, I want to be able to specify some to open in another tab, but others to open in the same tab, I'm willing to do that with raw attributes if I can add them on) for conversion to HTML in a static generator, is there something like that anyone knows about? (I prefer not to "roll my own" if I can use a standards-based language)

For instance, I'd want to do the example in the GitHub flavored markdown with tags like <article> or <aside> around markdown-like syntax. Markdown is so easy compared to HTML, so I want that ease in my article-writing workflow.

I was considering using HTML directly in a templating language, but I want it to be a more human-friendly text format, like Markdown.

People put too much behind Markdown is what happened. It was just a random idea on a blog post. But people made it a universal authoring standard and it is just not up to the task.

Markdown has been a godsend.

It’s just what I wanted. Something text based, where I can sorta control some formatting, but isn’t tag heavy like HTML.

I just need it to format a few things, to make it easier for myself to consume later.

The most useful for me is actually the syntax highlighted code blocks.

I use it mostly for brain storming, planning, and scratch work.

What was even the motivation in the first place to not respect newlines?

  • I assume you are talking about the fact that adjacent lines are merged into a single paragraph.

    Have you written traditional plain text emails or any other plain text documents? You almost always hard wrap your lines for readability. Even in HTML, newlines are mostly equivalent to just spaces.

  • If the first word after the newline wouldn’t fit on the last line before the newline, you can’t see whether there is a newline.

    Also, since you don’t know window width, that can happen even if, on your terminal, the last line of a paragraph is only 3 characters and the first word of the next is “A”.

Hot take (maybe): GitHub flavored markdown is the ideal markup language. If the author’s ideas were actually better, we’d use them instead.

It supports code well, it supports non-enumerated lists well, it supports enumerated lists well (if you don’t want to view it in rendered markdown, you can view the raw and just disregard the numbers, understanding that the items are supposed to be sequential). It supports links well. It supports images well. It’s easily readable, even with no prior context (@toml, yaml, etc.).

It does whatever a static website could want, and it does it all decently well.

  • People tend to use whatever they have to use to be on a platform since these different parsers aren't portable. Everyone is on GH for the code hosting, getting a markdown editor that works is a bonus. If you go to any other community that uses a different markdown standard or even something that's not markdown, people will be using that - not because it's better than GH flavored markdown, but because that's what's available.

    • Github supports reStructuredText just fine. You can even display images in Readme without any extension, just a plain image directive.

    • Not true in the slightest. I used for periods of my life StackOverflow, Jira/Confluence, and bbcode exclusively and hated interacting with each of their markup languages.

  • Yes the author praises GitHub flavored markdown. The criticisms presented are on gaps in the original markdown design, not implementations.

    • author here. yup. tried to make that clear by constantly referring to Gruber's spec. A few years ago I didn't even know there were different flavors of markdown. Fortunately/Unfortunately Gruber's is the one that stuck and is universally accepted by all markdown tooling, hence it is now the lowest common denominator we must work with.

  • I think we're using it because GitHub renders GitHub flavoured markdown well.

    Thankfully it also renders Org files pretty well, which AFAIK supports everything listed and more.