Comment by ordu

1 day ago

So many good words, but they all miss the crucial point: you can't write a parser for org-mode. So an elisp interpreter is needed to run the lisp code that defines it. It means that org-mode can be good while you are using it from emacs, and it sucks for anything else.

I use markdown now, because you have a lot of tools to deal with markdown, while all tools for org-mode are bound to emacs. Which perfectly fits the emacs philosophy of emacs being an operating system, but it is not for me. It was fun 20 years ago, but now when I think of tinkering with emacs configuration for hours to get anything done, I feel an impulse to run away.

There are incomplete parsers that cover most of the Org basics. For example, GitHub has one, crafted in Ruby. They use it to render e.g. readme.org files in repositories. It works quite well. I find the Org format very pleasant to work with.

I think the trick with Emacs and Org is to stick to the basics and then only add features or change your configuration very slowly, as needed. I have been using Emacs non-stop for >20 years and my .emacs is just 20 LOC. It's been shrinking, not growing. My goal is to bring it down towards 0 LOC. I have committed a few things upstream to modernize defaults.

Personally, I think the reputation of Org, Emacs, or Nix being hard and complex is undeserved. It's rather a documentation problem. There's no simple documentation to onboard newcomers and show them the basics in a principled way. So it looks like a mess, but it isn't.

  • > I have been using Emacs non-stop for >20 years and my .emacs is just 20 LOC. It's been shrinking, not growing.

    Me too. I mean I'm using Emacs too, and it has been 20+ years. I hate it deeply, and I cannot stop hating it because I cannot get away from it. I deeply regret choosing emacs 20+ years ago and spending 20 years wrapping my habits around it.

    BTW my .emacs is still growing. I don't know how you manage to have 20 LoC of .emacs; I have a directory .emacs.d and a couple of dozen files there. They are not large, some of them can be as small as 1 line. The last one I wrote deals with the indentation of lua code. lua-ts-mode has some relatively simple rules that mostly work, but I was not happy with the result: there are some quirks that are just very inconvenient, and in some cases lua-ts-mode just fails to indent properly. So I fixed them to my taste. This one file is longer than 20 LoC.

    Though, I should note that LLMs make this much simpler. It is very simple to reverse-engineer what is there, and if you can explain the idea of how to change the code, an LLM can write all the elisp needed. It doesn't work out of the box, of course, and needs to be debugged, but an LLM can still save an hour or two.

    > My goal is to bring it down towards 0 LOC.

    You cannot. If you use lua you just cannot, because lua-mode uses an indent of 3 spaces. Not 2, not 4, but three. So any lua sources you can find on github and try to edit will not be indented the way lua-mode does it. I cannot imagine what was going on in the mind of the person who chose this value. The only possible explanation I have is something like "I want to be not like the others", but it doesn't seem right.

    So you need at least to change lua-indent-offset (or lua-ts-indent-offset if you use treesitter), and it will be more than 0 LoC.

    • Thou shalt indent to three, no more, no less. Three shall be the number thou shalt indent, and the number of the indentation shall be three. Four shalt thou not indent, neither indent thou two, excepting that thou then proceed to three. Five is right out. Once the number three, being the third number, be reached, then compose thou thy Holy Lua.

    • :)

      You could customize that variable instead, so that Emacs manages everything for you and you don't have to care that it would look like a line of code if you opened your settings and edited them directly.

  • Care to share those 20 lines? Because I feel like every time I pick up a new language I need to add 5-10 lines to add some basic hooks and configs

This is incorrect. You can write a parser for org. See for example https://github.com/tgbugs/laundry. Work toward standardization has been stalled because I (among others) have not had time to circle back to work on it. In part this is because the lack of a standard has not blocked most use cases since emacs is open source and can run almost anywhere.

  • > You can write a parser for org. See for example https://github.com/tgbugs/laundry.

    Oh, there are a lot of incomplete parsers. This one is not an exception:

    > Status

    > Laundry can parse most of Org syntax, though there are still issues with the correctness of the parse in a number of cases.

    > In particular there are a number of edge cases in the interaction between the syntax for various Org objects that have not been resolved.

    I have my own parser as a pest grammar. It has just the basic features. This Laundry seems to implement more of org-mode, but I don't care anymore really, because I believe that org-mode will not be reimplemented.

    > In part this is because the lack of a standard has not blocked most use cases since emacs is open source and can run almost anywhere.

    I have some inexplicable aversion to the idea of starting an elisp interpreter just because my program needs an org-mode parser. But even if I could integrate elisp into my program as easily as I do with lua, I probably wouldn't do it, because a parser in lisp doesn't really solve the problem. It simplifies it a bit (I don't need to deal with the grammar) but shifts it to another level: I need to learn how org-mode is represented as a lisp object. I need to reverse engineer the formal definition of that recursive object to deal with it, or turn on defensive programming and expect anything.

    The only realistic way of dealing with org-mode is to write code for emacs. There are exceptions to this rule, like pandoc, but I don't trust them.

  • > Work toward standardization has been stalled because I (among others) have not had time to circle back to work on it.

    I tried not to react to this, but, I'm sorry, I'm too much of a troll to just leave it without commenting.

    Of course you have no time to write a formal definition. No one has time for that, and no one will have time for this. Because at this stage it is practically impossible. The parser was written as a bunch of regexps intermixed with lisp code. All edge cases were baked into org-mode because those regexps are the definition of org-mode. To write a formal grammar you need to catch all those edge cases, and to reproduce the behavior of the existing parser.

    In retrospect, the parser should've been replaced with a formal grammar definition at a much earlier stage, when it was still possible to replace the parser with another one that is similar but generally incompatible because it deals with edge cases in different ways. Once that moment was missed, those edge cases became a legacy you cannot fix.
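To make the "regexps are the definition" point concrete, here is a minimal Python sketch of the regexp-at-each-turn style for one small piece of Org: headlines. Everything in it is an assumption made for illustration (the `HEADLINE` pattern, the `Headline` class, the fixed `TODO`/`DONE` keywords); it is not how org-mode or any of the parsers mentioned in this thread actually work, and it ignores priorities, custom keywords and most other cases.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional

# Assumed, simplified headline shape: stars, optional TODO keyword,
# title, optional trailing :tag1:tag2: list. Real Org has many more cases.
HEADLINE = re.compile(
    r"^(?P<stars>\*+)[ \t]+"
    r"(?:(?P<todo>TODO|DONE)[ \t]+)?"
    r"(?P<title>.*?)"
    r"(?:[ \t]+(?P<tags>:(?:[\w@#%]+:)+))?[ \t]*$"
)


@dataclass
class Headline:
    level: int
    todo: Optional[str]
    title: str
    tags: List[str] = field(default_factory=list)


def parse_headlines(text: str) -> List[Headline]:
    """Return every line of `text` that matches the simplified pattern."""
    found = []
    for line in text.splitlines():
        m = HEADLINE.match(line)
        if m is None:
            continue  # body text, or syntax this sketch does not cover
        tags = m.group("tags")
        found.append(Headline(
            level=len(m.group("stars")),
            todo=m.group("todo"),
            title=m.group("title").strip(),
            tags=tags.strip(":").split(":") if tags else [],
        ))
    return found
```

Every decision the pattern makes silently (is `*bold* text` at the start of a line a headline? is `TODOs` a keyword?) is exactly the kind of edge case that a second implementation would have to reproduce.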

Why can't a parser be written? Is there a halting problem or a grammar conflict? Or is "can't" short-hand for "too much trouble"?

  • > Why can't a parser be written?

    Because the existing parser is written in true emacs style: no formal grammar, just lisp code with a regexp at each turn. Theoretically speaking, that doesn't forbid you from writing a parser, but in practice there are no full-blown parsers of org-mode except the reference one.

  • For many software businesses, licensing is an issue. The spec is GFDL with GPL code samples, a non-cleanroom translation of the elisp parser would (likely) be GPL (or at least arguably enough so to keep lawyers busy), so going and doing some other roughly equivalent markup language instead avoids the copyleft requirements.

    So, yes, “too much trouble”, much of it nontechnical.

I also don't use org-mode anymore, but sometimes I really do miss org-babel-tangle. In contexts where doctests aren't available it can be really helpful for making sure code listings actually work.
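For readers who have not used it: the part of tangling that helps here is pulling the code listings out of the document so they can actually be run. A rough Python sketch of that idea follows. It is only an approximation under stated assumptions: the `extract_blocks` helper and the `SRC_BLOCK` pattern are made up for this example, blocks are assumed to sit between `#+begin_src LANG` and `#+end_src` lines, and header arguments such as `:tangle` (which the real org-babel-tangle honors) are ignored.

```python
import re

# Assumption: blocks look like "#+begin_src LANG ... #+end_src" with the
# markers on their own lines; header arguments after LANG are skipped.
SRC_BLOCK = re.compile(
    r"^[ \t]*#\+begin_src[ \t]+(?P<lang>\S+)[^\n]*\n"
    r"(?P<body>.*?)"
    r"^[ \t]*#\+end_src[ \t]*$",
    re.IGNORECASE | re.MULTILINE | re.DOTALL,
)


def extract_blocks(org_text: str, lang: str) -> str:
    """Concatenate the bodies of all source blocks written in `lang`."""
    bodies = [
        m.group("body")
        for m in SRC_BLOCK.finditer(org_text)
        if m.group("lang").lower() == lang.lower()
    ]
    return "".join(bodies)
```

Running the extracted text (for Python listings, for instance by passing it to `exec` in a scratch namespace, or writing it to a file and running the interpreter) then shows whether the listings in the document still work together.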