Comment by b0a04gl

1 day ago

vim’s syntax engine doesn’t track context. it matches tokens, not structure. in langs like ysh where command and expression modes mix mid-line, this breaks. no memory of nesting, no awareness of why you’re in a mode. one bad match and sync collapses. it’s not about regex power or file size. the engine just isn’t built to follow structure. stop layering hacks. generate semantic tokens outside, let vim just render them.

no memory of nesting

It absolutely nests! Vim's model has recursion, and it works perfectly. Some details here:

https://github.com/oils-for-unix/oils.vim/blob/main/doc/stag...

This highlighter is extremely accurate, and I would call it correct. I list about 3 known issues here, and they are all fixable/expressible in Vim's model:

https://github.com/oils-for-unix/oils.vim/blob/main/doc/algo...

Please install it, and file bugs with any inaccuracies. If YSH code is valid, it should not be mis-highlighted. There is test data in false-postive.ysh and false-negative.ysh.

Try to break it!

---

There are lots of Vim/Textmate plugins that are buggy, but it doesn't mean that all such plugins are.

generate semantic tokens outside

I'd also say that this doesn't really help, since I believe Tree-sitter is the most common way of doing that. I show at the top of the doc that Tree-sitter has issues in practice expressing shell (although admittedly it's not a fair comparison to YSH in Vim; shell in Vim will have more problems, though in practice I find it pretty good).

  • Nesting is also demonstrated by stage 2 fixing the "nested double quotes bug". Screenshots:

    https://github.com/oils-for-unix/oils.vim/blob/main/doc/stag...

    Stage 1 is non-recursive, but stage 2 is recursive.
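To make the non-recursive vs. recursive distinction concrete, here is a minimal Python sketch. It is not the plugin's code and not Vim's engine. It assumes the "nested double quotes" case looks like a double-quoted string containing `$(...)` that itself contains a double-quoted string; the input string and all function names are made up for illustration, and backslash escapes are ignored.

```python
import re

# Assumed example input: a double-quoted string holding a command
# substitution that itself holds a double-quoted string.
SRC = 'echo "outer $(echo "inner") tail"'


def flat_strings(src):
    """Non-recursive: every string is the shortest "..." span."""
    return [m.group(0) for m in re.finditer(r'"[^"]*"', src)]


def scan_dquote(src, i):
    """src[i] is an opening quote; return the index just past its close."""
    assert src[i] == '"'
    i += 1
    while i < len(src):
        if src.startswith('$(', i):
            i = scan_command_sub(src, i + 2)  # recurse into $( ... )
        elif src[i] == '"':
            return i + 1
        else:
            i += 1
    raise ValueError('unterminated string')


def scan_command_sub(src, i):
    """i is just past '$('; return the index just past the matching ')'."""
    while i < len(src):
        if src[i] == '"':
            i = scan_dquote(src, i)  # recurse into a nested string
        elif src[i] == ')':
            return i + 1
        else:
            i += 1
    raise ValueError('unterminated command sub')


def nested_strings(src):
    """Top-level double-quoted spans, found with the recursive scanner."""
    out, i = [], 0
    while i < len(src):
        if src[i] == '"':
            end = scan_dquote(src, i)
            out.append(src[i:end])
            i = end
        else:
            i += 1
    return out
```

On this input the flat version splits the string at the inner quote and reports two bogus strings, while the recursive version reports one string spanning the whole quoted argument.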

    • fair, recursive groups exist, and yeah stage 2’s structure is solid. but the point was less about recursion as a feature and more about context awareness. vim’s engine lets you nest, sure, but it doesn't preserve intent across transitions. you can recurse into quoted strings, command subs, etc, but you can’t reflect on why you entered a state. there's no semantic trace. take ysh: command vs expression isn’t just syntactic, it shifts meaning of the same tokens. `[` in one context is an index, in another it’s test. vim can match both, but it can’t decide which meaning is active unless the outer mode is remembered. and that’s the gap

      tbh the plugin is impressive, no question. but that memoryless model will always need compromises, rule layering, and fine-tuned sync tricks. treesitter has its issues too, agreed. but having typed nodes and scope trees gives a baseline advantage when meaning depends on ancestry.
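The "outer mode is remembered" idea can be sketched as an explicit mode stack. This is a toy in Python, not YSH's real grammar and not how Vim or Tree-sitter work internally. The rules are assumptions for illustration: `$[` opens expression mode, `$(` opens command mode, a `[` in expression mode is an index, and a `[` in command mode is the test command. `classify_brackets` is a hypothetical name.

```python
def classify_brackets(src):
    """Label each bare '[' as 'index' or 'test' based on the enclosing mode.

    Toy rules (assumptions, not YSH's grammar):
      $[ ... ]  enters expression mode
      $( ... )  enters command mode
      '[' in expression mode is an index; in command mode it is 'test'
    """
    # Each frame is (mode, closing_char). The top frame is the active mode.
    frames = [('command', None)]
    found = []
    i = 0
    while i < len(src):
        mode, closer = frames[-1]
        if src.startswith('$[', i):
            frames.append(('expression', ']'))
            i += 2
        elif src.startswith('$(', i):
            frames.append(('command', ')'))
            i += 2
        elif src[i] == closer:
            frames.pop()              # leave the innermost mode
            i += 1
        elif src[i] == '[':
            if mode == 'expression':
                found.append((i, 'index'))
                # An index bracket nests, so its ']' must not close '$['.
                frames.append(('expression', ']'))
            else:
                found.append((i, 'test'))
            i += 1
        else:
            i += 1
    return found


# Made-up input mixing both modes on one line.
SRC = '[ $[a[i]] -gt $( [ -f x ] ; echo $[b[0]] ) ]'
RESULT = classify_brackets(SRC)
```

For this input the four bare `[` characters come out as test, index, test, index, in that order. The same character gets a different label only because the stack records which mode is currently open.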
