 tree-sitter-vimdoc
 ==================

-This grammar intentionally supports a subset of the vimdoc "spec"; predictable
-results are the primary goal, so that _output_ formats (e.g. HTML) are
-well-formed; the _input_ (vimdoc) is secondary. The first step should always be
-to try to fix the input (within reason) rather than insist on a grammar that
-handles vimdoc's endless quirks.
+This grammar intentionally supports a subset of the vimdoc "spec"
+([ref1](https://neovim.io/doc/user/helphelp.html#help-writing),
+[ref2](https://github.com/nanotee/vimdoc-notes));
+predictable results are the primary goal, so that _output_ formats (e.g. HTML)
+are well-formed; the _input_ (vimdoc) is secondary. The first step should
+always be to try to fix the input rather than insist on a grammar that handles
+vimdoc's endless quirks.

 Overview
 --------

-- vimdoc format "spec":
-  - [:help help-writing](https://neovim.io/doc/user/helphelp.html#help-writing)
-  - https://github.com/nanotee/vimdoc-notes
-- whitespace is intentionally captured in all atoms, because it is often used
-  for "layout" and ascii art in legacy help files.
-- `block` is the main top-level node which contains `line` nodes.
-  - ends at blank line(s) or a line starting with `<`.
+- `block` is the main top-level node which contains `line` and `line_li` nodes.
+  - delimited by blank line(s) or any line starting with `<` (codeblock terminator).
 - `line`:
   - contains atoms (words, tags, taglinks, …)
-  - contains `codeblock` because `>` can start a codeblock at the end of a line.
   - contains headings (`h1`, `h2`, `h3`) because `codeblock` terminated by
     "implicit stop" (no terminating `<`) consumes blank lines, so `block` has
     no way to end.
   - contains `column_heading` because `<` (the `codeblock` terminating char)
     can appear at the start of `column_heading`.
+- `line_li` ("list item"):
+  - consumes lines until blank line, codeblock, or next listitem.
+  - nesting is ignored: indented listitems are parsed as siblings.
 - `codeblock`:
-  - contains `line` nodes which do not contain `word` nodes, it's just the full
+  - contained by `line` or `line_li`, because ">" can start
+    a codeblock at the end of any line.
+  - contains `line` nodes without `word` nodes: each is just the full
     raw text line including whitespace. This is somewhat dictated by its
     "preformatted" nature; parsing the contents would require loading a "child"
     language (injection). See [#2](https://github.com/neovim/tree-sitter-vimdoc/issues/2).
@@ -38,16 +39,20 @@ Overview
 Known issues
 ------------

-- `line_li` ("list item") is experimental. It doesn't support nesting yet.
+- Input must end with newline/EOL (`\n`). Grammar does not support files without EOL.
+- Input must end with a blank line, though this doesn't seem to matter in practice.
 - Spec requires that `codeblock` delimiter ">" must be preceded by a space
   (" >"), not a tab. But currently the grammar doesn't enforce this. Example:
   `:help lcs-tab`.
 - `url` doesn't handle _surrounding_ parens. E.g. `(https://example.com/#yay)` yields `word`
 - `url` doesn't handle _nested_ parens. E.g. `(https://example.com/(foo)#yay)`
+- `column_heading` currently only recognizes tilde "~" preceded by space (i.e.
+  "foo ~" not "foo~"). This covers 99% of :help files, but the grammar should
+  probably support "foo~" also.

 TODO
 ----

-- `line_noeol` is a special-case to support documents that don't end in EOL.
-  Grammar could be simpler if we require EOL at end of document.
 - `line_modeline` ?
+- `tag_heading`: line(s) containing only tags, typically implies a "heading"
+  before a block.
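
Since the grammar no longer handles documents without a trailing EOL, that requirement can be met on the input side, in line with the "fix the input first" approach stated in the README. A minimal sketch in Python, assuming a hypothetical helper named `normalize_vimdoc_tail` (it is not part of this grammar or of tree-sitter) that is applied to the text before it is handed to a parser:

```python
def normalize_vimdoc_tail(text: str, blank_line: bool = True) -> str:
    """Return `text` with a normalized ending.

    Hypothetical helper, for illustration only. Trailing newline characters
    are removed, then exactly one EOL is appended. If `blank_line` is true,
    a second newline is appended so the input also ends with a blank line.
    """
    # Only "\n" is stripped, so trailing spaces or tabs on the last
    # content line are left untouched.
    body = text.rstrip("\n")
    return body + ("\n\n" if blank_line else "\n")


if __name__ == "__main__":
    # A help file whose last line lacks an EOL.
    raw = "*foo.txt*  Example help file\n\nSome text"
    fixed = normalize_vimdoc_tail(raw)
    print(repr(fixed[-11:]))
```

Whether the trailing blank line is needed at all is left open by the Known issues list, so it is a parameter here rather than fixed behavior.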