@@ -7,48 +7,46 @@ well-formed; the _input_ (vimdoc) is secondary. The first step should always be
 to try to fix the input (within reason) rather than insist on a grammar that
 handles vimdoc's endless quirks.
 
-Notes
------
+Overview
+--------
 
 - vimdoc format "spec":
   - [:help help-writing](https://neovim.io/doc/user/helphelp.html#help-writing)
   - https://github.com/nanotee/vimdoc-notes
-- whitespace is intentionally captured in `(word)`, because it is often necessary to be
-  able to correctly layout vim help files (especially old/legacy).
-- `(codeblock)` is contained by `(line)` because `>` can start a code block at the end of a line.
-- `(column_heading)` is contained by `(line)` because `>` (to close
-  a `(codeblock)` can appear at the start of `(column_heading)`.
-- `h1` ("Heading 1"): `======` followed by text and optional `*tags*`.
-- `h2` ("Heading 2"): `------` followed by text and optional `*tags*`.
-- `h3` ("Heading 3"): only UPPERCASE WORDS, followed by optional `*tags*`.
+- whitespace is intentionally captured in all atoms, because it is often used
+  for "layout" and ascii art in legacy help files.
+- `block` is the main top-level node which contains `line` nodes.
+  - ends at blank line(s) or a line starting with `<`.
+- `line`:
+  - contains atoms (words, tags, taglinks, …).
+  - contains `codeblock` because `>` can start a codeblock at the end of a line.
+  - contains `column_heading` because `<` (the `codeblock` terminating char)
+    can appear at the start of a `column_heading`.
+- `codeblock`:
+  - contains `line` nodes which do not contain `word` nodes, it's just the full
+    raw text line including whitespace. This is somewhat dictated by its
+    "preformatted" nature; parsing the contents would require loading a "child"
+    language (injection). See [#2](https://github.com/neovim/tree-sitter-vimdoc/issues/2).
+  - the terminating `<` (and any following whitespace) is discarded (anonymous).
+- `h1` = "Heading 1": `======` followed by text and optional `*tags*`.
+- `h2` = "Heading 2": `------` followed by text and optional `*tags*`.
+- `h3` = "Heading 3": only UPPERCASE WORDS, followed by optional `*tags*`.
 
 Known issues
 ------------
 
-- `line_li` ("list item") is _experimental_. It doesn't support nesting yet and
-  it may not work well; you can treat it as a normal `line` for layout purposes.
-- `codeblock` ">" must not be preceded only by tabs, a space char is required (" >").
-  See `:help lcs-tab` for example. Currently the grammar doesn't enforce this.
-- `codeblock` terminated by an "implicit stop" (i.e. no terminating `<`)
-  consumes the first char of the terminating line, and continues the parent
-  `block`, preventing top-level forms like `h1`, `h2` from being recognized
-  until a blank line is encountered.
-- `line` in a `codeblock` does not contain `word` atoms, it's just the full
-  raw text line including whitespace. This is somewhat dictated by its
-  "preformatted" nature; parsing the contents would require loading a "child"
-  language (injection). See [#2](https://github.com/vigoux/tree-sitter-vimdoc/issues/2).
+- `line_li` ("list item") is experimental. It doesn't support nesting yet.
+- Spec requires that `codeblock` delimiter ">" must be preceded by a space
+  (" >"), not a tab. But currently the grammar doesn't enforce this. Example:
+  `:help lcs-tab`.
+- `codeblock` terminated by an "implicit stop" (no terminating `<`) consumes
+  blank lines, preventing top-level forms like `h1` from being recognized.
 - `url` doesn't handle _surrounding_ parens. E.g. `(https://example.com/#yay)` yields `word`
 - `url` doesn't handle _nested_ parens. E.g. `(https://example.com/(foo)#yay)`
-- Ideally `block_end` should consume the last block of the document _only_ if that
-  block is missing a trailing blank line or EOL ("\n").
-  - TODO: consider simply _not supporting_ docs without EOL?
-- Ideally `line_noeol` should consume the last line of the document _only_ if
-  that line is missing EOL ("\n").
-  - TODO: consider simply _not supporting_ docs without EOL?
 
 TODO
 ----
 
 - `line_noeol` is a special-case to support documents that don't end in EOL.
-  Grammar could be a bit simpler if we just require EOL at end of document.
-- `line_modeline` (only at EOF)
+  Grammar could be simpler if we require EOL at end of document.
+- `line_modeline` ?
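
The README text in this hunk prefers fixing the input over complicating the grammar. As a minimal sketch of that idea, two of the input-side conditions it mentions (a codeblock `>` preceded by a tab, and a document without a final EOL) could be checked before parsing. This is not part of tree-sitter-vimdoc: the function names are hypothetical, and the sketch assumes a codeblock opener is a bare `>` at the very end of a line.

```python
# Hypothetical pre-parse helpers; names and behavior are illustrative only.


def find_tab_codeblock_openers(text: str) -> list[int]:
    """Return 1-based numbers of lines whose trailing ">" follows a tab.

    Assumption: a codeblock opener is a bare ">" at end of line.
    """
    return [
        lineno
        for lineno, line in enumerate(text.splitlines(), start=1)
        if line.endswith("\t>")
    ]


def ensure_trailing_eol(text: str) -> str:
    """Append "\n" if a non-empty document does not already end with one."""
    if text and not text.endswith("\n"):
        return text + "\n"
    return text
```

A caller might run `ensure_trailing_eol` on the help file contents before handing them to the parser, and report the line numbers from `find_tab_codeblock_openers` as warnings rather than rewriting them, since changing tabs can alter the layout of legacy help files.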