Commit 09edd66

👌 IMPROVE: MarkdownIt config and documentation (#136)
Add additional configuration presets, allow options to be overridden in the `MarkdownIt` initialisation, and add convenience methods to `SyntaxTreeNode`.
1 parent a70db2a commit 09edd66

14 files changed: +275 -46 lines changed
.pre-commit-config.yaml

Lines changed: 2 additions & 0 deletions
@@ -7,6 +7,7 @@ exclude: >
     test.*\.md|
     test.*\.txt|
     test.*\.html|
+    test.*\.xml|
     .*commonmark\.json|
     benchmark/.*\.md|
     .*/spec\.md
@@ -31,6 +32,7 @@ repos:
     rev: 3.8.4
     hooks:
       - id: flake8
+        additional_dependencies: [flake8-bugbear==21.3.1]

   - repo: https://github.com/psf/black
     rev: 20.8b1

.readthedocs.yml

Lines changed: 1 addition & 0 deletions
@@ -6,6 +6,7 @@ python:
     - method: pip
       path: .
       extra_requirements:
+        - linkify
         - rtd

 sphinx:

docs/conf.py

Lines changed: 7 additions & 1 deletion
@@ -45,7 +45,13 @@
 # This pattern also affects html_static_path and html_extra_path.
 exclude_patterns = ["_build", "Thumbs.db", ".DS_Store"]

-nitpick_ignore = [("py:class", "Match"), ("py:class", "x in the interval [0, 1).")]
+nitpick_ignore = [
+    ("py:class", "Match"),
+    ("py:class", "x in the interval [0, 1)."),
+    ("py:class", "markdown_it.helpers.parse_link_destination._Result"),
+    ("py:class", "markdown_it.helpers.parse_link_title._Result"),
+    ("py:class", "MarkdownIt"),
+]


 # -- Options for HTML output -------------------------------------------------

docs/using.md

Lines changed: 95 additions & 14 deletions
@@ -28,6 +28,7 @@ then these are converted to other formats using 'renderers'.
 The simplest way to understand how text will be parsed is using:

 ```{code-cell}
+from pprint import pprint
 from markdown_it import MarkdownIt
 ```

@@ -48,8 +49,15 @@ for token in md.parse("some *text*"):

 The `MarkdownIt` class is instantiated with parsing configuration options,
 dictating the syntax rules and additional options for the parser and renderer.
-You can define this configuration *via* a preset name (`'zero'`, `'commonmark'` or `'default'`),
-or by directly supplying a dictionary.
+You can define this configuration by directly supplying a dictionary or *via* a preset name:
+
+- `zero`: This configures the minimum components to parse text (i.e. just paragraphs and text)
+- `commonmark` (default): This configures the parser to strictly comply with the [CommonMark specification](http://spec.commonmark.org/).
+- `js-default`: This is the default in the JavaScript version.
+  Compared to `commonmark`, it disables HTML parsing and enables the table and strikethrough components.
+- `gfm-like`: This configures the parser to approximately comply with the [GitHub Flavored Markdown specification](https://github.github.com/gfm/).
+  Compared to `commonmark`, it enables the table, strikethrough and linkify components.
+  **Important**: to use this configuration you must have `linkify-it-py` installed.

 ```{code-cell}
 from markdown_it.presets import zero
@@ -61,18 +69,26 @@ md = MarkdownIt("zero")
 md.options
 ```

+You can also override specific options:
+
 ```{code-cell}
-print(md.get_active_rules())
+md = MarkdownIt("zero", {"maxNesting": 99})
+md.options
 ```

 ```{code-cell}
-print(md.get_all_rules())
+pprint(md.get_active_rules())
 ```

 You can find all the parsing rules in the source code:
 `parser_core.py`, `parser_block.py`,
 `parser_inline.py`.
-Any of the parsing rules can be enabled/disabled, and these methods are chainable:
+
+```{code-cell}
+pprint(md.get_all_rules())
+```
+
+Any of the parsing rules can be enabled/disabled, and these methods are "chainable":

 ```{code-cell}
 md.render("- __*emphasise this*__")
@@ -97,6 +113,50 @@ Additionally `renderInline` runs the parser with all block syntax rules disabled
 md.renderInline("__*emphasise this*__")
 ```

+### Typographic components
+
+The `smartquotes` and `replacements` components are intended to improve typography:
+
+`smartquotes` will convert basic quote marks to their opening and closing variants:
+
+- 'single quotes' → ‘single quotes’
+- "double quotes" → “double quotes”
+
+`replacements` will replace particular text constructs:
+
+- ``(c)``, ``(C)`` → ©
+- ``(tm)``, ``(TM)`` → ™
+- ``(r)``, ``(R)`` → ®
+- ``(p)``, ``(P)`` → §
+- ``+-`` → ±
+- ``...`` → …
+- ``?....`` → ?..
+- ``!....`` → !..
+- ``????????`` → ???
+- ``!!!!!`` → !!!
+- ``,,,`` → ,
+- ``--`` → &ndash;
+- ``---`` → &mdash;
+
+Both of these components require the `typographer` option to be turned on, as well as the components enabled:
+
+```{code-cell}
+md = MarkdownIt("commonmark", {"typographer": True})
+md.enable(["replacements", "smartquotes"])
+md.render("'single quotes' (c)")
+```
+
+### Linkify
+
+The `linkify` component requires that [linkify-it-py](https://github.com/tsutsu3/linkify-it-py) be installed (e.g. *via* `pip install markdown-it-py[linkify]`).
+This allows URI autolinks to be identified, without the need for enclosing in `<>` brackets:
+
+```{code-cell}
+md = MarkdownIt("commonmark", {"linkify": True})
+md.enable(["linkify"])
+md.render("github.com")
+```
+
 ### Plugins load

 Plugins load collections of additional syntax rules and render methods into the parser
@@ -130,7 +190,6 @@ md.render(text)

 ## The Token Stream

-
 +++

 Before rendering, the text is parsed to a flat token stream of block level syntax elements, with nesting defined by opening (1) and closing (-1) attributes:
@@ -183,20 +242,42 @@ This dictionary can also be deserialized:
 Token.from_dict(tokens[1].as_dict())
 ```

-In some use cases `nest_tokens` may be useful, to collapse the opening/closing tokens into single tokens:
+### Creating a syntax tree
+
+```{versionchanged} 0.7.0
+`nest_tokens` and `NestedTokens` are deprecated and replaced by `SyntaxTreeNode`.
+```
+
+In some use cases it may be useful to convert the token stream into a syntax tree,
+with opening/closing tokens collapsed into a single token that contains children.

 ```{code-cell}
-from markdown_it.token import nest_tokens
-nested_tokens = nest_tokens(tokens)
-[t.type for t in nested_tokens]
+from markdown_it.tree import SyntaxTreeNode
+
+md = MarkdownIt("commonmark")
+tokens = md.parse("""
+# Header
+
+Here's some text and an image ![title](image.png)
+
+1. a **list**
+
+> a *quote*
+""")
+
+node = SyntaxTreeNode.from_tokens(tokens)
+print(node.pretty(indent=2, show_text=True))
 ```

-This introduces a single additional class `NestedTokens`,
-containing an `opening`, `closing` and `children`, which can be a list of mixed
-`Token` and `NestedTokens`.
+You can then use methods to traverse the tree:
+
+```{code-cell}
+node.children
+```

 ```{code-cell}
-nested_tokens[0]
+print(node[0])
+node[0].next_sibling
 ```

 ## Renderers
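
The idea behind the new "Creating a syntax tree" section, collapsing a flat stream of opening (1) and closing (-1) tokens into nested nodes, can be illustrated with a small standalone sketch. This is not the `SyntaxTreeNode` implementation: the `Node` class, the `build_tree` function and the `(type, nesting)` tuples are hypothetical stand-ins chosen only to show the stack-based technique.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Node:
    """Hypothetical tree node; not the library's SyntaxTreeNode."""

    type: str
    children: List["Node"] = field(default_factory=list)


def build_tree(tokens: List[Tuple[str, int]]) -> Node:
    """Nest a flat list of (type, nesting) pairs, where nesting is 1, 0 or -1."""
    root = Node("root")
    stack = [root]  # the last item is the node currently being filled
    for tok_type, nesting in tokens:
        if nesting == 1:
            # An opening token starts a new container node
            name = tok_type[: -len("_open")] if tok_type.endswith("_open") else tok_type
            node = Node(name)
            stack[-1].children.append(node)
            stack.append(node)
        elif nesting == -1:
            # A closing token finishes the current container
            stack.pop()
        else:
            # A self-contained token becomes a leaf of the current container
            stack[-1].children.append(Node(tok_type))
    return root


flat = [("paragraph_open", 1), ("inline", 0), ("paragraph_close", -1)]
tree = build_tree(flat)
print([child.type for child in tree.children])
```

The opening/closing pair is merged into one `paragraph` node whose only child is the `inline` leaf, which is the shape the documentation describes.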

markdown_it/main.py

Lines changed: 46 additions & 27 deletions
@@ -27,36 +27,50 @@
     linkify_it = None


-_PRESETS = AttrDict(
-    {
-        "default": presets.default.make(),
-        "zero": presets.zero.make(),
-        "commonmark": presets.commonmark.make(),
-    }
-)
+_PRESETS = {
+    "default": presets.default.make(),
+    "js-default": presets.js_default.make(),
+    "zero": presets.zero.make(),
+    "commonmark": presets.commonmark.make(),
+    "gfm-like": presets.gfm_like.make(),
+}


 class MarkdownIt:
     def __init__(
-        self, config: Union[str, Mapping] = "commonmark", renderer_cls=RendererHTML
+        self,
+        config: Union[str, Mapping] = "commonmark",
+        options_update: Optional[Mapping] = None,
+        *,
+        renderer_cls=RendererHTML,
     ):
         """Main parser class

         :param config: name of configuration to load or a pre-defined dictionary
+        :param options_update: dictionary that will be merged into ``config["options"]``
         :param renderer_cls: the class to load as the renderer:
             ``self.renderer = renderer_cls(self)``
         """
+        # add modules
+        self.utils = utils
+        self.helpers: Any = helpers
+
+        # initialise classes
         self.inline = ParserInline()
         self.block = ParserBlock()
         self.core = ParserCore()
         self.renderer = renderer_cls(self)
+        self.linkify = linkify_it.LinkifyIt() if linkify_it else None

-        self.utils = utils
-        self.helpers: Any = helpers
+        # set the configuration
+        if options_update and not isinstance(options_update, Mapping):
+            # catch signature change where renderer_cls was not used as a key-word
+            raise TypeError(
+                f"options_update should be a mapping: {options_update}"
+                "\n(Perhaps you intended this to be the renderer_cls?)"
+            )
         self.options = AttrDict()
-        self.configure(config)
-
-        self.linkify = linkify_it.LinkifyIt() if linkify_it else None
+        self.configure(config, options_update=options_update)

     def __repr__(self) -> str:
         return f"{self.__class__.__module__}.{self.__class__.__name__}()"
@@ -79,7 +93,9 @@ def set(self, options: AttrDict) -> None:
         """
         self.options = options

-    def configure(self, presets: Union[str, Mapping]) -> "MarkdownIt":
+    def configure(
+        self, presets: Union[str, Mapping], options_update: Optional[Mapping] = None
+    ) -> "MarkdownIt":
         """Batch load of all options and component settings.
         This is an internal method, and you probably will not need it.
         But if you will - see available presets and data structure
@@ -89,21 +105,24 @@ def configure(self, presets: Union[str, Mapping]) -> "MarkdownIt":
         That will give better compatibility with next versions.
         """
         if isinstance(presets, str):
-            presetName = presets
-            presets = _PRESETS.get(presetName, None)
-            if not presets:
-                raise KeyError(
-                    'Wrong `markdown-it` preset "' + presetName + '", check name'
-                )
-        if not presets:
-            raise ValueError("Wrong `markdown-it` preset, can't be empty")
-        config = AttrDict(presets)
-
-        if "options" in config:
-            self.set(config.options)
+            if presets not in _PRESETS:
+                raise KeyError(f"Wrong `markdown-it` preset '{presets}', check name")
+            config = _PRESETS[presets]
+        else:
+            config = presets
+
+        if not config:
+            raise ValueError("Wrong `markdown-it` config, can't be empty")
+
+        options = config.get("options", {}) or {}
+        if options_update:
+            options = {**options, **options_update}
+
+        if options:
+            self.set(AttrDict(options))

         if "components" in config:
-            for name, component in config.components.items():
+            for name, component in config["components"].items():
                 rules = component.get("rules", None)
                 if rules:
                     self[name].ruler.enableOnly(rules)
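
The option-merging step added to `configure` can be looked at in isolation. The following is a minimal standalone sketch, not the library code: `resolve_options` and the `PRESETS` dictionary (including its option values) are hypothetical stand-ins that mirror only the lookup and the `{**options, **options_update}` merge shown in the hunk above.

```python
from typing import Any, Dict, Mapping, Optional, Union

# Hypothetical stand-in for the preset registry; the option values are illustrative only.
PRESETS: Dict[str, Dict[str, Any]] = {
    "zero": {"options": {"maxNesting": 20, "html": False}},
}


def resolve_options(
    config: Union[str, Mapping], options_update: Optional[Mapping] = None
) -> Dict[str, Any]:
    """Return the effective options for a preset name or a config mapping."""
    if isinstance(config, str):
        if config not in PRESETS:
            raise KeyError(f"Unknown preset '{config}'")
        config = PRESETS[config]
    if not config:
        raise ValueError("config can't be empty")
    # Start from the config's own options, tolerating a missing or None entry
    options = config.get("options", {}) or {}
    if options_update:
        # Build a new dict so the shared preset is never mutated;
        # keys in options_update take precedence over the preset values.
        options = {**options, **options_update}
    return dict(options)


print(resolve_options("zero", {"maxNesting": 99}))
# {'maxNesting': 99, 'html': False}
```

Because the merge creates a new dictionary, overriding an option for one parser instance leaves the module-level preset untouched for later instances.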

markdown_it/port.yaml

Lines changed: 5 additions & 0 deletions
@@ -23,6 +23,11 @@
       In markdown_it/rules_block/reference.py,
       record line range in state.env["references"] and add state.env["duplicate_refs"]
       This is to allow renderers to report on issues regarding references
+    - |
+      The `MarkdownIt.__init__` signature is slightly different for updating options,
+      since you must always specify the config first, e.g.
+      use `MarkdownIt("commonmark", {"html": False})` instead of `MarkdownIt({"html": False})`
+    - The default configuration preset for `MarkdownIt` is "commonmark" not "default"
     - Allow custom renderer to be passed to `MarkdownIt`
     - |
       change render method signatures
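
The signature note above has a practical consequence: the second positional argument is now the options mapping, and the renderer class is keyword-only. A rough standalone sketch of that guard follows; `Parser` and `DefaultRenderer` are hypothetical stand-ins for illustration, not the `MarkdownIt` class itself.

```python
from collections.abc import Mapping
from typing import Optional


class DefaultRenderer:
    """Hypothetical placeholder for a renderer class."""


class Parser:
    """Hypothetical stand-in that mirrors only the argument handling."""

    def __init__(
        self,
        config: str = "commonmark",
        options_update: Optional[Mapping] = None,
        *,
        renderer_cls: type = DefaultRenderer,
    ):
        # Catch callers that still pass the renderer class positionally
        if options_update and not isinstance(options_update, Mapping):
            raise TypeError(
                f"options_update should be a mapping: {options_update}"
                "\n(Perhaps you intended this to be the renderer_cls?)"
            )
        self.config = config
        self.options_update = dict(options_update or {})
        self.renderer = renderer_cls()


# Config first, then the option overrides
ok = Parser("commonmark", {"html": False})

# Passing a class as the second positional argument is rejected
try:
    Parser("commonmark", DefaultRenderer)
except TypeError as err:
    print(f"rejected: {type(err).__name__}")
```

Making `renderer_cls` keyword-only means an old-style positional call fails loudly instead of silently treating a class as the options.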

markdown_it/presets/__init__.py

Lines changed: 24 additions & 0 deletions
@@ -1 +1,25 @@
 from . import commonmark, default, zero  # noqa: F401
+
+js_default = default
+
+
+class gfm_like:
+    """GitHub Flavoured Markdown (GFM) like.
+
+    This adds the linkify, table and strikethrough components to CommonMark.
+
+    Note, it lacks task-list items and raw HTML filtering,
+    to meet the full GFM specification
+    (see https://github.github.com/gfm/#autolinks-extension-).
+    """
+
+    @staticmethod
+    def make():
+        config = commonmark.make()
+        config["components"]["core"]["rules"].append("linkify")
+        config["components"]["block"]["rules"].append("table")
+        config["components"]["inline"]["rules"].append("strikethrough")
+        config["components"]["inline"]["rules2"].append("strikethrough")
+        config["options"]["linkify"] = True
+        config["options"]["html"] = True
+        return config
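
The `gfm_like.make()` pattern, deriving one preset by extending a freshly built copy of another, can be sketched without the library. In the example below, `make_commonmark` is a hypothetical and heavily trimmed stand-in for `commonmark.make()`; its rule lists are illustrative and not the real ones.

```python
from typing import Any, Dict


def make_commonmark() -> Dict[str, Any]:
    """Hypothetical, trimmed stand-in for a base preset factory."""
    # A new dict is built on every call, so callers may safely mutate the result
    return {
        "options": {"html": True, "linkify": False},
        "components": {
            "core": {"rules": ["normalize", "block", "inline"]},
            "block": {"rules": ["paragraph", "heading"]},
            "inline": {
                "rules": ["text", "emphasis"],
                "rules2": ["balance_pairs", "emphasis"],
            },
        },
    }


def make_gfm_like() -> Dict[str, Any]:
    """Extend the base preset with the extra rules and options."""
    config = make_commonmark()
    config["components"]["core"]["rules"].append("linkify")
    config["components"]["block"]["rules"].append("table")
    config["components"]["inline"]["rules"].append("strikethrough")
    config["components"]["inline"]["rules2"].append("strikethrough")
    config["options"]["linkify"] = True
    return config


gfm = make_gfm_like()
print(gfm["components"]["block"]["rules"])
```

This approach relies on the base factory returning a new structure each time; if it returned a shared object, the `append` calls would leak the extra rules into the base preset.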

markdown_it/presets/commonmark.py

Lines changed: 8 additions & 1 deletion
@@ -1,4 +1,11 @@
-"""Commonmark default options."""
+"""Commonmark default options.
+
+This differs from presets.default,
+primarily in that it allows HTML and does not enable components:
+
+- block: table
+- inline: strikethrough
+"""


 def make():

markdown_it/renderer.py

Lines changed: 1 addition & 1 deletion
@@ -34,7 +34,7 @@ def strong_open(self, tokens, idx, options, env):
             def strong_close(self, tokens, idx, options, env):
                 return '</b>'

-        md = MarkdownIt(renderer=CustomRenderer)
+        md = MarkdownIt(renderer_cls=CustomRenderer)

         result = md.render(...)
