Commit bf512ab

Update dependencies and README (#6)
1 parent 0f7abed commit bf512ab

13 files changed

+129
-127
lines changed

Lines changed: 13 additions & 8 deletions
Original file line number | Diff line number | Diff line change
@@ -1,30 +1,35 @@
1+
# it's called "Build" because this dictates the badge name
12
name: Build
23
on:
34
push:
45
branches:
56
- master
7+
- workflow_check
68
pull_request:
79
branches:
810
- master
911
jobs:
10-
lint:
11-
name: lint
12+
check:
13+
name: Check
1214
runs-on: ubuntu-latest
1315
strategy:
1416
matrix:
15-
python-version: [3.7, 3.8]
17+
python-version: ['3.7', '3.8', '3.9']
1618
steps:
17-
- uses: actions/checkout@v2
1819
- name: Set up Python ${{ matrix.python-version }}
19-
uses: actions/setup-python@v1
20+
uses: actions/setup-python@v2
2021
with:
2122
python-version: ${{ matrix.python-version }}
23+
24+
- name: Checkout repository
25+
uses: actions/checkout@v2
26+
2227
- name: Install dependencies
2328
run: |
24-
python -m pip install --upgrade pip
25-
pip install poetry pre-commit
29+
python -m pip install --upgrade pip poetry pre-commit
2630
poetry install
31+
2732
- name: Run checks
2833
run: |
2934
pre-commit run --all-files
30-
poetry run ./lint "CHECK"
35+
poetry run task check

.pre-commit-config.yaml

Lines changed: 12 additions & 12 deletions
@@ -1,19 +1,19 @@
11
repos:
2-
- repo: https://github.com/ambv/black
3-
rev: stable
4-
hooks:
5-
- id: black
62
- repo: https://github.com/pre-commit/pre-commit-hooks
7-
rev: v2.4.0
3+
rev: v3.4.0
84
hooks:
9-
- id: trailing-whitespace
10-
- id: end-of-file-fixer
11-
- id: mixed-line-ending
5+
- id: check-added-large-files
126
args:
13-
- --fix=lf
14-
- id: check-byte-order-marker
7+
- --maxkb=20
158
- id: check-case-conflict
9+
- id: check-json
1610
- id: check-merge-conflict
17-
- id: check-added-large-files
11+
- id: check-toml
12+
- id: check-yaml
13+
- id: debug-statements
14+
- id: end-of-file-fixer
15+
- id: fix-byte-order-marker
16+
- id: mixed-line-ending
1817
args:
19-
- --maxkb=20
18+
- --fix=lf
19+
- id: trailing-whitespace

.pylintrc

Lines changed: 0 additions & 29 deletions
This file was deleted.

README.md

Lines changed: 27 additions & 23 deletions
@@ -1,27 +1,40 @@
11
# XML dataclasses
22

3-
[![License: MPL 2.0](https://img.shields.io/badge/License-MPL%202.0-brightgreen.svg)](https://opensource.org/licenses/MPL-2.0) ![Build](https://github.com/tobywf/xml_dataclasses/workflows/Build/badge.svg?branch=master&event=push)
3+
[![License: MPL 2.0](https://img.shields.io/badge/License-MPL%202.0-brightgreen.svg)](https://opensource.org/licenses/MPL-2.0) ![Build](https://github.com/tobywf/xml_dataclasses/workflows/Build/badge.svg?branch=master)
44

55
[XML dataclasses on PyPI](https://pypi.org/project/xml-dataclasses/)
66

7-
This library enables (de)serialising XML into Python dataclasses. XML dataclasses build on normal dataclasses from the standard library and [`lxml`](https://pypi.org/project/lxml/) elements. Loading and saving these elements is left to the consumer for flexibility of the desired output.
7+
This library maps XML to and from Python dataclasses. It builds on normal dataclasses from the standard library and uses [`lxml`](https://pypi.org/project/lxml/) for parsing/generating XML.
88

9-
It's currently in alpha. It isn't ready for production if you aren't willing to do your own evaluation/quality assurance. I don't recommend using this library with untrusted content. It inherits all of `lxml`'s flaws with regards to XML attacks, and recursively resolves data structures. Because deserialisation is driven from the dataclass definitions, it shouldn't be possible to execute arbitrary Python code (not a guarantee, see license). Denial of service attacks would very likely be feasible. One workaround may be to [use `lxml` to validate](https://lxml.de/validation.html) untrusted content with a strict schema.
9+
It's currently in alpha. It isn't ready for production if you aren't willing to do your own evaluation/quality assurance.
1010

1111
Requires Python 3.7 or higher.
1212

1313
## Features
1414

15-
* XML dataclasses are also dataclasses, and only require a single decorator to work (but see type hinting section for issues)
16-
* Convert XML documents to well-defined dataclasses, which should work with IDE auto-completion
15+
* Convert XML documents to well-defined dataclasses, which work with Mypy or IDE auto-completion
16+
* XML dataclasses are dataclasses
17+
* Full control of parsing and generating XML via `lxml`
1718
* Loading and dumping of attributes, child elements, and text content
18-
* Required and optional attributes and child elements
19+
* Required and optional attributes/child elements
1920
* Lists of child elements are supported, as are unions and lists of unions
2021
* Inheritance does work, but has the same limitations as dataclasses. Inheriting from base classes with required fields and declaring optional fields doesn't work due to field order. This isn't recommended
2122
* Namespace support is decent as long as correctly declared. I've tried on several real-world examples, although they were known to be valid. `lxml` does a great job at expanding namespace information when loading and simplifying it when saving
2223
* Post-load validation hook `xml_validate`
2324
* Fields not required in the constructor are ignored by this library (via `ignored()` or `init=False`)
2425

26+
## Limitations
27+
28+
* Whitespace and comments aren't supported in the data model. They must be stripped when loading the XML
29+
* So far, I haven't found any examples where XML can't be mapped to a dataclass, but it's likely possible given how complex XML is
30+
* Strict mapping. Currently, if an unknown element is encountered, an error is raised (see [#3](https://github.com/tobywf/xml_dataclasses/issues/3), pull requests welcome)
31+
* No typing/type conversions. Since XML is untyped, only string values are currently allowed. Type conversions are tricky to implement in a type-safe and extensible manner.
32+
* Dataclasses must be written by hand, no tools are provided to generate these from DTDs, XML schema definitions, or RELAX NG schemas
33+
34+
## Security
35+
36+
The caveats concerning untrusted content are roughly the same as with `lxml`, since that does the parsing. This is good, since `lxml`'s behaviour under XML attacks is well understood. This library recursively resolves data structures, which may have memory implications for unbounded payloads. Because loading is driven from the dataclass definitions, it shouldn't be possible to execute arbitrary Python code (not a guarantee, see license). If you must deal with untrusted content, a workaround is to [use `lxml` to validate](https://lxml.de/validation.html) untrusted content with a strict schema, which you may already be doing.
37+
2538
## Patterns
2639

2740
### Defining attributes
@@ -146,7 +159,7 @@ class Container(XmlDataclass):
146159

147160
if __name__ == "__main__":
148161
nsmap: NsMap = {None: CONTAINER_NS}
149-
# see Gotchas, stripping whitespace is highly recommended
162+
# see Gotchas, stripping whitespace and comments is highly recommended
150163
parser = etree.XMLParser(remove_blank_text=True, remove_comments=True)
151164
lxml_el_in = etree.parse("container.xml", parser).getroot()
152165
container = load(Container, lxml_el_in, "container")
@@ -186,26 +199,18 @@ parser = etree.XMLParser(remove_blank_text=True, remove_comments=True)
186199

187200
By default, `lxml` preserves whitespace. This can cause a problem when checking if elements have no text. The library does attempt to strip these; literally via Python's `strip()`. But `lxml` is likely faster and more robust.
188201

189-
Similarly, comments are included by default, and because deserialization is strict, they will be considered as nodes that the dataclass has not declared. It is recommended to omit them during parsing.
202+
Similarly, comments are included by default, and because loading is strict, they will be considered as nodes that the dataclass has not declared. It is recommended to omit them during parsing.
190203

191204
### Optional vs required
192205

193206
On dataclasses, optional fields also usually have a default value to be useful. But this isn't required; `Optional` is just a type hint to say `None` is allowed. This would occur e.g. if an element has no children.
194207

195-
For XML dataclasses, on loading/deserialisation, whether or not a field is required is determined by if it has a `default`/`default_factory` defined. If so, and it's missing, that default is used. Otherwise, an error is raised.
208+
For loading XML dataclasses, whether or not a field is required is determined by whether it has a `default`/`default_factory` defined. If so, and it's missing, that default is used. Otherwise, an error is raised.
196209

197-
For dumping/serialisation, the default isn't considered. Instead, if a value is marked as `Optional` and the value is `None`, it isn't written.
210+
For dumping, the default isn't considered. Instead, if a value is marked as `Optional` and the value is `None`, it isn't written.
198211

199212
This makes sense in many cases, but possibly not every case.
200213

201-
### Other limitations and Assumptions
202-
203-
Most of these limitations/assumptions are enforced. They may make this project unsuitable for your use-case.
204-
205-
* If you need to pass any parameters to the wrapped `@dataclass` decorator, apply it before the `@xml_dataclass` decorator
206-
* Deserialisation is strict; missing required attributes and child elements will cause an error. I want this to be the default behaviour, but it should be straightforward to add a parameter to `load` for lenient operation
207-
* Dataclasses must be written by hand, no tools are provided to generate these from, DTDs, XML schema definitions, or RELAX NG schemas
208-
209214
## Changelog
210215

211216
### [0.0.6] - 2020-03-25
@@ -244,13 +249,12 @@ Dependencies are managed via [poetry](https://python-poetry.org/). To install al
244249
poetry install
245250
```
246251

247-
This will also install development dependencies such as `black`, `isort`, `pylint`, `mypy`, and `pytest`. I've provided a simple script to run these during development called `lint`. You can either run it from a shell session with the poetry-installed virtual environment, or run as follows:
252+
This will also install development dependencies such as `black`, `isort`, `pylint`, `mypy`, and `pytest`. Pre-defined tasks make it easy to run these, for example
248253

249-
```
250-
poetry run ./lint
251-
```
254+
* `poetry run task lint` - this runs `black`, `isort`, `mypy`, and `pylint`
255+
* `poetry run task test` - this runs `pytest` with coverage
252256

253-
Auto-formatters will be applied, and static analysis/tests are run in order. The script stops on failure to allow quick iteration.
257+
For a full list of tasks, see `poetry run task --list`.
254258

255259
## License
256260

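Note on the README's "Optional vs required" rules in the diff above: the following is a minimal sketch of those semantics, using only standard-library dataclasses and a plain dict in place of an XML element. The names `Item`, `load_from_mapping`, and `dump_to_mapping` are hypothetical and are not part of `xml_dataclasses`. The dump helper also simplifies by skipping every `None` value rather than checking for an `Optional` annotation.

```python
from dataclasses import MISSING, dataclass, fields
from typing import Any, Dict, Optional


@dataclass
class Item:
    name: str                   # no default: required when loading
    note: Optional[str] = None  # has a default: may be absent when loading


def load_from_mapping(cls: Any, data: Dict[str, Any]) -> Any:
    """Hypothetical loader: a field is required only if it has no default."""
    kwargs: Dict[str, Any] = {}
    for f in fields(cls):
        if f.name in data:
            kwargs[f.name] = data[f.name]
        elif f.default is not MISSING:
            kwargs[f.name] = f.default
        elif f.default_factory is not MISSING:
            kwargs[f.name] = f.default_factory()
        else:
            raise ValueError(f"Required field {f.name!r} is missing")
    return cls(**kwargs)


def dump_to_mapping(obj: Any) -> Dict[str, Any]:
    """Hypothetical dumper: defaults are not consulted, None values are skipped."""
    result: Dict[str, Any] = {}
    for f in fields(obj):
        value = getattr(obj, f.name)
        if value is not None:
            result[f.name] = value
    return result
```

With these helpers, loading `{"name": "a"}` fills `note` from its default, dumping that object omits `note` again, and loading an empty mapping raises an error because `name` has no default.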
functional/container_test.py

Lines changed: 1 addition & 1 deletion
@@ -2,7 +2,7 @@
22
from pathlib import Path
33
from typing import List
44

5-
import pytest # type: ignore
5+
import pytest
66
from lxml import etree # type: ignore
77

88
from xml_dataclasses import (

lint

Lines changed: 0 additions & 27 deletions
This file was deleted.

pyproject.toml

Lines changed: 61 additions & 16 deletions
@@ -14,41 +14,86 @@ classifiers = [
1414
"Programming Language :: Python :: 3 :: Only",
1515
"Programming Language :: Python :: 3.7",
1616
"Programming Language :: Python :: 3.8",
17+
"Programming Language :: Python :: 3.9",
1718
"Topic :: Text Processing :: Markup :: XML",
1819
]
1920

2021
[tool.poetry.dependencies]
2122
python = "^3.7"
22-
lxml = "^4.5.0"
23+
lxml = "^4.6.3"
2324

2425
[tool.poetry.dev-dependencies]
25-
pytest = "^5.3.5"
26-
black = "^19.10b0"
27-
isort = "^4.3.21"
28-
pylint = "^2.4.4"
29-
pytest-cov = "^2.8.1"
30-
mypy = "^0.761"
31-
ipython = "^7.12.0"
26+
pytest = "^6.2.3"
27+
black = "^20.8b1"
28+
isort = "^5.8.0"
29+
pylint = "^2.7.4"
30+
pytest-cov = "^2.11.1"
31+
mypy = "^0.812"
32+
ipython = "^7.22.0"
3233
coverage = {extras = ["toml"], version = "^5.0.3"}
3334
pytest-random-order = "^1.0.4"
35+
taskipy = "^1.7.0"
3436

3537
[tool.isort]
36-
# see https://black.readthedocs.io/en/stable/the_black_code_style.html
38+
profile = "black"
3739
multi_line_output = 3
38-
include_trailing_comma = true
39-
force_grid_wrap = 0
40-
use_parentheses = true
41-
line_length = 88
42-
43-
indent = ' '
44-
combine_as_imports = true
4540

4641
[tool.coverage.run]
4742
branch = true
4843

4944
[tool.coverage.report]
5045
fail_under = 100
5146

47+
[tool.pytest.ini_options]
48+
testpaths = ["tests"]
49+
addopts = """
50+
--cov=xml_dataclasses \
51+
--cov-report term \
52+
--cov-report html \
53+
--random-order"""
54+
55+
[tool.pylint.master]
56+
extension-pkg-whitelist = "lxml"
57+
ignore = "CVS"
58+
persistent = true
59+
jobs = 1
60+
61+
[tool.pylint.message_control]
62+
63+
disable= """,
64+
bad-continuation,
65+
line-too-long,
66+
ungrouped-imports,
67+
wrong-import-position,
68+
missing-docstring,
69+
fixme,
70+
too-few-public-methods,
71+
"""
72+
73+
[tool.pylint.basic]
74+
good-names = "_,e,el,ex,f,tp,k,v,ns"
75+
76+
[tool.pylint.format]
77+
indent-string = " "
78+
79+
[tool.taskipy.tasks]
80+
isort = "isort -v ."
81+
black = "black ."
82+
mypy = "mypy --strict src/xml_dataclasses/ functional/container_test.py"
83+
pylint = "pylint src/xml_dataclasses/"
84+
85+
lint = "task isort && task black && task mypy && task pylint"
86+
87+
test = "pytest"
88+
functional = "pytest --no-cov functional/"
89+
90+
all = "task lint && task test && task functional"
91+
92+
isort_check = "isort --check-only ."
93+
black_check = "black --check ."
94+
95+
check = "task isort_check && task black_check && task mypy && task pylint && task test && task functional"
96+
5297
[build-system]
5398
requires = ["poetry>=1.0.0"]
5499
build-backend = "poetry.masonry.api"

src/xml_dataclasses/modifiers.py

Lines changed: 0 additions & 1 deletion
@@ -1,5 +1,4 @@
11
# pylint: disable=unsubscriptable-object
2-
# unsubscriptable-object clashes with type hints
32
from __future__ import annotations
43

54
from dataclasses import _MISSING_TYPE, MISSING, Field, field
