`attr()` and `child()` are no longer required; both are now inferred. Type resolution has been improved and I think it's more robust now, although incorrectly defined attributes will be harder to debug.
# WARNING: this is an incomplete implementation of an OPF container
- # (it's missing links)


if __name__ == "__main__":
@@ -62,10 +61,61 @@ if __name__ == "__main__":
* Convert XML documents to well-defined dataclasses, which should work with IDE auto-completion
* Loading and dumping of attributes, child elements, and text content
* Required and optional attributes and child elements
- * Lists of child elements are supported
+ * Lists of child elements are supported, as are unions and lists of unions
* Inheritance does work, but has the same limitations as dataclasses. Inheriting from base classes with required fields and declaring optional fields doesn't work due to field order. This isn't recommended
* Namespace support is decent as long as namespaces are correctly declared. I've tried it on several real-world examples, although they were known to be valid. `lxml` does a great job at expanding namespace information when loading and simplifying it when saving
- * Union child types are supported. When loading XML, they are attempted to be parsed in order
+
+ ## Patterns
+
+ ### Defining attributes
+
+ Attributes can be either `str` or `Optional[str]`; no other type is supported. Attributes can be renamed, or have their namespace changed, via the `rename` function. It can be used either on its own, or with an existing field definition.
+
+ I would like to add support for validation in the future, which might also make it easier to support other types. For now, you can work around this limitation with properties that do the conversion.
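
The property workaround mentioned above could look roughly like the following. This is a minimal sketch that uses a plain standard-library `dataclass` so it runs on its own; the class `Page` and the names `number` and `number_int` are hypothetical, and in real use the class would presumably be declared as an XML dataclass instead.

```python
from dataclasses import dataclass


@dataclass
class Page:
    # Stored exactly as it appears in the XML attribute, i.e. as a string.
    number: str

    @property
    def number_int(self) -> int:
        """Typed view of the string attribute (hypothetical helper)."""
        return int(self.number)

    @number_int.setter
    def number_int(self, value: int) -> None:
        # Convert back so the underlying attribute stays a `str`.
        self.number = str(value)


page = Page(number="41")
page.number_int += 1
print(page.number)  # the stored attribute is still a string
```

Because the property has no annotation, `dataclasses` does not treat it as a field, so only the string attribute would take part in loading and dumping.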
+
+ ### Defining text
+
+ Like attributes, text can be either `str` or `Optional[str]`. You must declare text content with the `text` function. Similar to `rename`, this function can use an existing field definition, or take the `default` argument. Text cannot be renamed or namespaced. Each class can have only one field defining text content. If a class has text content, it cannot have any children.
+
+ Children must ultimately be other XML dataclasses. However, they can also be `Optional`, `List`, and `Union` types:
+
+ * `Optional` must be at the top level. Valid: `Optional[List[XmlDataclass]]`. Invalid: `List[Optional[XmlDataclass]]`
+ * Next, `List` should be defined (if multiple child elements are allowed). Valid: `List[Union[XmlDataclass1, XmlDataclass2]]`. Invalid: `Union[List[XmlDataclass1], XmlDataclass2]`
+ * Finally, if `Optional` or `List` were used, a union type should be the innermost (again, if needed)
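
The nesting order described in the list above can be checked mechanically. The sketch below is only an illustration built on the standard `typing` helpers, not the library's actual type resolver; `unwrap_child_type` and the classes `A` and `B` are hypothetical names, with `A` and `B` standing in for XML dataclasses.

```python
from typing import List, Optional, Union, get_args, get_origin


class A: ...
class B: ...


def unwrap_child_type(tp):
    """Return (is_optional, is_list, member_types) or raise TypeError."""
    # 1. `Optional` (a union containing None) may only be outermost.
    is_optional = False
    if get_origin(tp) is Union and type(None) in get_args(tp):
        is_optional = True
        rest = tuple(a for a in get_args(tp) if a is not type(None))
        tp = rest[0] if len(rest) == 1 else Union[rest]

    # 2. `List` comes next, if multiple child elements are allowed.
    is_list = False
    if get_origin(tp) is list:
        is_list = True
        (tp,) = get_args(tp)

    # 3. A union, if any, must be innermost: every member is a plain class.
    members = get_args(tp) if get_origin(tp) is Union else (tp,)
    for member in members:
        if not isinstance(member, type) or member is type(None):
            raise TypeError(f"{member!r} is not a plain class; check the nesting order")
    return is_optional, is_list, members


print(unwrap_child_type(Optional[List[Union[A, B]]]))
```

With this sketch, `List[Optional[A]]` and `Union[List[A], B]` both raise `TypeError`, matching the invalid examples in the list.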
+
+ Children can be renamed via the `rename` function; however, attempting to set a namespace is invalid, since the namespace is provided by the child type's XML dataclass. Also, unions of XML dataclasses must have the same namespace (you can use different fields if they have different namespaces).
+
+ If a class has children, it cannot have text content.
By default, `lxml` preserves whitespace. This can cause problems when checking whether an element has no text. The library does attempt to strip such whitespace, literally via Python's `strip()`, but letting `lxml` do it is likely faster and more robust.
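
To illustrate the problem, here is a small sketch that uses the standard library's `xml.etree.ElementTree` rather than `lxml`, so it runs without extra dependencies; `has_text` is a hypothetical helper that mimics the `strip()` check, not the library's own code. With `lxml` itself, passing `remove_blank_text=True` to `lxml.etree.XMLParser` should discard such whitespace-only text at parse time.

```python
import xml.etree.ElementTree as ET


def has_text(element: ET.Element) -> bool:
    """True only if the element contains non-whitespace text."""
    return bool(element.text and element.text.strip())


# Pretty-printed XML: the indentation before <child/> ends up in parent.text.
parent = ET.fromstring("<parent>\n  <child/>\n</parent>")
print(repr(parent.text))  # whitespace only, but not None
print(has_text(parent))
print(has_text(ET.fromstring("<title> Hello </title>")))
```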

- ## Limitations and Assumptions
+ ### Optional vs required
+
+ On dataclasses, optional fields usually also have a default value to be useful. But this isn't required; `Optional` is just a type hint that says `None` is allowed.
+
+ For XML dataclasses, on loading/deserialisation, whether a field is required is determined by whether it has a `default`/`default_factory` defined. If it does and the value is missing from the XML, that default is used. If it doesn't, an error is raised.
+
+ For dumping/serialisation, the default isn't considered. Instead, if a field is marked as `Optional` and its value is `None`, it isn't written.
+
+ This makes sense in many cases, but possibly not every case.
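
The loading and dumping rules above can be sketched with plain dataclasses. This is an illustration under simplifying assumptions, not the library's implementation: `Item`, `load_attrs`, and `dump_attrs` are hypothetical names, a `dict` stands in for the XML attributes, and the dump step skips any `None` value without inspecting the type hint.

```python
from dataclasses import MISSING, dataclass, fields
from typing import Optional


@dataclass
class Item:
    id: str                     # no default: required on load
    note: Optional[str]         # Optional, but no default: still required on load
    lang: Optional[str] = None  # has a default: may be missing on load


def load_attrs(cls, raw: dict):
    """Build `cls` from `raw`, falling back to defaults for missing keys."""
    values = {}
    for f in fields(cls):
        if f.name in raw:
            values[f.name] = raw[f.name]
        elif f.default is not MISSING:
            values[f.name] = f.default
        elif f.default_factory is not MISSING:
            values[f.name] = f.default_factory()
        else:
            raise ValueError(f"missing required attribute {f.name!r}")
    return cls(**values)


def dump_attrs(obj) -> dict:
    """Write every field except those whose value is None."""
    return {
        f.name: getattr(obj, f.name)
        for f in fields(obj)
        if getattr(obj, f.name) is not None
    }


item = load_attrs(Item, {"id": "a1", "note": "hello"})
print(item)
print(dump_attrs(item))
```

Note the asymmetry: `note` may be `None` on an instance and is then omitted when dumping, yet the same document cannot be loaded back because `note` has no default.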
+
+ ### Other limitations and assumptions

Most of these limitations/assumptions are enforced. They may make this project unsuitable for your use-case.

- * All attributes are strings, no extra validation is performed. I would like to add support for validation in future, which might also make it easier to support other types
- * Elements can either have child elements or text content, not both
- * Child elements are other XML dataclasses
- * Text content is a string
* It isn't possible to pass any parameters to the wrapped `@dataclass` decorator
- * Some properties of dataclass `field`s are not exposed: `default_factory`, `repr`, `hash`, `init`, `compare`. For most, it is because I don't understand the implications fully or how that would be useful for XML. `default_factory` is hard only because of [the overloaded type signatures](https://github.com/python/typeshed/blob/master/stdlib/3.7/dataclasses.pyi), and getting that to work with `mypy`
+ * Setting the `init` parameter of a dataclass' `field` isn't supported and will lead to bad things happening
* Deserialisation is strict; missing required attributes and child elements will cause an error. I want this to be the default behaviour, but it should be straightforward to add a parameter to `load` for lenient operation
* Dataclasses must be written by hand; no tools are provided to generate these from DTDs, XML schema definitions, or RELAX NG schemas
- * Union types must have the same element/tag name and namespace. Otherwise, two different dataclass attributes (XML child fields) may be used