@@ -5,9 +5,10 @@ jupyter: python3
55order : 3
66---
77
8- You can pass a ` Config ` object to ` check() ` to customise the checks done
9- on your Data Package's properties. The following configuration options
10- are available:
8+ You can pass a [ ` Config ` ] ( /docs/reference/Config.qmd ) object to
9+ [ ` check() ` ] ( /docs/reference/check.qmd ) to customise the checks done on
10+ your Data Package's properties. The following configuration options are
11+ available:
1112
1213- ` version ` : The version of Data Package standard to check against.
1314 Defaults to ` v2 ` .
@@ -39,73 +40,82 @@ the `required` check by defining an `Exclusion` object with this `type`:
3940
4041``` {python}
4142from textwrap import dedent
43+ import pprint
4244import check_datapackage as cdp
4345
4446exclusion_required = cdp.Exclusion(type="required")
47+ exclusion_required
4548```
4649
4750To exclude checks of a specific field or fields, you can use a [ JSON
4851path] ( https://en.wikipedia.org/wiki/JSONPath ) in the ` jsonpath `
49- attribute of an ` Exclusion ` object. For example, you can exclude all
50- checks on the ` name ` field of the Data Package properties by writing:
52+ attribute of an [ ` Exclusion ` ] ( /docs/reference/Exclusion.qmd ) object. For
53+ example, you can exclude all checks on the ` name ` field of the Data
54+ Package properties by writing:
5155
5256``` {python}
5357exclusion_name = cdp.Exclusion(jsonpath="$.name")
58+ exclusion_name
5459```
5560
5661Or you can use the wildcard JSON path selector to exclude checks on the
5762` path ` field of ** all** Data Resource properties:
5863
5964``` {python}
6065exclusion_path = cdp.Exclusion(jsonpath="$.resources[*].path")
66+ exclusion_path
6167```
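
To make the wildcard concrete, here is a minimal standard-library sketch of how a path such as `$.resources[*].path` selects one location per Data Resource. The helper `expand_resource_wildcard` and the sample `properties` are hypothetical and exist only for illustration. This is not how `check_datapackage` resolves JSON paths internally, and it handles only this one pattern shape.

```python
def expand_resource_wildcard(properties: dict, field: str) -> list[str]:
    """List the concrete JSON path of `field` for each resource that has it."""
    resources = properties.get("resources", [])
    return [
        f"$.resources[{index}].{field}"
        for index, resource in enumerate(resources)
        if field in resource
    ]


# Made-up properties, only for this sketch.
properties = {
    "name": "woolly-dormice",
    "resources": [
        {"name": "first", "path": "data/first.parquet"},
        {"name": "second", "path": "data/second.parquet"},
        {"name": "third"},  # no `path`, so the wildcard selects nothing here
    ],
}

matched_paths = expand_resource_wildcard(properties, "path")
print(matched_paths)
```

Each entry in `matched_paths` is a location that a single wildcard pattern covers, which is why one `Exclusion` is enough for all resources.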
6268
63- The ` type ` and ` jsonpath ` arguments can also be combined:
69+ The ` type ` and ` jsonpath ` arguments can also be combined, so we can
70+ ignore an [ ` Issue ` ] ( /docs/reference/Issue.qmd ) of a specific type on a
71+ specific field. For example, to exclude checks of whether the ` created ` field
72+ is in a specific format (` type="format" ` ), we can use:
6473
6574``` {python}
66- exclusion_desc_required = cdp.Exclusion(type="required", jsonpath="$.resources[*].description")
75+ exclusion_created_format = cdp.Exclusion(type="format", jsonpath="$.created")
76+ exclusion_created_format
6777```
6878
69- This will exclude required checks on the ` description ` field of Data
70- Resource properties.
71-
7279To apply your exclusions when running ` check() ` , you add them to the
73- ` Config ` object passed to the ` check() ` function:
80+ ` Config ` object passed to the ` check() ` function. First, let's make an
81+ example that has three ` Issue ` items: the package ` name ` is a number,
82+ the ` created ` field is not a date, and the resource ` path ` doesn't point
83+ to a data file (isn't a real path). So we'll modify our example
84+ ` package_properties ` from
85+ [ ` example_package_properties() ` ] ( /docs/reference/example_package_properties.qmd )
86+ to make these Issues appear:
7487
7588``` {python}
76- package_properties = {
77- "name": 123,
78- "title": "Hibernation Physiology of the Woolly Dormouse: A Scoping Review.",
79- "id": "123-abc-123",
80- "created": "2014-05-14T05:00:01+00:00",
81- "version": "1.0.0",
82- "licenses": [{"name": "odc-pddl"}],
83- "resources": [
84- {
85- "name": "woolly-dormice-2015",
86- "title": "Body fat percentage in the hibernating woolly dormouse",
87- "path": "https://en.wikipedia.org/wiki/Woolly_dormouse",
88- }
89- ],
90- }
91-
92- config = cdp.Config(exclusions=[exclusion_required, exclusion_name, exclusion_path])
93- cdp.check(properties=package_properties, config=config)
89+ package_properties = cdp.example_package_properties()
90+ package_properties["name"] = 123
91+ package_properties["created"] = "not-a-date"
92+ package_properties["resources"][0]["path"] = "\\not/a/path"
93+ pprint.pp(package_properties)
9494```
9595
96- In the example above, we would expect four ` Issue ` items: the package
97- ` name ` is a number, the required ` description ` field is missing in both
98- the package and resource properties, and the resource ` path ` doesn't
99- point to a data file. However, as we have defined exclusions for all of
100- these, the function will flag no issues.
96+ When we run ` check() ` on these properties, we get the three expected issues:
97+
98+ ``` {python}
99+ cdp.check(properties=package_properties)
100+ ```
101+
102+ Now let's exclude these ` Issue ` s by adding our exclusions to a ` Config `
103+ object and giving it to ` check() ` , so that ` check() ` finds no issues:
104+
105+ ``` {python}
106+ config = cdp.Config(exclusions=[exclusion_name, exclusion_path, exclusion_created_format])
107+ cdp.check(properties=package_properties, config=config)
108+ ```
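
One way to picture the filtering is the standard-library sketch below. It assumes an exclusion applies only when every attribute it sets (`type`, `jsonpath`, or both) matches the issue. The classes `SketchIssue` and `SketchExclusion`, the helper functions, and the issue type names are all hypothetical stand-ins, not the real `check_datapackage` classes or its actual implementation.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SketchIssue:
    """Hypothetical stand-in for an issue: where it occurred and its type."""
    jsonpath: str
    type: str


@dataclass(frozen=True)
class SketchExclusion:
    """Hypothetical stand-in for an exclusion; None means 'match anything'."""
    jsonpath: Optional[str] = None
    type: Optional[str] = None


def path_matches(pattern: str, jsonpath: str) -> bool:
    """Compare paths, letting `[*]` in the pattern stand for any array index."""
    regex = re.escape(pattern).replace(re.escape("[*]"), r"\[\d+\]")
    return re.fullmatch(regex, jsonpath) is not None


def is_excluded(issue: SketchIssue, exclusion: SketchExclusion) -> bool:
    """An exclusion applies only if every attribute it sets matches the issue."""
    type_ok = exclusion.type is None or exclusion.type == issue.type
    path_ok = exclusion.jsonpath is None or path_matches(
        exclusion.jsonpath, issue.jsonpath
    )
    return type_ok and path_ok


# Illustrative issues; the type names are assumptions made for this sketch.
issues = [
    SketchIssue(jsonpath="$.name", type="type"),
    SketchIssue(jsonpath="$.created", type="format"),
    SketchIssue(jsonpath="$.resources[0].path", type="pattern"),
]
exclusions = [
    SketchExclusion(jsonpath="$.name"),
    SketchExclusion(jsonpath="$.resources[*].path"),
    SketchExclusion(type="format", jsonpath="$.created"),
]

remaining = [
    issue
    for issue in issues
    if not any(is_excluded(issue, exclusion) for exclusion in exclusions)
]
print(remaining)
```

Under this assumption, an exclusion that combines `type` and `jsonpath` is narrower than one that sets either attribute alone: it leaves other issue types on the same field untouched.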
101109
102110## Adding extensions
103111
104112It is possible to add checks in addition to the ones defined in the Data
105113Package standard. We call these additional checks * extensions* . There
106- are currently two types of extensions supported: ` CustomCheck ` and
107- ` RequiredCheck ` . You can add as many ` CustomCheck ` s and ` RequiredCheck ` s
108- to your ` Config ` as you want to fit your needs.
114+ are currently two types of extensions supported:
115+ [ ` CustomCheck ` ] ( /docs/reference/CustomCheck.qmd ) and
116+ [ ` RequiredCheck ` ] ( /docs/reference/RequiredCheck.qmd ) . You can add as
117+ many ` CustomCheck ` s and ` RequiredCheck ` s to your ` Config ` as you want to
118+ fit your needs.
109119
110120### Custom checks
111121
@@ -124,39 +134,16 @@ license_check = cdp.CustomCheck(
124134)
125135```
126136
127- For more details on what each parameter means, see the
128- [ ` CustomCheck ` ] ( /docs/reference/custom_check.qmd ) documentation.
129- Specific to this example, the ` type ` is setting the identifier of the
130- check to ` only-mit ` and the ` jsonpath ` is indicating to only check the
131- ` name ` property of each license in the ` licenses ` property of the Data
132- Package.
137+ For more details on what each parameter means, see the ` CustomCheck `
138+ documentation. Specific to this example, the ` type ` is setting the
139+ identifier of the check to ` only-mit ` and the ` jsonpath ` is indicating
140+ to only check the ` name ` property of each license in the ` licenses `
141+ property of the Data Package.
133142
134143To register your custom checks with the ` check() ` function, you add them
135144to the ` Config ` object passed to the function:
136145
137146``` {python}
138- #| eval: false
139- package_properties = {
140- "name": "woolly-dormice",
141- "title": "Hibernation Physiology of the Woolly Dormouse: A Scoping Review.",
142- "description": dedent("""
143- This scoping review explores the hibernation physiology of the
144- woolly dormouse, drawing on data collected over a 10-year period
145- along the Taurus Mountain range in Turkey.
146- """),
147- "id": "123-abc-123",
148- "created": "2014-05-14T05:00:01+00:00",
149- "version": "1.0.0",
150- "licenses": [{"name": "odc-pddl"}, {"name": "mit"}],
151- "resources": [
152- {
153- "name": "woolly-dormice-2015",
154- "title": "Body fat percentage in the hibernating woolly dormouse",
155- "path": "resources/woolly-dormice-2015/data.parquet",
156- }
157- ],
158- }
159-
160147config = cdp.Config(extensions=cdp.Extensions(custom_checks=[license_check]))
161148cdp.check(properties=package_properties, config=config)
162149```
@@ -173,7 +160,6 @@ with a `RequiredCheck`. For example, if you want to make the
173160` RequiredCheck ` like this:
174161
175162``` {python}
176- #| eval: false
177163description_required = cdp.RequiredCheck(
178164 jsonpath="$.description",
179165 message="The 'description' field is required in the Data Package properties.",
@@ -184,10 +170,13 @@ See the [`RequiredCheck`](/docs/reference/required_check.qmd)
184170documentation for more details on its parameters.
185171
186172To apply this ` RequiredCheck ` , it should be added to the ` Config ` object
187- passed to ` check() ` like shown below:
173+ passed to ` check() ` as shown below. We'll create a
174+ ` package_properties ` without a ` description ` field to see the effect of
175+ this check:
188176
189177``` {python}
190- #| eval: false
178+ package_properties = cdp.example_package_properties()
179+ del package_properties["description"]
191180config = cdp.Config(extensions=cdp.Extensions(required_checks=[description_required]))
192181cdp.check(properties=package_properties, config=config)
193182```
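
Conceptually, a required check asks whether the location named by the JSON path exists in the properties. The standard-library sketch below shows that idea for top-level fields only. The function `check_required_field` is hypothetical and is not part of `check_datapackage`, whose `RequiredCheck` may resolve paths differently.

```python
from typing import Optional


def check_required_field(
    properties: dict, jsonpath: str, message: str
) -> Optional[str]:
    """Return `message` if the top-level field named by `jsonpath` is absent."""
    prefix = "$."
    field = jsonpath[len(prefix):]
    # This sketch deliberately supports only simple paths such as "$.description".
    if not jsonpath.startswith(prefix) or not field or "." in field or "[" in field:
        raise ValueError("This sketch only handles paths of the form '$.field'.")
    return None if field in properties else message


message = "The 'description' field is required in the Data Package properties."

# Made-up properties, only for this sketch.
with_description = {"name": "woolly-dormice", "description": "A scoping review."}
without_description = {"name": "woolly-dormice"}

result_present = check_required_field(with_description, "$.description", message)
result_missing = check_required_field(without_description, "$.description", message)
print(result_present)
print(result_missing)
```

Returning the message only when the field is absent mirrors the behaviour described above: the check stays silent on compliant properties and reports the configured message otherwise.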
@@ -196,34 +185,16 @@ cdp.check(properties=package_properties, config=config)
196185
197186The Data Package standard includes properties that "MUST" and "SHOULD"
198187be included and/or have a specific format in a compliant Data Package.
199- By default, ` check() ` only the ` check() ` function only includes "MUST"
200- checks. To include "SHOULD" checks, set the ` strict ` argument to ` True ` .
188+ By default, ` check() ` only includes "MUST"
189+ checks. To include "SHOULD" checks, set the ` strict ` argument to ` True `
190+ in the ` Config ` object.
191+
201192For example, the ` name ` field of a Data Package "SHOULD" not contain
202193special characters. So running ` check() ` in strict mode (` strict=True ` )
203- on the following properties would output an issue.
194+ on the following properties would output an ` Issue ` :
204195
205196``` {python}
206- #| eval: false
207- package_properties = {
208- "name": "Woolly Dormice (Toros Dağları)",
209- "title": "Hibernation Physiology of the Woolly Dormouse: A Scoping Review.",
210- "description": dedent("""
211- This scoping review explores the hibernation physiology of the
212- woolly dormouse, drawing on data collected over a 10-year period
213- along the Taurus Mountain range in Turkey.
214- """),
215- "id": "123-abc-123",
216- "created": "2014-05-14T05:00:01+00:00",
217- "version": "1.0.0",
218- "licenses": [{"name": "odc-pddl"}],
219- "resources": [
220- {
221- "name": "woolly-dormice-2015",
222- "title": "Body fat percentage in the hibernating woolly dormouse",
223- "path": "resources/woolly-dormice-2015/data.parquet",
224- }
225- ],
226- }
227-
228- cdp.check(properties=package_properties, strict=True)
197+ package_properties = cdp.example_package_properties()
198+ package_properties["name"] = "data-package!@#"
199+ cdp.check(properties=package_properties, config=cdp.Config(strict=True))
229200```