Commit 159a9ab

lwjohnst86, martonvago, and signekb authored
docs: 📝 revise guide docs to use example_* (plus minor edits) (#195)
# Description

Some minor edits after converting the guide docs to use `example_package_properties()`

Closes #164

Needs an in-depth review.

## Checklist

- [x] Formatted Markdown
- [x] Ran `just run-all`

---------

Co-authored-by: martonvago <57952344+martonvago@users.noreply.github.com>
Co-authored-by: Signe Kirk Brødbæk <signebroedbaek@gmail.com>
1 parent fc67bf8 commit 159a9ab

2 files changed: 86 additions and 126 deletions

docs/guide/check.qmd

Lines changed: 21 additions & 32 deletions
@@ -11,12 +11,12 @@ metadata---stored in its `datapackage.json` file---complies with the
 the available properties at each level of the `datapackage.json`, which
 ones are required, and what values are allowed.

-This guide shows you how to use the main function `check()` to run these
-checks. Each section walks you through a different part of `check()`,
-starting with its basic usage with the `properties` argument,
-introducing the default checks, how to configure which checks you want
-to run with the `config` argument, and how to handle failed checks with
-the `error` argument.
+This guide shows you how to use the main function
+[`check()`](/docs/reference/check.qmd) to run these checks. Each section
+walks you through a different part of `check()`, starting with its basic
+usage with the `properties` argument, introducing the default checks,
+how to configure which checks you want to run with the `config`
+argument, and how to handle failed checks with the `error` argument.

 ::: callout-tip
 For the full reference of the `check()` function, see the [reference
@@ -43,33 +43,21 @@ section below.

 Let's look at an example. The code below defines a `package_properties`
 dictionary that includes all the required properties in a correct
-format. When we call `check()` on these properties, it returns an empty
-list:
+format. The example looks like this (from the
+[`example_package_properties()`](/docs/reference/example_package_properties.qmd):

 ```{python}
 import check_datapackage as cdp
+import pprint

-package_properties = {
-    "name": "woolly-dormice",
-    "title": "Hibernation Physiology of the Woolly Dormouse: A Scoping Review.",
-    "description": """
-        This scoping review explores the hibernation physiology of the
-        woolly dormouse, drawing on data collected over a 10-year period
-        along the Taurus Mountain range in Turkey.
-        """,
-    "id": "123-abc-123",
-    "created": "2014-05-14T05:00:01+00:00",
-    "version": "1.0.0",
-    "licenses": [{"name": "odc-pddl"}],
-    "resources": [
-        {
-            "name": "woolly-dormice-2015",
-            "title": "Body fat percentage in the hibernating woolly dormouse",
-            "path": "resources/woolly-dormice-2015/data.parquet",
-        }
-    ],
-}

+package_properties = cdp.example_package_properties()
+pprint.pp(package_properties)
+```
+
+When we call `check()` on these properties, it returns an empty list:
+
+```{python}
 cdp.check(properties=package_properties)
 ```

@@ -82,16 +70,17 @@ package_properties["name"] = 123
 cdp.check(properties=package_properties)
 ```

-The output now lists two issues: one for the missing `description` field
-and one for the `name` field of the wrong type.
+The output now lists one `Issue` for the `name` field being of the
+wrong type.

 ## Default checks and configuration (`config`)

 By default, `check()` runs the standard checks defined as `MUST`s in the
 Data Package standard. These include checking that all required
 properties are present and that their values have the correct types and
-formats. This happens through a default `Config` object passed to the
-`config` argument of `check()`.
+formats. This happens through a default
+[`Config`](/docs/reference/Config.qmd) object passed to the `config`
+argument of `check()`.

 If you want to configure which checks are performed, you can provide
 your own `Config` object in `check()`. With this object you can exclude
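
The guide text in this diff says that `check()` returns an empty list when every check passes and a list of `Issue` items otherwise. The sketch below illustrates that contract with a small, library-free stand-in. It does not call `check_datapackage`; `ToyIssue`, `toy_check`, `EXPECTED_TYPES`, and `REQUIRED` are hypothetical names, and the reduced rule set is an assumption made for illustration, not taken from the Data Package standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyIssue:
    """Hypothetical stand-in for an issue found by a check."""

    jsonpath: str
    type: str
    message: str


# Assumed, heavily reduced rules for illustration only.
EXPECTED_TYPES = {"name": str, "resources": list}
REQUIRED = {"resources"}


def toy_check(properties: dict) -> list[ToyIssue]:
    """Return one ToyIssue per violated rule; an empty list means no issues."""
    issues = []
    for field, expected in EXPECTED_TYPES.items():
        path = f"$.{field}"
        if field not in properties:
            # A missing property is only an issue if it is required.
            if field in REQUIRED:
                issues.append(ToyIssue(path, "required", f"'{field}' is required"))
        elif not isinstance(properties[field], expected):
            issues.append(
                ToyIssue(path, "type", f"'{field}' must be of type {expected.__name__}")
            )
    return issues


properties = {"name": "woolly-dormice", "resources": [{"name": "woolly-dormice-2015"}]}
print(toy_check(properties))  # no rule is violated

properties["name"] = 123
print(toy_check(properties))  # the wrongly typed `name` is reported
```

Returning a plain list keeps the "no issues" case easy to test (`if not issues:`), which mirrors how the guide presents the empty-list result.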

docs/guide/config.qmd

Lines changed: 65 additions & 94 deletions
@@ -5,9 +5,10 @@ jupyter: python3
 order: 3
 ---

-You can pass a `Config` object to `check()` to customise the checks done
-on your Data Package's properties. The following configuration options
-are available:
+You can pass a [`Config`](/docs/reference/Config.qmd) object to
+[`check()`](/docs/reference/check.qmd) to customise the checks done on
+your Data Package's properties. The following configuration options are
+available:

 - `version`: The version of Data Package standard to check against.
 Defaults to `v2`.
@@ -39,73 +40,82 @@ the `required` check by defining an `Exclusion` object with this `type`:

 ```{python}
 from textwrap import dedent
+import pprint
 import check_datapackage as cdp

 exclusion_required = cdp.Exclusion(type="required")
+exclusion_required
 ```

 To exclude checks of a specific field or fields, you can use a [JSON
 path](https://en.wikipedia.org/wiki/JSONPath) in the `jsonpath`
-attribute of an `Exclusion` object. For example, you can exclude all
-checks on the `name` field of the Data Package properties by writing:
+attribute of an [`Exclusion`](/docs/reference/Exclusion.qmd) object. For
+example, you can exclude all checks on the `name` field of the Data
+Package properties by writing:

 ```{python}
 exclusion_name = cdp.Exclusion(jsonpath="$.name")
+exclusion_name
 ```

 Or you can use the wildcard JSON path selector to exclude checks on the
 `path` field of **all** Data Resource properties:

 ```{python}
 exclusion_path = cdp.Exclusion(jsonpath="$.resources[*].path")
+exclusion_path
 ```

-The `type` and `jsonpath` arguments can also be combined:
+The `type` and `jsonpath` arguments can also be combined, so we can
+ignore an [`Issue`](/docs/reference/Issue.qmd) of a specific type on a
+specific field. For example, to exclude checks of whether the `created` field
+is in a specific format (`type="format"`), we can use:

 ```{python}
-exclusion_desc_required = cdp.Exclusion(type="required", jsonpath="$.resources[*].description")
+exclusion_created_format = cdp.Exclusion(type="format", jsonpath="$.created")
+exclusion_created_format
 ```

-This will exclude required checks on the `description` field of Data
-Resource properties.
-
 To apply your exclusions when running the `check()`, you add them to the
-`Config` object passed to the `check()` function:
+`Config` object passed to the `check()` function. First, let's make an
+example that has three `Issue` items: the package `name` is a number,
+the `created` field is not a date, and the resource `path` doesn't point
+to a data file (isn't a real path). So we'll modify our example
+`package_properties` from
+[`example_package_properties()`](/docs/reference/example_package_properties.qmd)
+to make these Issues appear:

 ```{python}
-package_properties = {
-    "name": 123,
-    "title": "Hibernation Physiology of the Woolly Dormouse: A Scoping Review.",
-    "id": "123-abc-123",
-    "created": "2014-05-14T05:00:01+00:00",
-    "version": "1.0.0",
-    "licenses": [{"name": "odc-pddl"}],
-    "resources": [
-        {
-            "name": "woolly-dormice-2015",
-            "title": "Body fat percentage in the hibernating woolly dormouse",
-            "path": "https://en.wikipedia.org/wiki/Woolly_dormouse",
-        }
-    ],
-}
-
-config = cdp.Config(exclusions=[exclusion_required, exclusion_name, exclusion_path])
-cdp.check(properties=package_properties, config=config)
+package_properties = cdp.example_package_properties()
+package_properties["name"] = 123
+package_properties["created"] = "not-a-date"
+package_properties["resources"][0]["path"] = "\\not/a/path"
+pprint.pp(package_properties)
 ```

-In the example above, we would expect four `Issue` items: the package
-`name` is a number, the required `description` field is missing in both
-the package and resource properties, and the resource `path` doesn't
-point to a data file. However, as we have defined exclusions for all of
-these, the function will flag no issues.
+When we run `check()` on these properties, we get the three expected issues:
+
+```{python}
+cdp.check(properties=package_properties)
+```
+
+Now let's exclude these `Issue`s so that `check()` finds no issues by
+adding our exclusions to a `Config` object and giving it to `check()`:
+
+```{python}
+config = cdp.Config(exclusions=[exclusion_name, exclusion_path, exclusion_created_format])
+cdp.check(properties=package_properties, config=config)
+```

 ## Adding extensions

 It is possible to add checks in addition to the ones defined in the Data
 Package standard. We call these additional checks *extensions*. There
-are currently two types of extensions supported: `CustomCheck` and
-`RequiredCheck`. You can add as many `CustomCheck`s and `RequiredCheck`s
-to your `Config` as you want to fit your needs.
+are currently two types of extensions supported:
+[`CustomCheck`](/docs/reference/CustomCheck.qmd) and
+[`RequiredCheck`](/docs/reference/RequiredCheck.qmd). You can add as
+many `CustomCheck`s and `RequiredCheck`s to your `Config` as you want to
+fit your needs.

 ### Custom checks

@@ -124,39 +134,16 @@ license_check = cdp.CustomCheck(
 )
 ```

-For more details on what each parameter means, see the
-[`CustomCheck`](/docs/reference/custom_check.qmd) documentation.
-Specific to this example, the `type` is setting the identifier of the
-check to `only-mit` and the `jsonpath` is indicating to only check the
-`name` property of each license in the `licenses` property of the Data
-Package.
+For more details on what each parameter means, see the `CustomCheck`
+documentation. Specific to this example, the `type` is setting the
+identifier of the check to `only-mit` and the `jsonpath` is indicating
+to only check the `name` property of each license in the `licenses`
+property of the Data Package.

 To register your custom checks with the `check()` function, you add them
 to the `Config` object passed to the function:

 ```{python}
-#| eval: false
-package_properties = {
-    "name": "woolly-dormice",
-    "title": "Hibernation Physiology of the Woolly Dormouse: A Scoping Review.",
-    "description": dedent("""
-        This scoping review explores the hibernation physiology of the
-        woolly dormouse, drawing on data collected over a 10-year period
-        along the Taurus Mountain range in Turkey.
-        """),
-    "id": "123-abc-123",
-    "created": "2014-05-14T05:00:01+00:00",
-    "version": "1.0.0",
-    "licenses": [{"name": "odc-pddl"}, {"name": "mit"}],
-    "resources": [
-        {
-            "name": "woolly-dormice-2015",
-            "title": "Body fat percentage in the hibernating woolly dormouse",
-            "path": "resources/woolly-dormice-2015/data.parquet",
-        }
-    ],
-}
-
 config = cdp.Config(extensions=cdp.Extensions(custom_checks=[license_check]))
 cdp.check(properties=package_properties, config=config)
 ```
@@ -173,7 +160,6 @@ with a `RequiredCheck`. For example, if you want to make the
 `RequiredCheck` like this:

 ```{python}
-#| eval: false
 description_required = cdp.RequiredCheck(
     jsonpath="$.description",
     message="The 'description' field is required in the Data Package properties.",
@@ -184,10 +170,13 @@ See the [`RequiredCheck`](/docs/reference/required_check.qmd)
 documentation for more details on its parameters.

 To apply this `RequiredCheck`, it should be added to the `Config` object
-passed to `check()` like shown below:
+passed to `check()` like shown below. We'll create a
+`package_properties` without a `description` field to see the effect of
+this check:

 ```{python}
-#| eval: false
+package_properties = cdp.example_package_properties()
+del package_properties["description"]
 config = cdp.Config(extensions=cdp.Extensions(required_checks=[description_required]))
 cdp.check(properties=package_properties, config=config)
 ```
@@ -196,34 +185,16 @@ cdp.check(properties=package_properties, config=config)

 The Data Package standard includes properties that "MUST" and "SHOULD"
 be included and/or have a specific format in a compliant Data Package.
-By default, `check()` only the `check()` function only includes "MUST"
-checks. To include "SHOULD" checks, set the `strict` argument to `True`.
+By default, `check()` only includes "MUST"
+checks. To include "SHOULD" checks, set the `strict` argument to `True`
+in the `Config` object.
+
 For example, the `name` field of a Data Package "SHOULD" not contain
 special characters. So running `check()` in strict mode (`strict=True`)
-on the following properties would output an issue.
+on the following properties would output an `Issue`:

 ```{python}
-#| eval: false
-package_properties = {
-    "name": "Woolly Dormice (Toros Dağları)",
-    "title": "Hibernation Physiology of the Woolly Dormouse: A Scoping Review.",
-    "description": dedent("""
-        This scoping review explores the hibernation physiology of the
-        woolly dormouse, drawing on data collected over a 10-year period
-        along the Taurus Mountain range in Turkey.
-        """),
-    "id": "123-abc-123",
-    "created": "2014-05-14T05:00:01+00:00",
-    "version": "1.0.0",
-    "licenses": [{"name": "odc-pddl"}],
-    "resources": [
-        {
-            "name": "woolly-dormice-2015",
-            "title": "Body fat percentage in the hibernating woolly dormouse",
-            "path": "resources/woolly-dormice-2015/data.parquet",
-        }
-    ],
-}
-
-cdp.check(properties=package_properties, strict=True)
+package_properties = cdp.example_package_properties()
+package_properties["name"] = "data-package!@#"
+cdp.check(properties=package_properties, config=cdp.Config(strict=True))
 ```
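
The revised `config.qmd` text describes exclusions that match an `Issue` by `type`, by `jsonpath` (including the `[*]` wildcard), or by both. The sketch below is one plausible way to model that matching with the standard library only. It is not the library's implementation: `ToyIssue`, `ToyExclusion`, `path_matches`, and `apply_exclusions` are hypothetical names, and apart from `"format"` the issue type strings are made up for the example. The wildcard handling is assumed to cover only `[*]` over list indices, not full JSONPath.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ToyIssue:
    """Hypothetical stand-in for an issue: where it occurred and what kind it is."""

    jsonpath: str
    type: str


@dataclass(frozen=True)
class ToyExclusion:
    """Hypothetical stand-in for an exclusion; a None field matches anything."""

    type: Optional[str] = None
    jsonpath: Optional[str] = None


def path_matches(pattern: str, path: str) -> bool:
    """Match a concrete path against a pattern where `[*]` stands for any index."""
    regex = re.escape(pattern).replace(re.escape("[*]"), r"\[\d+\]")
    return re.fullmatch(regex, path) is not None


def is_excluded(issue: ToyIssue, exclusion: ToyExclusion) -> bool:
    """An exclusion applies only if every field it sets matches the issue."""
    type_ok = exclusion.type is None or exclusion.type == issue.type
    path_ok = exclusion.jsonpath is None or path_matches(exclusion.jsonpath, issue.jsonpath)
    return type_ok and path_ok


def apply_exclusions(issues: list[ToyIssue], exclusions: list[ToyExclusion]) -> list[ToyIssue]:
    """Keep only the issues that no exclusion matches."""
    return [i for i in issues if not any(is_excluded(i, e) for e in exclusions)]


# Three issues loosely modelled on the guide's example (type names are illustrative).
issues = [
    ToyIssue("$.name", "type"),
    ToyIssue("$.created", "format"),
    ToyIssue("$.resources[0].path", "pattern"),
]
exclusions = [
    ToyExclusion(jsonpath="$.name"),
    ToyExclusion(jsonpath="$.resources[*].path"),
    ToyExclusion(type="format", jsonpath="$.created"),
]
print(apply_exclusions(issues, exclusions))
```

Treating an unset field as "match anything" is what lets a single exclusion be as broad (`type` only) or as narrow (`type` plus `jsonpath`) as needed.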
