Merged
Changes from 10 commits
71 changes: 65 additions & 6 deletions docs/design/architecture.qmd
@@ -39,8 +39,6 @@ things consistent. However, we also introduce some new terms and
concepts specific to `check-datapackage`. The main objects and actions
used throughout the package can be found in the tables below.

### Objects

| Object | Description |
|----------------------------|--------------------------------------------|
| package | A Data Package that contains a collection of related data resources and descriptor(s). |
@@ -51,8 +49,6 @@ used throughout the package can be found in the tables below.

: Objects used throughout `check-datapackage`.

### Actions

| Action | Description |
|----------------------------|--------------------------------------------|
| check | Check that properties comply with the Data Package standard. |
@@ -61,6 +57,69 @@ used throughout the package can be found in the tables below.

: Actions that `check-datapackage` can perform.

### Why "check" (and not "validate" or "verify")?

If you have ever searched for tools that check something against a
specification, you'll often see the word "validate". You might also
notice that we don't use the word "validate" in our package and
documentation. This is intentional.

Although the word "validate" is ubiquitous in programming, it's often
used loosely and in ways that don't align with its actual meaning. Tools
that "validate" something often, in practice, *verify* that something
matches a defined expectation or specification. Both verbs are commonly
used during the quality control or assurance stages of building
products. Many websites and articles compare validation and
verification. For a good overview, see the Wikipedia articles on this
topic in
[general](https://en.wikipedia.org/wiki/Verification_and_validation) and
on
[software](https://en.wikipedia.org/wiki/Software_verification_and_validation)
specifically.

To "validate" is to assess the overlap between a human need and a
solution that meets that need. It answers the question "have we
selected the correct solution to the actual problem?" To "verify" is to
ensure that the solution is being built correctly and with high quality,
based on best practices, regulations, and specifications. It answers the
question "are we developing the solution correctly?" An easy way to tell
them apart is that you can often automate the verification process,
while the validation process often needs extensive manual evaluation,
judgment, and review.

Sometimes, these two overlap, but often they don't. For example, if you
are creating a specification that accurately describes something that
will solve a need you have, then you are likely both validating and
verifying the product that you will build. However, if someone else
takes your specification to help them build their own product and they
don't seek out extensive validation from the humans who will use that
product, they are now likely verifying their product, rather than
validating it.

Many generic software tools that do checks use the word "validate". For
example, the [`frictionless`](https://pypi.org/project/frictionless/)
package has a command called
["validate"](https://framework.frictionlessdata.io/docs/guides/validating-data.html).
Even widely used tools like
[Pydantic](https://docs.pydantic.dev/latest/) use the word "validate".
However, without knowing the human context, what they are most likely
doing is verifying that something matches a specification or
expectation. Most of these software tools that "validate" can't know if
the user is using them to validate something, since only human judgment
and review can answer that. We can, however, be sure that the tool is
verifying something.

Unfortunately, "verify" and "validate" are often used interchangeably,
which makes it difficult to distinguish between their meanings. This may
be due to the similarity in their spelling and pronunciation. For that
reason, we've decided to use neither of those words. Instead, we wanted
a more common word that reflects what we want this package to do while
also being generic enough to encompass different uses. So we went with
"check", since this package *checks* that the metadata is correct (based
on the specification).

## C4 Models

This section contains the [C4 Models](https://c4model.com/) for
@@ -78,8 +137,8 @@ Data Package standard.
`check-datapackage` receives the definitions of the Data Package
descriptor's structure---including properties that [must or
should](https://datapackage.org/standard/data-package/#language) be
included and their formats---from the Data Package standard (version 2).
The standard provides this information through versioned JSON Schema
profiles that define required properties and textual descriptions that
outline compliance.
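
To make the idea of checking a descriptor against required properties
more concrete, here is a minimal sketch in plain Python. It is not the
`check-datapackage` API: the function `missing_required()` and the two
constants are hypothetical names used only for illustration, and the
lists of required properties are a simplified stand-in for the
`required` keyword of a JSON Schema profile rather than the full
standard.

```python
from typing import Any

# Assumption: a simplified stand-in for the `required` keyword of a
# JSON Schema profile. The real profiles contain many more rules.
PACKAGE_REQUIRED = ["resources"]
RESOURCE_REQUIRED = ["name"]


def missing_required(descriptor: dict[str, Any]) -> list[str]:
    """Return the JSON paths of required properties that are absent.

    Hypothetical helper for illustration only.
    """
    # Package-level required properties.
    issues = [
        f"$.{prop}" for prop in PACKAGE_REQUIRED if prop not in descriptor
    ]
    # Required properties of each resource, if any resources exist.
    for index, resource in enumerate(descriptor.get("resources", [])):
        issues += [
            f"$.resources[{index}].{prop}"
            for prop in RESOURCE_REQUIRED
            if prop not in resource
        ]
    return issues


descriptor = {"name": "example", "resources": [{"path": "data.csv"}]}
print(missing_required(descriptor))
```

Because the rules are data rather than human judgment, a check like this
can run automatically, which is the sense in which this package *checks*
a descriptor.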
