Merged
Changes from 5 commits
66 changes: 64 additions & 2 deletions docs/design/architecture.qmd
@@ -39,6 +39,68 @@ things consistent. However, we also introduce some new terms and
concepts specific to `check-datapackage`. The main objects and actions
used throughout the package can be found in the tables below.

### "Check" (vs "validate")

If you have ever searched for tools that check something against a
specification, you have probably seen the word "validate". You might
also notice that we don't use the word "validate" in our package and
documentation. This is intentional.

Although the word is ubiquitous in programming, it's often used loosely
and in ways that deviate from its actual meaning. Tools that describe
themselves as "validating" something often, in practice, *verify* that
something matches a defined expectation or specification. Many websites
and articles discuss the difference between validating and verifying.
The two activities are often carried out during the quality control or
assurance stages of building products and are commonly abbreviated to
"V&V". For a good overview, see the Wikipedia articles on this topic in
[general](https://en.wikipedia.org/wiki/Verification_and_validation) and
on
[software](https://en.wikipedia.org/wiki/Software_verification_and_validation)
specifically.

To "validate" is to assess the overlap between a human need and a
product that solves that need. It answers the question "have we selected
the correct solution to the actual problem?" To "verify" is to
ensure that the solution is being built correctly and with high quality,
based on best practices, regulations, and specifications. It answers the
question "are we developing the solution correctly?" An easy way to tell
them apart is that you can often automate the verification process,
while the validation process often needs extensive manual evaluation,
judgment, and review.

Sometimes, these two overlap, but often they don't. For example, if you
are involved in creating a specification that accurately describes
something that will solve a need you have, then you are both validating
and verifying the product that you will build. However, if someone else
takes your specification to help them build their own product and they
don't seek out extensive validation from the humans who will use that
product, they are now likely verifying their product, rather than
validating it.

Many generic software tools that do checks use the word
"validate". For example, the
[`frictionless`](https://pypi.org/project/frictionless/) package
has a command called
["validate"](https://framework.frictionlessdata.io/docs/guides/validating-data.html).
Even widely used tools like
[Pydantic](https://docs.pydantic.dev/latest/) use the word "validate".
However, without knowing the human context, what they are most likely
doing is verifying that something matches some other specification or
expectation. Most of these software tools that "validate" can't know if
they are truly "validating" something since only human judgment and
review can answer that. We can, however, be sure that we're verifying or
checking something.

Unfortunately, "verify" and "validate" are often used interchangeably,
and their two meanings are difficult to tell apart. This may be
due to the similarity in their spelling and pronunciation. For that
reason, we decided to use neither word. Instead, we wanted to use a more
common word that accurately reflects what we want this package to do
while also being generic enough to encompass different uses. So we went
with "check", since we "check" that the metadata is correct (based on
the specification).

### Objects

| Object | Description |
@@ -78,8 +140,8 @@ Data Package standard.
`check-datapackage` receives the definitions of the Data Package
descriptor's structure---including properties that [must or
should](https://datapackage.org/standard/data-package/#language) be
-included and their formats---from the Data Package standard (version 2). The
-standard provides this information through versioned JSON Schema
+included and their formats---from the Data Package standard (version 2).
+The standard provides this information through versioned JSON Schema
profiles that define required properties and textual descriptions that
outline compliance.
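
To make the idea of checking a descriptor against a profile's required properties more concrete, here is a minimal sketch using only the Python standard library. It is not `check-datapackage`'s API: `PROFILE` and `check_required` are hypothetical names, and the profile is a heavily reduced stand-in that models only the top-level `required` keyword of a JSON Schema, not the full versioned profile published by the standard.

```python
import json

# Hypothetical, heavily reduced stand-in for a JSON Schema profile.
# Only the top-level `required` keyword is modelled here (assumption).
PROFILE = {"required": ["resources"]}


def check_required(descriptor: dict, profile: dict) -> list[str]:
    """Return one message per required property missing from the descriptor."""
    return [
        f"'{name}' is a required property"
        for name in profile.get("required", [])
        if name not in descriptor
    ]


# A descriptor that lacks the `resources` property.
descriptor = json.loads('{"name": "example-package"}')
issues = check_required(descriptor, PROFILE)
print(issues)
```

An empty list means every required property in this reduced profile is present. A real check would evaluate the complete JSON Schema profile and also cover the requirements that the standard only describes in text.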
