Commit b2bca19

martonvago, lwjohnst86, and pre-commit-ci[bot] authored
feat: ✨ add template README in docs/ (#136)
# Description

This PR adds a template README for developing the Data Package.

Closes #100

This PR needs an in-depth review.

## Checklist

- [x] Formatted Markdown
- [x] Ran `just run-all`

---------

Co-authored-by: Luke W. Johnston <lwjohnst86@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
1 parent a9fe518 commit b2bca19

2 files changed: +153 -16 lines changed

template/README.md.jinja

Lines changed: 1 addition & 16 deletions
@@ -1,16 +1 @@
1-
2-
## Post-setup steps
3-
4-
- Run `just list-todos` to get a list of TODO items you need to fill out.
5-
6-
## Versioning and changelog
7-
8-
This project uses
9-
[Commitizen](https://commitizen-tools.github.io/commitizen/) to update
10-
versions and generate changelogs. Based on the [Conventional
11-
Commits](https://www.conventionalcommits.org/en/v1.0.0/) message, it
12-
will automatically update the version in both `pyproject.toml` and
13-
`datapackage.json`. The [Data Package](https://datapackage.org/)
14-
standard suggests using their version of [Semantic
15-
Versioning](https://datapackage.org/recipes/data-package-version/). So
16-
follow these conventions when making commits to this repository.
1+
<!-- This file is overwritten whenever `main.py` is run. -->

template/docs/README.md

Lines changed: 152 additions & 0 deletions
@@ -0,0 +1,152 @@
1+
# A Data Package built with Seedcase packages
2+
3+
This [Data Package](https://datapackage.org/) was generated from the
4+
[`template-data-package`](https://github.com/seedcase-project/template-data-package)
5+
Seedcase template.
6+
7+
## Project files and folders
8+
9+
- `docs/`: Documentation about using and developing the Data Package,
10+
including this README file.
11+
- `scripts/`: Python scripts for creating and managing the Data
12+
Package. Files describing the data will be generated here.
13+
- `.copier-answers.yml`: Contains the answers you gave when copying
14+
the project from the template. **You should not modify this file
15+
directly.**
16+
- `.cz.toml`:
17+
[Commitizen](https://commitizen-tools.github.io/commitizen/)
18+
configuration file for managing versions and changelogs.
19+
- `.pre-commit-config.yaml`: [Pre-commit](https://pre-commit.com/)
20+
configuration file for managing and running checks before each
21+
commit.
22+
- `.typos.toml`: [typos](https://github.com/crate-ci/typos) spell
23+
checker configuration file.
24+
- `CITATION.cff`: Structured citation metadata for your project.
25+
- `justfile`: [`just`](https://just.systems/man/en/) configuration
26+
file for scripting project tasks.
27+
- `main.py`: Central script file for the Data Package. This is where
28+
helper scripts are invoked and work together to create and manage
29+
the Data Package.
30+
- `pyproject.toml`: Main Python project configuration file defining
31+
metadata and dependencies.
32+
- `README.md`: Autogenerated description of the Data Package. Not a
33+
development guide. Information on using and developing the project
34+
should be included in the `docs/` folder.
35+
- `ruff.toml`: [Ruff](https://docs.astral.sh/ruff/) configuration file
36+
for linting and formatting Python code.
37+
- `uv.lock`: Lockfile used by [`uv`](https://docs.astral.sh/uv/) to
38+
record exact versions of installed dependencies.
39+
40+
## How to develop your Data Package
41+
42+
In your new project generated from the `template-data-package`, the
43+
first steps for creating and developing your Data Package are already
44+
set up in `main.py`. For more detailed instructions on using Seedcase
45+
Sprout to organise your Data Package, see the
46+
[guide](https://sprout.seedcase-project.org/docs/guide/) on Sprout's
47+
website. You can read more about the files and folders created by
48+
`main.py` on the
49+
[Outputs](https://sprout.seedcase-project.org/docs/design/interface/outputs)
50+
page of the design documentation.
51+
52+
### Creating package properties
53+
54+
1. Run `main.py` to create the `scripts/package_properties.py` file for
55+
the properties of your Data Package.
56+
57+
``` bash
58+
just build
59+
```
60+
61+
You can also run `main.py` by clicking the "Run" button in your IDE.
62+
63+
2. Open `scripts/package_properties.py` and fill in all required
64+
fields. Also fill in any optional fields you find useful. You can
65+
always update these later. Make sure to save the file.
66+
67+
3. In `main.py`, uncomment the lines referencing the
68+
`package_properties` and `package_path` variables.
69+
70+
4. Rerun `main.py` to create the `datapackage.json` and `README.md`
71+
files for your Data Package.
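
After step 4, you may want a quick sanity check that `datapackage.json` was written and contains what you expect. The sketch below is one way to do that with the standard library only. The function name `summarise_data_package` is hypothetical and not part of Sprout; `name`, `version`, and `resources` are property names from the Data Package standard, and the sketch assumes the file sits in the project root.

```python
import json
from pathlib import Path


def summarise_data_package(project_root: Path) -> dict:
    """Return a few top-level properties from `datapackage.json`.

    Hypothetical helper for illustration only; it is not part of Sprout.
    """
    # Assumption: `datapackage.json` is created in the project root.
    descriptor_path = project_root / "datapackage.json"
    if not descriptor_path.exists():
        raise FileNotFoundError(
            f"{descriptor_path} not found; rerun `main.py` after step 3."
        )
    descriptor = json.loads(descriptor_path.read_text(encoding="utf-8"))
    # `.get()` tolerates properties that have not been filled in yet.
    return {
        "name": descriptor.get("name"),
        "version": descriptor.get("version"),
        "resource_count": len(descriptor.get("resources", [])),
    }
```

Calling `summarise_data_package(Path("."))` from the project root returns a small dictionary you can print or inspect.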
72+
73+
### Creating a new resource
74+
75+
#### With data to add to the resource
76+
77+
While you can create resource properties without data, it is a lot more
78+
challenging. If at all possible, only create a resource properties
79+
object when you have data that can at least pre-fill some of the
80+
important fields. In order to use Sprout, the data needs to already be
81+
in a tidy format. When it is, load the data as a Polars data frame into
82+
the `raw_data` variable in `main.py`.
83+
84+
1. Uncomment lines up to and including the creation of resource
85+
properties.
86+
87+
2. Fill in the `resource_name` argument.
88+
89+
3. Rerun `main.py` to create the
90+
`scripts/resource_properties_<name>.py` file for the properties of
91+
the new resource.
92+
93+
4. Open `scripts/resource_properties_<name>.py` and fill in all
94+
required fields. Also fill in any optional fields you find useful.
95+
You can always update these later. Make sure to save the file.
96+
97+
5. In `package_properties.py`, import your new resource properties by
98+
uncommenting and updating it with the name of your resource. Also
99+
uncomment the `resources` field and update the name of the resource
100+
properties in the array to match the name of your new resource.
101+
102+
6. In `main.py`, import your new resource properties by uncommenting it
103+
and updating it with the name of your resource.
104+
105+
7. Uncomment everything else in the `main.py` file and rename the
106+
`resource_properties` variable to the name of the new resource
107+
properties you just imported.
108+
109+
8. Rerun `main.py`. This will:
110+
111+
- Update `datapackage.json` and `README.md`.
112+
- Create a `resources/` folder containing a folder for your new
113+
resource. In here, you will find a `batch/` folder with the
114+
individual data batches you've uploaded for this resource and a
115+
`data.parquet` file containing all resource data.
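
To confirm that step 8 produced the outputs listed above, a small check like the following could help. It is a sketch under two assumptions: that the resource folder inside `resources/` is named after the resource, and that you run it from the project root. Both function names are hypothetical and not part of Sprout.

```python
from pathlib import Path
from typing import Dict, List


def expected_resource_paths(project_root: Path, resource_name: str) -> Dict[str, Path]:
    """Build the paths described above for one resource.

    Assumption: the resource folder is named after the resource.
    """
    resource_dir = project_root / "resources" / resource_name
    return {
        "batch_dir": resource_dir / "batch",
        "data_file": resource_dir / "data.parquet",
    }


def missing_outputs(project_root: Path, resource_name: str) -> List[str]:
    """Return the labels of expected outputs that do not exist yet."""
    paths = expected_resource_paths(project_root, resource_name)
    return [label for label, path in paths.items() if not path.exists()]
```

An empty list from `missing_outputs(Path("."), "<name>")` suggests that both the `batch/` folder and the `data.parquet` file are in place.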
116+
117+
## How to use the `justfile`
118+
119+
The `justfile` contains scripts or "recipes" that are shorthands for
120+
performing common project tasks. You can get an overview of available
121+
recipes by running
122+
123+
``` bash
124+
just
125+
```
126+
127+
in the project root.
128+
129+
You can run a recipe by typing
130+
131+
``` bash
132+
just <recipe-name>
133+
```
134+
135+
A simple workflow would be running
136+
137+
1. `just build` repeatedly while working on a new feature to test that
138+
it's working
139+
2. `just run-all` before submitting your work for review to make sure
140+
all checks pass
141+
142+
## Versioning and changelog
143+
144+
This project uses
145+
[Commitizen](https://commitizen-tools.github.io/commitizen/) to update
146+
versions and generate changelogs. Based on the [Conventional
147+
Commits](https://www.conventionalcommits.org/en/v1.0.0/) message, it
148+
will automatically update the version in both `pyproject.toml` and
149+
`datapackage.json`. The [Data Package](https://datapackage.org/)
150+
standard suggests using their version of [Semantic
151+
Versioning](https://datapackage.org/recipes/data-package-version/). So
152+
follow these conventions when making commits to this repository.
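
To make the link between commit messages and version bumps concrete, here is a simplified sketch of the mapping described in the Conventional Commits specification: `fix` maps to a patch bump, `feat` to a minor bump, and a breaking change to a major bump. The function `bump_for` is hypothetical; Commitizen's actual rules are configurable and may cover more commit types, so treat this as an illustration and not as its implementation.

```python
import re
from typing import Optional

# Header shape from the Conventional Commits spec: type(scope)!: description
_HEADER = re.compile(r"^(?P<type>\w+)(\([^)]*\))?(?P<breaking>!)?: .+")
# A breaking change can also be declared in a footer.
_BREAKING_FOOTER = re.compile(r"^BREAKING[ -]CHANGE: ", re.MULTILINE)


def bump_for(message: str) -> Optional[str]:
    """Return "major", "minor", "patch", or None for a commit message.

    Hypothetical helper: a simplified reading of the specification.
    """
    header = message.splitlines()[0] if message else ""
    match = _HEADER.match(header)
    if match is None:
        return None  # not a Conventional Commits message
    if match["breaking"] or _BREAKING_FOOTER.search(message):
        return "major"
    if match["type"] == "feat":
        return "minor"
    if match["type"] == "fix":
        return "patch"
    return None  # e.g. `docs:` or `chore:` do not bump under this sketch
```

Under this sketch, a message starting with `feat:` maps to a minor bump, while one starting with `docs:` leaves the version unchanged.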
