feat: ✨ add template README in docs/ (#136)

martonvago · lwjohnst86 · pre-commit-ci[bot] · web-flow · commit b2bca1994ec7 · 2025-07-29T14:28:22.000+02:00
# Description

This PR adds a template README for developing the Data Package.

Closes #100

This PR needs an in-depth review.

## Checklist

- [x] Formatted Markdown
- [x] Ran `just run-all`

---------

Co-authored-by: Luke W. Johnston <lwjohnst86@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
diff --git a/template/README.md.jinja b/template/README.md.jinja
@@ -1,16 +1 @@
-
-## Post-setup steps
-
-- Run `just list-todos` to get a list of TODO items you need to fill out.
-
-## Versioning and changelog
-
-This project uses
-[Commitizen](https://commitizen-tools.github.io/commitizen/) to update
-versions and generate changelogs. Based on the [Conventional
-Commits](https://www.conventionalcommits.org/en/v1.0.0/) message, it
-will automatically update the version in both `pyproject.toml` and
-`datapackage.json`. The [Data Package](https://datapackage.org/)
-standard suggests using their version of [Semantic
-Versioning](https://datapackage.org/recipes/data-package-version/). So
-follow these conventions when making commits to this repository.
+<!-- This file is overwritten whenever `main.py` is run. -->
diff --git a/template/docs/README.md b/template/docs/README.md
@@ -0,0 +1,152 @@
+# A Data Package built with Seedcase packages
+
+This [Data Package](https://datapackage.org/) was generated from the
+[`template-data-package`](https://github.com/seedcase-project/template-data-package)
+Seedcase template.
+
+## Project files and folders
+
+-   `docs/`: Documentation about using and developing the Data Package,
+    including this README file.
+-   `scripts/`: Python scripts for creating and managing the Data
+    Package. Files describing the data will be generated here.
+-   `.copier-answers.yml`: Contains the answers you gave when copying
+    the project from the template. **You should not modify this file
+    directly.**
+-   `.cz.toml`:
+    [Commitizen](https://commitizen-tools.github.io/commitizen/)
+    configuration file for managing versions and changelogs.
+-   `.pre-commit-config.yaml`: [Pre-commit](https://pre-commit.com/)
+    configuration file for managing and running checks before each
+    commit.
+-   `.typos.toml`: [typos](https://github.com/crate-ci/typos) spell
+    checker configuration file.
+-   `CITATION.cff`: Structured citation metadata for your project.
+-   `justfile`: [`just`](https://just.systems/man/en/) configuration
+    file for scripting project tasks.
+-   `main.py`: Central script file for the Data Package. This is where
+    helper scripts are invoked and work together to create and manage
+    the Data Package.
+-   `pyproject.toml`: Main Python project configuration file defining
+    metadata and dependencies.
+-   `README.md`: Autogenerated description of the Data Package. Not a
+    development guide. Information on using and developing the project
+    should be included in the `docs/` folder.
+-   `ruff.toml`: [Ruff](https://docs.astral.sh/ruff/) configuration file
+    for linting and formatting Python code.
+-   `uv.lock`: Lockfile used by [`uv`](https://docs.astral.sh/uv/) to
+    record exact versions of installed dependencies.
+
+## How to develop your Data Package
+
+In your new project generated from the `template-data-package`, the
+first steps for creating and developing your Data Package are already
+set up in `main.py`. For more detailed instructions on using Seedcase
+Sprout to organise your Data Package, see the
+[guide](https://sprout.seedcase-project.org/docs/guide/) on Sprout's
+website. You can read more about the files and folders created by
+`main.py` on the
+[Outputs](https://sprout.seedcase-project.org/docs/design/interface/outputs)
+page of the design documentation.
+
+### Creating package properties
+
+1.  Run `main.py` to create the `scripts/package_properties.py` file for
+    the properties of your Data Package.
+
+    ``` bash
+    just build
+    ```
+
+    You can also run `main.py` by clicking the "Run" button in your IDE.
+
+2.  Open `scripts/package_properties.py` and fill in all required
+    fields. Also fill in any optional fields you find useful. You can
+    always update these later. Make sure to save the file.
+
+3.  In `main.py`, uncomment the lines referencing the
+    `package_properties` and `package_path` variables.
+
+4.  Rerun `main.py` to create the `datapackage.json` and `README.md`
+    files for your Data Package.
+
+### Creating a new resource
+
+#### With data to add to the resource
+
+While you can create resource properties without data, it is a lot more
+challenging. If at all possible, only create a resource properties
+object when you have data that can pre-fill at least some of the
+important fields. To use Sprout, the data must already be in a tidy
+format. When it is, load the data as a Polars data frame into the
+`raw_data` variable in `main.py`.
+
+1.  In `main.py`, uncomment the lines up to and including the creation
+    of resource properties.
+
+2.  Fill in the `resource_name` argument.
+
+3.  Rerun `main.py` to create the
+    `scripts/resource_properties_<name>.py` file for the properties of
+    the new resource.
+
+4.  Open `scripts/resource_properties_<name>.py` and fill in all
+    required fields. Also fill in any optional fields you find useful.
+    You can always update these later. Make sure to save the file.
+
+5.  In `package_properties.py`, import your new resource properties by
+    uncommenting the import line and updating it with the name of your
+    resource. Also uncomment the `resources` field and update the name of
+    the resource properties in the array to match your new resource.
+
+6.  In `main.py`, import your new resource properties by uncommenting
+    the import line and updating it with the name of your resource.
+
+7.  Uncomment everything else in the `main.py` file and rename the
+    `resource_properties` variable to the name of the new resource
+    properties you just imported.
+
+8.  Rerun `main.py`. This will:
+
+    -   Update `datapackage.json` and `README.md`.
+    -   Create a `resources/` folder containing a folder for your new
+        resource. In here, you will find a `batch/` folder with the
+        individual data batches you've uploaded for this resource and a
+        `data.parquet` file containing all resource data.
+
+## How to use the `justfile`
+
+The `justfile` contains scripts or "recipes" that are shorthands for
+performing common project tasks. You can get an overview of available
+recipes by running
+
+``` bash
+just
+```
+
+in the project root.
+
+You can run a recipe by typing
+
+``` bash
+just <recipe-name>
+```
+
+A simple workflow would be to run:
+
+1.  `just build` repeatedly while working on a new feature to test that
+    it's working.
+2.  `just run-all` before submitting your work for review to make sure
+    all checks pass.
+
+## Versioning and changelog
+
+This project uses
+[Commitizen](https://commitizen-tools.github.io/commitizen/) to update
+versions and generate changelogs. Based on the [Conventional
+Commits](https://www.conventionalcommits.org/en/v1.0.0/) message, it
+will automatically update the version in both `pyproject.toml` and
+`datapackage.json`. The [Data Package](https://datapackage.org/)
+standard suggests using their version of [Semantic
+Versioning](https://datapackage.org/recipes/data-package-version/). So
+follow these conventions when making commits to this repository.
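
A note on the "Versioning and changelog" section in the diff above: the paragraph says the version is updated based on the Conventional Commits message. The sketch below illustrates that idea under simplifying assumptions: `feat` triggers a minor bump, `fix` a patch bump, and a `!` marker or a `BREAKING CHANGE` footer a major bump. The names `bump_kind` and `bump` are hypothetical and written only for illustration. This is not Commitizen's implementation, and the bump rules actually in effect depend on the project's `.cz.toml`.

```python
import re
from typing import Optional

# Simplified Conventional Commits header: type, optional scope, optional "!".
# Assumption: illustration only, not the pattern Commitizen itself uses.
HEADER = re.compile(r"^(?P<type>\w+)(\([^)]*\))?(?P<breaking>!)?: .+")


def bump_kind(message: str) -> Optional[str]:
    """Return "major", "minor", "patch", or None for a commit message."""
    header = message.splitlines()[0] if message else ""
    match = HEADER.match(header)
    if match is None:
        return None
    # A "!" after the type/scope or a BREAKING CHANGE footer marks a major bump.
    if match["breaking"] or "BREAKING CHANGE" in message:
        return "major"
    if match["type"] == "feat":
        return "minor"
    if match["type"] == "fix":
        return "patch"
    # Other types (docs, chore, ...) do not change the version in this sketch.
    return None


def bump(version: str, kind: Optional[str]) -> str:
    """Apply a Semantic Versioning increment to a MAJOR.MINOR.PATCH string."""
    major, minor, patch = (int(part) for part in version.split("."))
    if kind == "major":
        return f"{major + 1}.0.0"
    if kind == "minor":
        return f"{major}.{minor + 1}.0"
    if kind == "patch":
        return f"{major}.{minor}.{patch + 1}"
    return version


# The title of this PR is a "feat" commit, so it maps to a minor bump.
print(bump("0.1.0", bump_kind("feat: ✨ add template README in docs/ (#136)")))
# prints 0.2.0
```

The starting version `0.1.0` is made up for the example. In the generated project, Commitizen performs the equivalent step and writes the new version to both `pyproject.toml` and `datapackage.json`.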