| 1 | +--- |
| 2 | +author: |
| 3 | + - name: "Charlotte Wickham" |
| 4 | +title: "From One Notebook to Many Reports: Parameterized reports with the `jupyter` engine" |
| 5 | +description: | |
| 6 | + Learn how to transform a single Jupyter notebook into a parameterized report generator that automatically creates customized outputs for different scenarios. |
| 7 | +date: "2025-07-24" |
| 8 | +categories: |
| 9 | + - Authoring |
| 10 | + - Teaching |
| 11 | + - Jupyter |
| 12 | +image: thumbnail.png |
| 13 | +image-alt: | |
| 14 | + A slide with a screenshot of a Jupyter notebook with a graph and text, then an arrow pointing to a stack of PDF files each with a graph and text. |
| 15 | +lightbox: true |
| 16 | +--- |
| 17 | + |
| 18 | +::: callout-tip |
| 19 | + |
| 20 | +## Based on a talk at SciPy 2025 |
| 21 | + |
| 22 | +This post is based on the talk "From One Notebook to Many Reports: Automating with Quarto" delivered at [SciPy 2025](https://www.scipy2025.scipy.org) by Charlotte Wickham. |
| 23 | +You can find the slides at [cwickham.github.io/one-notebook-many-reports](https://cwickham.github.io/one-notebook-many-reports/) and example code at [github.com/cwickham/one-notebook-many-reports](https://github.com/cwickham/one-notebook-many-reports). |
| 24 | + |
| 25 | +::: |
| 26 | + |
| 27 | +## The Problem: Repetitive Reporting |
| 28 | + |
| 29 | +Would you rather read a generic "Climate summary" or a "Climate summary for _exactly where you live_"? Reports that are personalized to a specific situation increase engagement and connection. But producing many customized reports manually is tedious and error-prone. |
| 30 | + |
| 31 | +Quarto solves this with parameterized reports---you create a single document template, then render it multiple times with different parameter values to generate customized outputs automatically. |
| 32 | + |
| 33 | +A great example is the customized soil health reports from Washington Soil Health Initiative's [State of the Soils Assessment](https://washingtonsoilhealthinitiative.com/state-of-the-soils/), presented at posit::conf(2023) by [Jadey Ryan](https://jadeyryan.com) (watch on [YouTube](https://youtu.be/lbE5uOqfT70?si=C-d5U5Q2VXo1wlDs)). Jadey demonstrated this approach using R and plain text Quarto files (`.qmd`). |
| 34 | + |
| 35 | +This post shows you how to apply the same principles using Python: we'll walk through converting a Jupyter notebook (`.ipynb`) into a parameterized report, then automating the generation of multiple customized outputs. Then I'll give you some tips for making your reports look polished. |
| 36 | + |
| 37 | +## The Solution: Parameterized Reports |
| 38 | + |
| 39 | +### Start with a notebook |
| 40 | + |
| 41 | +As an example, let's start with a Jupyter notebook analyzing climate data for Corvallis, Oregon. |
| 42 | + |
| 43 | +![](corvallis-ipynb.png){#corvallis-ipynb .column-margin fig-alt="Screenshot of a Jupyter notebook with code cells and output, including a plot and text summary."} |
| 44 | + |
| 45 | +You can see the full notebook, [`corvallis.ipynb`, on GitHub](https://github.com/cwickham/one-notebook-many-reports/blob/main/01-one-notebook/corvallis.ipynb), but here are the key pieces: |
| 46 | + |
| 47 | +- The code cells import data for all of Oregon, filter it to the rows for Corvallis, then produce a summary sentence and a plot. |
| 48 | + |
| 49 | +- The document options specify `echo: false` so no code appears in the final output, and `format: typst` so the output is a PDF produced via [Typst](https://typst.app), a modern alternative to LaTeX. |
| 50 | + |
| 51 | +This single notebook can be rendered with Quarto: |
| 52 | + |
| 53 | +```{.bash filename="Terminal"} |
| 54 | +quarto render corvallis.ipynb |
| 55 | +``` |
| 56 | + |
| 57 | +The result is a PDF file, `corvallis.pdf`, a simple report with the title "Corvallis" and a single sentence summary of the climate data, along with a plot highlighting the mean temperature for this year against the last 30 years. |
| 58 | + |
| 59 | +{.column-margin fig-alt="Screenshot of a PDF file with the title 'Corvallis' that contains a single sentence summary and a plot."} |
| 60 | + |
| 61 | +Now, imagine we want to create this report for the 50 largest cities in Oregon. |
| 62 | +Here are the steps we'll take: |
| 63 | + |
| 64 | +1. Turn hardcoded values into variables |
| 65 | +2. Declare those variables parameters |
| 66 | +3. Render the notebook with different parameter values |
| 67 | +4. Automate rendering with many parameter values |
| 68 | + |
| 69 | +### 1. Turn hardcoded values into variables |
| 70 | + |
| 71 | +We want a report for each city. |
| 72 | +We'll start by creating a variable, `city`, which we'll designate a parameter in our next step. |
| 73 | +In a new code cell at the top of our notebook, we define the variable: |
| 74 | + |
| 75 | +````{.python filename="code"} |
| 76 | +city = "Corvallis" |
| 77 | +```` |
| 78 | + |
| 79 | +Then anywhere we previously hardcoded `"Corvallis"` in the notebook, we replace it with this variable. |
| 80 | + |
| 81 | +The first occurrence is in the title of the document. |
| 82 | +Originally, we had a markdown cell defining a level 1 heading: |
| 83 | + |
| 84 | +```{.markdown filename="markdown"} |
| 85 | +# Corvallis |
| 86 | +``` |
| 87 | + |
| 88 | +We replace it with a code cell that uses `Markdown()` (from `IPython.display`) and an f-string to produce markdown for a level 1 heading based on the `city` variable: |
| 89 | + |
| 90 | +```{.python filename="code"} |
| 91 | +Markdown(f"# {city}") |
| 92 | +``` |
| 93 | + |
| 94 | +The filtering step is a straightforward replacement of the string with the variable: |
| 95 | + |
| 96 | +::: {layout-ncol="2"} |
| 97 | + |
| 98 | +:::{} |
| 99 | +Before: |
| 100 | + |
| 101 | +```{.python filename="code"} |
| 102 | +tmean = tmean_oregon.filter( |
| 103 | + pl.col("city") == "Corvallis", |
| 104 | +) |
| 105 | +``` |
| 106 | +::: |
| 107 | + |
| 108 | +:::{} |
| 109 | +After: |
| 110 | + |
| 111 | +```{.python filename="code"} |
| 112 | +tmean = tmean_oregon.filter( |
| 113 | + pl.col("city") == city, |
| 114 | +) |
| 115 | +``` |
| 116 | +::: |
| 117 | +::: |
| 118 | + |
| 119 | +Finally, the plot code (using [plotnine](https://plotnine.org)) sets the title of the plot to include the city name: |
| 120 | + |
| 121 | +```{.python filename="code"} |
| 122 | +... |
| 123 | ++ labs(title = "Corvallis, OR", ...) |
| 124 | +... |
| 125 | +``` |
| 126 | + |
| 127 | +We can also use an f-string here to include the `city` variable: |
| 128 | + |
| 129 | +```{.python filename="code"} |
| 130 | +... |
| 131 | ++ labs(title = f"{city}, OR", ...) |
| 132 | +... |
| 133 | +``` |
| 134 | + |
| 135 | +We can now test our changes by explicitly setting the `city` variable to something other than "Corvallis" and re-running the cells. |
| 136 | +Since our report is no longer specific to Corvallis, we can rename it `climate.ipynb`. |
| 137 | + |
| 138 | +### 2. Declare those variables parameters |
| 139 | + |
| 140 | +Now that we have a variable that represents the parameter, we need to let Quarto know it's a parameter. |
| 141 | +Quarto's parameterized reports are implemented using [Papermill](https://papermill.readthedocs.io/en/latest/), and inherit Papermill's approach: tag the cell defining the parameter with `parameters`. |
| 142 | + |
| 143 | +In Jupyter, you can add this tag through the cell toolbar: |
| 144 | + |
| 145 | +{fig-alt="Screenshot of a Jupyter notebook cell with a tag 'parameters' added to it."} |
| 146 | + |
| 147 | +You can see the updated notebook, now a parameterized notebook, on GitHub: [`climate.ipynb`](https://github.com/cwickham/one-notebook-many-reports/blob/main/02-one-parameterized-report/climate.ipynb). |
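
If you are curious what the tag looks like on disk, the sketch below may help. In the notebook JSON format, tags are stored as a list under each cell's `metadata`. The tiny notebook dictionary and the `add_tag` helper are hypothetical, made up for illustration; they are not part of Quarto, Jupyter, or the example repository.

```python
import json

# A hand-built, minimal notebook in the nbformat v4 JSON layout (illustration only).
notebook = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [
        {
            "cell_type": "code",
            "metadata": {},
            "source": 'city = "Corvallis"',
            "outputs": [],
            "execution_count": None,
        },
    ],
}


def add_tag(cell, tag):
    """Hypothetical helper: add `tag` to a cell's metadata, creating the list if needed."""
    tags = cell["metadata"].setdefault("tags", [])
    if tag not in tags:
        tags.append(tag)


# Tag the cell that defines `city`, as the cell toolbar would.
add_tag(notebook["cells"][0], "parameters")
print(json.dumps(notebook["cells"][0]["metadata"]))
```

Adding the tag through the cell toolbar is still the simplest route; the point here is only that the tag is ordinary cell metadata.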
| 148 | + |
| 149 | +### 3. Render with different parameter values |
| 150 | + |
| 151 | +If we render `climate.ipynb`, it will still produce the same report for Corvallis, because we haven't changed the parameter value: |
| 152 | + |
| 153 | +```{.bash filename="Terminal"} |
| 154 | +quarto render climate.ipynb |
| 155 | +``` |
| 156 | + |
| 157 | +But we can now pass parameter values to Quarto with the `-P` flag: |
| 158 | + |
| 159 | +```{.bash filename="Terminal"} |
| 160 | +# Generate report for Portland |
| 161 | +quarto render climate.ipynb -P city:Portland --output-file portland.pdf |
| 162 | + |
| 163 | +# Generate report for Eugene |
| 164 | +quarto render climate.ipynb -P city:Eugene --output-file eugene.pdf |
| 165 | +``` |
| 166 | + |
| 167 | +We've also added `--output-file` to ensure each report gets its own filename. |
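
One way to think about what `-P` does, following Papermill's approach, is that a new cell is injected after the tagged cell and overrides its default. The sketch below is a conceptual model only, not Quarto's or Papermill's actual code; the cell sources and the `run_cells` helper are invented for illustration.

```python
# Conceptual model only: cells are plain strings executed in order.
default_cell = 'city = "Corvallis"'    # the cell tagged `parameters`
injected_cell = 'city = "Portland"'    # stands in for `-P city:Portland`
later_cell = 'heading = f"# {city}"'   # any later cell that uses the parameter


def run_cells(cells):
    """Hypothetical helper: execute cell sources in order in one shared namespace."""
    namespace = {}
    for source in cells:
        exec(source, namespace)
    return namespace


without_override = run_cells([default_cell, later_cell])
with_override = run_cells([default_cell, injected_cell, later_cell])

print(without_override["heading"])  # built from the default value
print(with_override["heading"])     # built from the overriding value
```

This is why the value in the tagged cell acts as a default: it is still run, but any value passed at render time is assigned afterwards.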
| 168 | + |
| 169 | +### 4. Automate rendering with many parameter values |
| 170 | + |
| 171 | +To generate all 50 reports, we need to run `quarto render` 50 times, each time with a different city as the parameter value. |
| 172 | +You could automate this in many ways, but let's use a Python script. |
| 173 | +For example, you might have a dataset of cities and their corresponding output filenames: |
| 174 | + |
| 175 | +```{.python filename="gen-reports.py"} |
| 176 | +cities = pl.DataFrame({ |
| 177 | + "city": ["Portland", "Cottage Grove", "St. Helens", "Eugene"], |
| 178 | + "output_file": ["portland.pdf", "cottage_grove.pdf", "st_helens.pdf", "eugene.pdf"] |
| 179 | +}) |
| 180 | +``` |
| 181 | + |
| 182 | +I've generated a small example above, but in reality you would likely read `cities` in from a file. |
| 183 | +Then you could iterate over the rows of this dataset, rendering the notebook for each city: |
| 184 | + |
| 185 | +```{.python filename="gen-reports.py"} |
| 186 | +from quarto import render |
| 187 | +for row in cities.iter_rows(named=True): |
| 188 | + render( |
| 189 | + "climate.ipynb", |
| 190 | + execute_params={"city": row["city"]}, |
| 191 | + output_file=row["output_file"], |
| 192 | + ) |
| 193 | +``` |
| 194 | + |
| 195 | +Run this script once, and you'll get all 50 custom reports! |
| 196 | + |
| 197 | +You can find the complete working example on GitHub: [cwickham/one-notebook-many-reports/03-many-reports](https://github.com/cwickham/one-notebook-many-reports/tree/main/03-many-reports). |
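
If you would rather not depend on the `quarto` Python package, a similar loop can be sketched with only the standard library by calling the same `quarto render` command shown in step 3. The helper names (`output_filename`, `render_command`, `render_all`) and the filename rule are assumptions made for this sketch, not part of the example repository. By default it only prints the commands, so it is safe to run without Quarto installed.

```python
import re
import shlex
import subprocess


def output_filename(city):
    """Lowercase the city and collapse runs of non-alphanumerics into underscores."""
    slug = re.sub(r"[^a-z0-9]+", "_", city.lower()).strip("_")
    return f"{slug}.pdf"


def render_command(city, notebook="climate.ipynb"):
    """Build the `quarto render` argument list for one city."""
    return [
        "quarto", "render", notebook,
        "-P", f"city:{city}",
        "--output-file", output_filename(city),
    ]


def render_all(cities, dry_run=True):
    """Print each command; only invoke Quarto when dry_run is False."""
    for city in cities:
        command = render_command(city)
        print(shlex.join(command))
        if not dry_run:
            # Passing a list avoids shell quoting issues for names with spaces.
            subprocess.run(command, check=True)


render_all(["Portland", "Cottage Grove", "St. Helens", "Eugene"])
```

Calling `render_all(cities, dry_run=False)` would run the renders for real. Deriving filenames from city names is a design choice that suits small lists; a lookup table like the `cities` data frame above gives you more control.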
| 198 | + |
| 199 | +## Pretty Reports: Brand and Typst |
| 200 | + |
| 201 | +The steps above to produce parameterized reports apply to any output format supported by Quarto. |
| 202 | +However, if you are targeting `typst` you can take advantage of additional features to create beautiful PDF reports. |
| 203 | + |
| 204 | +### Brand.yml |
| 205 | + |
| 206 | +Quarto supports [brand.yml](https://posit-dev.github.io/brand-yml/), a way to specify colors, fonts, and logos: |
| 207 | + |
| 208 | +```{.yaml filename="_brand.yml"} |
| 209 | +color: |
| 210 | + palette: |
| 211 | + forest-green: "#2d5a3d" |
| 212 | + charcoal-grey: "#555555" |
| 213 | + foreground: charcoal-grey |
| 214 | + primary: forest-green |
| 215 | +typography: |
| 216 | + fonts: |
| 217 | + - family: Open Sans |
| 218 | + source: google |
| 219 | + base: |
| 220 | + family: Open Sans |
| 221 | +logo: |
| 222 | + medium: logo.png |
| 223 | +``` |
| 224 | + |
| 225 | +Quarto will detect the `_brand.yml` file and apply the colors, fonts and logo to your report. |
| 226 | +Colors and fonts in your figures will need to be customized in your code, but that is made much easier with the [brand-yml](https://posit-dev.github.io/brand-yml/pkg/py/) Python package, which imports your values from `_brand.yml`. |
| 227 | + |
| 228 | +You can see a full example of using `_brand.yml` with `climate.ipynb` at [cwickham/one-notebook-many-reports/04-branded-reports](https://github.com/cwickham/scipy-talk/tree/main/04-branded-reports), and learn more about Quarto's support for brand in the [Brand guide](/docs/authoring/brand.qmd). |
| 229 | + |
| 230 | + |
| 231 | +### Typst |
| 232 | + |
| 233 | +Learning a little bit of Typst syntax can take your reports from basic to beautiful. |
| 234 | +You can include [raw Typst syntax](/docs/output-formats/typst.qmd#raw-typst) in your notebooks, or wrap elements in Typst functions using the [typst-function Quarto extension](https://github.com/christopherkenny/typst-function). |
| 235 | +As an example, you could add a header with the city name and a map of the location: |
| 236 | + |
| 237 | +{fig-alt="The `corvallis.ipynb` notebook rendered by Quarto to `pdf`. The document has dark green header with the city in white text and a map next to it with the location as an orange dot." } |
| 238 | + |
| 239 | +You can see the source for this example at [cwickham/one-notebook-many-reports/05-pretty-reports](https://github.com/cwickham/one-notebook-many-reports/tree/main/05-pretty-reports). |
| 240 | + |
| 241 | +## `jupyter` vs `knitr` |
| 242 | + |
| 243 | +The steps for creating a parameterized report above are specific to documents that use the `jupyter` engine. |
| 244 | +With a Jupyter notebook (`.ipynb`), |
| 245 | +or a plain text Quarto document (`.qmd`) with only Python code cells, |
| 246 | +Quarto will default to the `jupyter` engine. |
| 247 | +As described above, the `jupyter` engine uses cell tags to identify parameters. |
| 248 | + |
| 249 | +If you are working in a `.ipynb` file, your IDE will likely provide a way to add these tags through the cell toolbar. |
| 250 | +If you are working in a `.qmd` file, you can add tags as a code cell option: |
| 251 | + |
| 252 | +````markdown |
| 253 | +```{{python}} |
| 254 | +#| tags: [parameters] |
| 255 | +city = "Corvallis" |
| 256 | +``` |
| 257 | +```` |
| 258 | + |
| 259 | +With the `jupyter` engine, parameters can then be accessed directly as variables, e.g. `city`, in later code cells. |
| 260 | + |
| 261 | +If you are working in a Quarto document (`.qmd`) with R code cells, Quarto will default to the `knitr` engine. |
| 262 | +With the `knitr` engine, you set parameters in the document header under `params`: |
| 263 | + |
| 264 | +```yaml |
| 265 | +--- |
| 266 | +params: |
| 267 | + city: "Corvallis" |
| 268 | +--- |
| 269 | +``` |
| 270 | + |
| 271 | +In `knitr`, parameters are accessed as elements of `params`, e.g. `params$city`. |
| 272 | + |
| 273 | +You can read more about setting and using parameters in [Guide > Computations > Parameters](/docs/computations/parameters.html). |
| 274 | + |
| 275 | +## Wrapping Up |
| 276 | + |
| 277 | +Parameterized reports turn one notebook into many customized outputs. |
| 278 | +You've seen the process of going from a notebook with a hardcoded value to a parameterized report that can be rendered with different values. |
| 279 | +You can then automate the rendering in any way you choose to generate dozens of reports at once. |
| 280 | + |