Commit 3b591fb

Merge pull request #7 from NSAPH-Data-Processing/TinasheMTapera/issue6
Tinashe m tapera/issue6
2 parents ee2492b + 920f697 commit 3b591fb

33 files changed

+9263
-213
lines changed

README.md

Lines changed: 2 additions & 0 deletions
@@ -1 +1,3 @@
11
This file will be overwritten by `index.ipynb`
2+
3+
In the meantime, see `notes/index.ipynb` for the notes.

_proc/00_core.ipynb

Lines changed: 0 additions & 74 deletions
This file was deleted.

_proc/index.ipynb

Lines changed: 84 additions & 47 deletions
@@ -24,7 +24,39 @@
2424
"cell_type": "markdown",
2525
"metadata": {},
2626
"source": [
27-
"This file will become your README and also the index of your documentation."
27+
"Here we are developing functions and code for the Madagascar ERA5 dataset project. The goal is for exposure data to be made available at the daily resolution when possible. Finer resolutions shouldn’t ever be needed for our purposes, and it should then be relatively easy to aggregate at coarser resolutions, such as weekly or monthly.\n",
28+
"\n",
29+
"Variables should generally be made available from 2010 onward, as that’s where our clinic data starts.\n",
30+
"\n",
31+
"All data are ideally made available at the “healthshed” geographical level. A healthshed is a geographical area whose residents all go to the same clinic. There are ~2700 public clinics in Madagascar, hence ~2700 healthsheds, each containing ~10000 people on average.\n",
32+
"\n",
33+
"Preliminary list of environmental variables\n",
34+
"\n",
35+
"- [ ] 2-m air temperature from ERA5: daily min, max, mean\n",
36+
" \n",
37+
"- [ ] 2-m air dew point temperature from ERA5: daily min, max, mean\n",
38+
"\n",
39+
"- [ ] Precipitation: daily total (ERA5)\n",
40+
"\n",
41+
"- [ ] Sea surface temperature: daily average and maximum in the nearest neighbor for each healthshed.\n",
42+
"\n",
43+
"- [ ] Precipitation: daily total (CHIRPS)\n",
44+
"\n",
45+
"- [ ] Chlorophyll-A (Giacomo)\n",
46+
"\n",
47+
"- [ ] Wealth index: Available from Giacomo \n",
48+
"\n",
49+
"- [ ] NDVI\n",
50+
"\n",
51+
"- [ ] Tropical storm\n",
52+
"\n",
53+
"- [ ] Flooding\n",
54+
"\n",
55+
"- [ ] Deforestation\n",
56+
"\n",
57+
"- [ ] Linking/segmenting healthsheds into climate zones and other \n",
58+
"\n",
59+
"- [ ] Relative humidity: daily average (lower priority)"
2860
]
2961
},
3062
{
@@ -56,12 +88,14 @@
5688
"# make sure era5_sandbox package is installed in development mode\n",
5789
"$ pip install -e .\n",
5890
"\n",
59-
"# make changes under nbs/ directory\n",
60-
"# ...\n",
91+
"# To make changes, go to the \"notes\" directory and edit the notebooks as necessary.\n",
92+
"# Each notebook refers to a module in the era5_sandbox package. Cells are exported to the module\n",
93+
"# when the notebook is saved and you run the following command:\n",
6194
"\n",
62-
"# compile to have changes apply to era5_sandbox\n",
63-
"$ nbdev_prepare\n",
64-
"```"
95+
"$ nbdev_export\n",
96+
"```\n",
97+
"\n",
98+
"For example, to change the behavior of the [`testAPI()`](https://TinasheMTapera.github.io/era5_sandbox/core.html#testapi) function in the testAPI Hydra rule, edit the [`testAPI`](https://TinasheMTapera.github.io/era5_sandbox/core.html#testapi) notebook at `notes/testAPI.ipynb`, save it, and then run `nbdev_export` to update the `core` module in the package."
6599
]
66100
},
67101
{
@@ -85,27 +119,18 @@
85119
"Install latest from the GitHub [repository][repo]:\n",
86120
"\n",
87121
"```sh\n",
88-
"$ pip install git+https://github.com/TinasheMTapera/era5_sandbox.git\n",
89-
"```\n",
90-
"\n",
91-
"or from [conda][conda]\n",
92-
"\n",
93-
"```sh\n",
94-
"$ conda install -c TinasheMTapera era5_sandbox\n",
122+
"$ pip install git+https://github.com/NSAPH-Data-Processing/era5_sandbox\n",
95123
"```\n",
96124
"\n",
97-
"or from [pypi][pypi]\n",
98-
"\n",
125+
"or clone and install in development mode:\n",
99126
"\n",
100127
"```sh\n",
101-
"$ pip install era5_sandbox\n",
128+
"$ git clone https://github.com/NSAPH-Data-Processing/era5_sandbox\n",
129+
"$ pip install -e .\n",
102130
"```\n",
103131
"\n",
104132
"\n",
105-
"[repo]: https://github.com/TinasheMTapera/era5_sandbox\n",
106-
"[docs]: https://TinasheMTapera.github.io/era5_sandbox/\n",
107-
"[pypi]: https://pypi.org/project/era5_sandbox/\n",
108-
"[conda]: https://anaconda.org/TinasheMTapera/era5_sandbox"
133+
"[repo]: https://github.com/NSAPH-Data-Processing/era5_sandbox"
109134
]
110135
},
111136
{
@@ -119,12 +144,7 @@
119144
"cell_type": "markdown",
120145
"metadata": {},
121146
"source": [
122-
"Documentation can be found hosted on this GitHub [repository][repo]'s [pages][docs]. Additionally you can find package manager specific guidelines on [conda][conda] and [pypi][pypi] respectively.\n",
123-
"\n",
124-
"[repo]: https://github.com/TinasheMTapera/era5_sandbox\n",
125-
"[docs]: https://TinasheMTapera.github.io/era5_sandbox/\n",
126-
"[pypi]: https://pypi.org/project/era5_sandbox/\n",
127-
"[conda]: https://anaconda.org/TinasheMTapera/era5_sandbox"
147+
"🚧Documentation is in development 🚧"
128148
]
129149
},
130150
{
@@ -138,7 +158,43 @@
138158
"cell_type": "markdown",
139159
"metadata": {},
140160
"source": [
141-
"Fill me in please! Don't forget code examples:"
161+
"The pipeline currently downloads ERA5 temperature and dew point temperature data for a given date range and geographical bounding box. You can learn about each step by following the notebooks in `notes` in numerical order.\n",
162+
"\n",
163+
"To run the pipeline, update the config at `conf/config.yaml` with the desired date range and geographical bounding box. The pipeline can then be run with the following command:\n",
164+
"\n",
165+
"```sh\n",
166+
"sbatch snakemake.sbatch\n",
167+
"```\n",
168+
"\n",
169+
"You can investigate the downloaded raw data with Python, e.g.:\n",
170+
"\n",
171+
"```python\n",
172+
"import xarray as xr\n",
173+
"import matplotlib.pyplot as plt\n",
174+
"import cartopy.crs as ccrs\n",
175+
"import cartopy.feature as cfeature\n",
176+
"\n",
177+
"### the path to any of the downloaded files\n",
178+
"file_path = \"/n/dominici_lab/lab/data_processing/csph-era5_sandbox/data/input/2010_01.nc\"\n",
179+
"data = xr.open_dataset(file_path)\n",
180+
"\n",
181+
"\n",
182+
"temperature = data[\"t2m\"]\n",
183+
"\n",
184+
"\n",
185+
"\n",
186+
"# Select a specific time step\n",
187+
"temperature_at_time = temperature.isel(valid_time=0)\n",
188+
"\n",
189+
"# Plot the data on a map\n",
190+
"plt.figure(figsize=(12, 8))\n",
191+
"ax = plt.axes(projection=ccrs.PlateCarree())\n",
192+
"temperature_at_time.plot(ax=ax, cmap=\"coolwarm\", transform=ccrs.PlateCarree(), cbar_kwargs={\"label\": \"Temperature (K)\"})\n",
193+
"ax.coastlines()\n",
194+
"ax.add_feature(cfeature.BORDERS, linestyle=\":\")\n",
195+
"ax.set_title(\"Temperature at Time Step 0\")\n",
196+
"plt.show()\n",
197+
"```"
142198
]
143199
},
144200
{
@@ -175,28 +231,9 @@
175231
],
176232
"metadata": {
177233
"kernelspec": {
178-
"display_name": "Python 3 (ipykernel)",
234+
"display_name": "python3",
179235
"language": "python",
180236
"name": "python3"
181-
},
182-
"language_info": {
183-
"codemirror_mode": {
184-
"name": "ipython",
185-
"version": 3
186-
},
187-
"file_extension": ".py",
188-
"mimetype": "text/x-python",
189-
"name": "python",
190-
"nbconvert_exporter": "python",
191-
"pygments_lexer": "ipython3",
192-
"version": "3.11.8"
193-
},
194-
"widgets": {
195-
"application/vnd.jupyter.widget-state+json": {
196-
"state": {},
197-
"version_major": 2,
198-
"version_minor": 0
199-
}
200237
}
201238
},
202239
"nbformat": 4,
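
The notebook text in this diff asks for daily min, max, and mean of hourly ERA5 variables such as 2-m temperature. As a rough illustration of that reduction only, here is a minimal sketch in plain Python. The sample values and the `daily_summary` helper are hypothetical and are not taken from the `era5_sandbox` package, which may do this differently.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

# Made-up hourly 2-m temperatures in kelvin (illustrative values only)
hourly = [
    (datetime(2010, 1, 1, 0), 295.0),
    (datetime(2010, 1, 1, 12), 301.0),
    (datetime(2010, 1, 2, 0), 294.0),
    (datetime(2010, 1, 2, 12), 300.0),
]


def daily_summary(records):
    """Group (timestamp, value) pairs by calendar day and reduce each day.

    Hypothetical helper: returns {date: {"min": ..., "max": ..., "mean": ...}}.
    """
    by_day = defaultdict(list)
    for timestamp, value in records:
        by_day[timestamp.date()].append(value)
    return {
        day: {"min": min(values), "max": max(values), "mean": mean(values)}
        for day, values in sorted(by_day.items())
    }


summary = daily_summary(hourly)
```

Grouping by `timestamp.date()` keeps the sketch dependency-free. Coarser weekly or monthly aggregates could be built the same way by changing the grouping key.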

_proc/sidebar.yml

Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
1+
website:
2+
sidebar:
3+
contents:
4+
- index.ipynb
5+
- 00_core.ipynb
6+
- 01_download_raw_data.ipynb

conf/aggregation/aggregation.yaml

Lines changed: 9 additions & 0 deletions
@@ -0,0 +1,9 @@
1+
daily:
2+
function: "numpy.mean"
3+
string: mean
4+
5+
monthly:
6+
function: "numpy.mean"
7+
string: mean
8+
9+
variable: ['t2m', 'd2m']
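
The aggregation config above stores its reducer as a dotted string (`"numpy.mean"`). One plausible way to turn such a string into a callable is sketched below. This is an assumption about how the value might be consumed, not the package's confirmed approach. The `resolve_function` helper is hypothetical, and `statistics.mean` stands in for `numpy.mean` so the sketch runs without third-party packages.

```python
import importlib
from typing import Callable


def resolve_function(dotted_path: str) -> Callable:
    """Import and return the callable named by a dotted path such as "numpy.mean"."""
    module_name, _, attr_name = dotted_path.rpartition(".")
    if not module_name:
        raise ValueError(f"expected 'module.function', got {dotted_path!r}")
    module = importlib.import_module(module_name)
    func = getattr(module, attr_name)
    if not callable(func):
        raise TypeError(f"{dotted_path!r} is not callable")
    return func


# Stand-in for the parsed aggregation.yaml; "statistics.mean" replaces
# "numpy.mean" here (assumption) to avoid a NumPy dependency in the sketch.
aggregation = {
    "daily": {"function": "statistics.mean", "string": "mean"},
    "monthly": {"function": "statistics.mean", "string": "mean"},
}

daily_func = resolve_function(aggregation["daily"]["function"])
hourly_t2m = [295.0, 296.0, 297.0, 298.0]  # made-up hourly temperatures in kelvin
daily_value = daily_func(hourly_t2m)
```

With NumPy installed, passing `"numpy.mean"` to the same helper would return `numpy.mean`. The separate `string` field then gives a short label for the reducer, for example for use in output names.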

conf/config.yaml

Lines changed: 9 additions & 2 deletions
@@ -1,20 +1,27 @@
11
defaults:
22
- _self_
33
- datapaths: datapaths
4+
- aggregation: aggregation
45

56
development_mode: false
67

78
CDS_API_KEY:
89
path: "$HOME/.cdsapirc"
910

10-
gadm_file: "https://geodata.ucdavis.edu/gadm/gadm4.1/gpkg/gadm41_MDG.gpkg"
11+
GOOGLE_DRIVE_AUTH_JSON:
12+
path: "sandbox/harvard-csph-driveauth-f5f9a2682ecf.json"
13+
healthsheds_id: "healthsheds2022.zip"
14+
15+
mdg_shapefile: "https://data.humdata.org/dataset/26fa506b-0727-4d9d-a590-d2abee21ee22/resource/ed94d52e-349e-41be-80cb-62dc0435bd34/download/mdg_adm_bngrc_ocha_20181031_shp.zip"
16+
17+
dataset: "reanalysis-era5-single-levels"
1118

1219
query:
1320
product_type: reanalysis
1421
# check precipitation
1522
# variable: ["2m_dewpoint_temperature", "2m_temperature", "skin_temperature", "total_precipitation"]
1623
variable: ["2m_dewpoint_temperature", "2m_temperature"]
17-
year: [2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019, 2020, 2021, 2022, 2023, 2024]
24+
year: [2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019, 2020, 2021, 2022, 2023, 2024]
1825
month: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
1926
day: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31]
2027
time: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23]
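
The `query` block above lists every year, month, day, and hour, while the README example in this commit reads a per-month file (`2010_01.nc`). A minimal sketch of splitting the query into one request per year and month follows. The `monthly_requests` helper is hypothetical. Trimming days that do not exist in a given month is an assumption, since the source does not say whether the CDS API requires it.

```python
import calendar
from itertools import product

# Mirror of the `query` block in conf/config.yaml
query = {
    "product_type": "reanalysis",
    "variable": ["2m_dewpoint_temperature", "2m_temperature"],
    "year": list(range(2009, 2025)),
    "month": list(range(1, 13)),
    "day": list(range(1, 32)),
    "time": list(range(24)),
}


def monthly_requests(query):
    """Yield (filename, request) pairs, one per year/month combination.

    Hypothetical helper: the "<year>_<month>.nc" filename pattern follows the
    2010_01.nc example in the README; the day trimming is an assumption.
    """
    for year, month in product(query["year"], query["month"]):
        days_in_month = calendar.monthrange(year, month)[1]
        request = dict(
            query,
            year=[year],
            month=[month],
            day=[d for d in query["day"] if d <= days_in_month],
        )
        yield f"{year}_{month:02d}.nc", request


requests = dict(monthly_requests(query))
```

Each request dict could then be handed to whatever download function the pipeline uses. That step is left out here because it needs network access and CDS credentials.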
