Commit 2931c7a

Update README.md
1 parent 242598e commit 2931c7a

1 file changed: +35 -8 lines changed
README.md

Lines changed: 35 additions & 8 deletions
@@ -5,15 +5,17 @@ SPDX-License-Identifier: CC-BY-4.0
 
 # Snakemake Storage Plugin: Zenodo
 
-A Snakemake storage plugin for downloading files from Zenodo with local caching and intelligent rate limiting.
+A Snakemake storage plugin for downloading files from Zenodo with local caching, checksum verification, and adaptive rate limiting.
 
 ## Features
 
-- **Local caching**: Downloads are cached to avoid redundant transfers
-- **Rate limit handling**: Automatically respects Zenodo's rate limits using `X-RateLimit-*` headers
+- **Local caching**: Downloads are cached to avoid redundant transfers (can be disabled)
+- **Checksum verification**: Automatically verifies MD5 checksums from Zenodo API
+- **Rate limit handling**: Automatically respects Zenodo's rate limits using `X-RateLimit-*` headers with exponential backoff retry
 - **Concurrent download control**: Limits simultaneous downloads to prevent overwhelming Zenodo
 - **Progress bars**: Shows download progress with tqdm
 - **Immutable URLs**: Returns mtime=0 since Zenodo URLs are persistent
+- **Environment variable support**: Configure via environment variables for CI/CD workflows
 
 ## Installation
 
@@ -43,12 +45,20 @@ If you don't explicitly configure it, the plugin will use default settings autom
 ### Settings
 
 - **cache** (optional): Cache directory for downloaded files
-  - Default: `~/.cache/snakemake/pypsaeur`
+  - Default: Platform-dependent user cache directory (via `platformdirs.user_cache_dir("snakemake-pypsa-eur")`)
+  - Set to `""` (empty string) to disable caching
   - Files are cached here to avoid re-downloading
+  - Environment variable: `SNAKEMAKE_STORAGE_ZENODO_CACHE`
+
+- **skip_remote_checks** (optional): Skip metadata checking with Zenodo API
+  - Default: `False` (perform checks)
+  - Set to `True` or `"1"` to skip remote existence/size checks (useful for CI/CD)
+  - Environment variable: `SNAKEMAKE_STORAGE_ZENODO_SKIP_REMOTE_CHECKS`
 
 - **max_concurrent_downloads** (optional): Maximum concurrent downloads
   - Default: `3`
   - Controls how many Zenodo files can be downloaded simultaneously
+  - No environment variable support
 
 ## Usage
 
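Note on the settings hunk: the resolution rules it documents can be sketched as follows. This is a minimal illustration, not the plugin's code. The function names are hypothetical, the precedence of an explicit setting over the environment variable is an assumption the README does not state, accepting `"true"` alongside `"1"` is also an assumption, and the `~/.cache` fallback exists only so the sketch runs without `platformdirs` installed.

```python
import os
from pathlib import Path

ENV_CACHE = "SNAKEMAKE_STORAGE_ZENODO_CACHE"
ENV_SKIP = "SNAKEMAKE_STORAGE_ZENODO_SKIP_REMOTE_CHECKS"


def default_cache_dir() -> Path:
    """Default cache location as described in the README."""
    try:
        from platformdirs import user_cache_dir
        return Path(user_cache_dir("snakemake-pypsa-eur"))
    except ImportError:
        # Fallback for this sketch only; not part of the documented behavior.
        return Path.home() / ".cache" / "snakemake-pypsa-eur"


def resolve_cache_dir(setting=None, environ=os.environ):
    """Return the cache directory, or None when caching is disabled."""
    # Assumption: an explicit setting takes precedence over the environment.
    value = setting if setting is not None else environ.get(ENV_CACHE)
    if value is None:
        return default_cache_dir()
    if value == "":
        return None  # empty string disables caching
    return Path(value).expanduser()


def resolve_skip_remote_checks(setting=None, environ=os.environ) -> bool:
    """Interpret True or "1" as 'skip remote checks'; default is False."""
    value = setting if setting is not None else environ.get(ENV_SKIP, False)
    if isinstance(value, bool):
        return value
    # Assumption: "true" (any case) is accepted in addition to "1".
    return str(value).strip().lower() in {"1", "true"}
```

With both environment variables set as in a CI job (`""` and `"1"`), `resolve_cache_dir()` returns `None` and `resolve_skip_remote_checks()` returns `True`.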
@@ -79,15 +89,30 @@ rule download_data:
 ```
 
 The plugin will:
-1. Check if the file exists in the cache
+1. Check if the file exists in the cache (if caching is enabled)
 2. If cached, copy from cache (fast)
 3. If not cached, download from Zenodo with:
    - Progress bar showing download status
-   - Automatic rate limit handling
+   - Automatic rate limit handling with exponential backoff retry
    - Concurrent download limiting
-4. Store in cache for future use
+   - MD5 checksum verification against Zenodo API metadata
+4. Store in cache for future use (if caching is enabled)
+
+### Example: CI/CD Configuration
+
+For continuous integration environments where you want to skip caching and remote checks:
+
+```yaml
+# GitHub Actions example
+- name: Run snakemake workflows
+  env:
+    SNAKEMAKE_STORAGE_ZENODO_CACHE: ""
+    SNAKEMAKE_STORAGE_ZENODO_SKIP_REMOTE_CHECKS: "1"
+  run: |
+    snakemake --cores all
+```
 
-## Rate Limiting
+## Rate Limiting and Retry
 
 Zenodo API limits:
 - **Guest users**: 60 requests/minute
@@ -97,6 +122,8 @@ The plugin automatically:
 - Monitors `X-RateLimit-Remaining` header
 - Waits when rate limit is reached
 - Uses `X-RateLimit-Reset` to calculate wait time
+- Retries failed requests with exponential backoff (up to 5 attempts)
+- Handles transient errors: HTTP errors, timeouts, checksum mismatches, and network issues
 
 ## URL Handling
 
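Note on the rate-limiting hunk: the header-based wait and the retry policy it describes might look like the sketch below. It is illustrative only. The function names, the 1-second base delay, and the doubling schedule are assumptions (the README states only "exponential backoff" and "up to 5 attempts"), and `X-RateLimit-Reset` is assumed to be a Unix timestamp in seconds.

```python
import time


def rate_limit_wait(headers, now=None):
    """Seconds to wait before the next request, based on rate-limit headers."""
    now = time.time() if now is None else now
    remaining = int(headers.get("X-RateLimit-Remaining", 1))
    if remaining > 0:
        return 0.0
    # Assumption: the reset header is a Unix timestamp in seconds.
    reset = float(headers.get("X-RateLimit-Reset", now))
    return max(0.0, reset - now)


def retry_with_backoff(func, max_attempts=5, base_delay=1.0,
                       retry_on=(OSError, ValueError), sleep=time.sleep):
    """Call `func`, retrying on transient errors with doubling delays."""
    for attempt in range(max_attempts):
        try:
            return func()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # attempts exhausted: surface the last error
            sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, 8s by default
```

The `sleep` parameter is injectable so the schedule can be checked without actually waiting, for example by passing `delays.append` and inspecting the recorded delays.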