Commit 19abc56

test: add initial implementation for locust (load test py)
1 parent 2662b12 commit 19abc56

File tree

9 files changed, +1905 −0 lines changed


tests/load-tests/.gitignore

Lines changed: 20 additions & 0 deletions
@@ -0,0 +1,20 @@
# Environment files with credentials
.env

# Python cache
__pycache__/
*.py[cod]
*$py.class

# Virtual environment
.venv/

# Locust reports
*.html
*_stats.csv
*_stats_history.csv
*_failures.csv
*_exceptions.csv

# Test data backups
test_data.py.bak

tests/load-tests/.python-version

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
3.12

tests/load-tests/README.md

Lines changed: 240 additions & 0 deletions
@@ -0,0 +1,240 @@
# Locust Load Testing for Cardano Rosetta API

This directory contains a **prototype** Locust-based load testing setup for comparing against the existing Apache Bench (`ab`) stability tests.

## 🎯 Purpose (Spike Investigation)

This is a **spike** to evaluate whether Locust can provide better insights than Apache Bench by:

1. **Varying data per request** - avoiding database caching bias
2. **Categorizing data** - revealing performance patterns (light/medium/heavy loads)
3. **Tracking metrics by category** - identifying which data types are slow
4. **Providing richer metrics** - p95/p99, real-time UI, per-endpoint breakdown
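The per-request data variation described above can be sketched in plain Python. This is a minimal illustration of weighted category selection, not the actual locustfile: the pool contents and the `pick_address` helper are hypothetical, with only the category names and weight idea taken from this README.

```python
import random

# Hypothetical categorized pools, mirroring the shape of test_data.py.
ADDRESSES = {
    "light": ["addr_light_1", "addr_light_2"],
    "medium": ["addr_medium_1"],
    "heavy": ["addr_heavy_1"],
}

# Weights steer how often each category is exercised (70/20/10 here).
CATEGORY_WEIGHTS = {"light": 0.7, "medium": 0.2, "heavy": 0.1}

def pick_address() -> tuple[str, str]:
    """Pick a category by weight, then a random address within it."""
    category = random.choices(
        list(CATEGORY_WEIGHTS), weights=list(CATEGORY_WEIGHTS.values())
    )[0]
    return category, random.choice(ADDRESSES[category])

category, address = pick_address()
print(category in ADDRESSES, address in ADDRESSES[category])
```

Because every request draws a fresh value, identical-payload cache bias is avoided while the weights keep the traffic mix realistic.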
## 🏗️ Architecture

```
tests/load-tests/
├── pyproject.toml    # uv dependencies (locust, python-dotenv)
├── locustfile.py     # Main load test with all 7 endpoints
├── test_data.py      # Categorized test data (light/medium/heavy)
└── README.md         # This file
```

## 📦 Setup

```bash
cd tests/load-tests

# Install dependencies with uv
uv sync

# Activate virtual environment
source .venv/bin/activate
```

## 🗄️ Populate Test Data

Before running load tests, you need to populate `test_data.py` with real preprod data.

### Step 1: Port-forward the Yaci Store Database

```bash
# SSH into preview machine and forward PostgreSQL port
ssh -L 5432:localhost:5432 preview
```

### Step 2: Configure Database Connection

```bash
# Copy example environment file
cp .env.example .env

# Edit .env with your database credentials
# DB_HOST=localhost
# DB_PORT=5432
# DB_NAME=preprod
# DB_USER=postgres
# DB_PASSWORD=your_password
```
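Inside the population script these variables are presumably read from the environment after python-dotenv loads `.env`. A stdlib-only sketch of that lookup, with the defaults shown above (the `db_config` helper is hypothetical, not the script's actual code):

```python
import os

def db_config() -> dict:
    """Read DB settings from the environment, falling back to the
    defaults documented in .env.example. (Sketch only: the real
    script would load .env via python-dotenv before reading these.)"""
    return {
        "host": os.getenv("DB_HOST", "localhost"),
        "port": int(os.getenv("DB_PORT", "5432")),
        "dbname": os.getenv("DB_NAME", "preprod"),
        "user": os.getenv("DB_USER", "postgres"),
        "password": os.getenv("DB_PASSWORD", ""),
    }

print(sorted(db_config()))
```

Keeping credentials out of the repo this way is why `.env` is listed in the `.gitignore` above.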
### Step 3: Run the Population Script

```bash
uv run python populate_test_data.py
```

This will:
- Query the Yaci Store database for diverse addresses/blocks/transactions
- Categorize them by performance characteristics (light/medium/heavy)
- Generate `test_data.py` with real preprod data

**Expected output:**
```
📍 Querying addresses...
  ✓ Light addresses (1-10 UTXOs): 10
  ✓ Medium addresses (100-1K UTXOs): 10
  ✓ Heavy addresses (10K+ UTXOs): 10

🧱 Querying blocks...
  ✓ Light blocks (1-5 txs): 5
  ✓ Heavy blocks (100+ txs): 5

📄 Querying transactions...
  ✓ Small transactions (<500 bytes): 10
  ✓ Large transactions (>10KB): 10
```
## 🚀 Usage

### 1. Port-forward Preprod Rosetta Instance

```bash
# SSH into preview server and forward port 8082
ssh -L 8082:localhost:8082 preview
```

### 2. Run Locust

**Web UI Mode (Recommended for exploration):**
```bash
uv run locust --host=http://localhost:8082
```

Then open http://localhost:8089 in your browser to:
- Set number of users
- Set spawn rate
- Monitor real-time metrics
- View charts and breakdowns

**Headless Mode (CI/CD friendly):**
```bash
uv run locust --host=http://localhost:8082 \
  --users 50 \
  --spawn-rate 5 \
  --run-time 300s \
  --headless
```

**Generate HTML Report:**
```bash
uv run locust --host=http://localhost:8082 \
  --users 50 \
  --spawn-rate 5 \
  --run-time 300s \
  --headless \
  --html=report.html \
  --csv=results
```

This creates:
- `report.html` - Interactive HTML report
- `results_stats.csv` - Request statistics
- `results_stats_history.csv` - Time-series data
- `results_failures.csv` - Failure details
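The stats CSV can be post-processed with the standard library. The sketch below extracts p95 per request name from an inline sample shaped like `results_stats.csv`; the column names (`Name`, `95%`) are assumptions about Locust's CSV header, so verify them against the header of your actual file.

```python
import csv
import io

# Inline sample shaped like a Locust results_stats.csv.
# The column names here are assumptions -- check your real file's header.
SAMPLE = """Type,Name,Request Count,Failure Count,95%
POST,/account/balance [light],7000,0,89
POST,/account/balance [heavy],1000,0,890
"""

def p95_by_name(csv_text: str) -> dict[str, float]:
    """Map each request name to its 95th-percentile response time (ms)."""
    return {
        row["Name"]: float(row["95%"])
        for row in csv.DictReader(io.StringIO(csv_text))
    }

print(p95_by_name(SAMPLE))
```

A few lines like this are enough to diff Locust runs against the existing `ab` numbers in CI.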
## 📊 Metrics Provided

Locust provides these metrics **out of the box**:

- **Response time percentiles**: p50, p66, p75, p80, p90, p95, p99
- **Throughput**: Requests per second (RPS)
- **Failure rate**: Count and percentage
- **Per-endpoint breakdown**: All metrics split by endpoint
- **Per-category breakdown**: Metrics split by data category (light/medium/heavy)

### Example Output:

```
/account/balance [light]
  Requests: 7000
  Failures: 0
  Avg: 45.23ms
  p95: 89.12ms
  p99: 123.45ms

/account/balance [heavy]
  Requests: 1000
  Failures: 0
  Avg: 456.78ms
  p95: 890.12ms
  p99: 1234.56ms
```

This clearly shows that **heavy addresses (10K+ UTXOs) are ~10x slower** - something that Apache Bench's identical payloads would miss!
## 🎭 Comparison with Apache Bench

| Feature | Apache Bench | Locust |
|---------|--------------|--------|
| **Data variation** | ❌ Identical payload | ✅ Categorized data |
| **Cache bias** | ❌ Heavy caching | ✅ Avoids caching |
| **Percentiles** | ✅ p95, p99 | ✅ p50-p99 |
| **Real-time UI** | ❌ CLI only | ✅ Web UI |
| **Endpoint weights** | ❌ Manual | ✅ Task decorators |
| **Category tracking** | ❌ Not possible | ✅ Built-in |
| **CI/CD** | ✅ Scriptable | ✅ Headless mode |
| **Reports** | 📊 Text output | 📊 HTML + CSV |
## 📝 Test Data Structure

Data is organized in `test_data.py` by **categories**:

```python
ADDRESSES = {
    "light": [...],   # 1-10 UTXOs (fast)
    "medium": [...],  # 100-1K UTXOs (moderate)
    "heavy": [...]    # 10K+ UTXOs (slow)
}

BLOCKS = {
    "light": [...],  # 1-5 transactions
    "heavy": [...]   # 100+ transactions
}

TRANSACTIONS = {
    "small": [...],  # <500 bytes
    "large": [...]   # >10KB
}
```

**Weights** control distribution:
```python
CATEGORY_WEIGHTS = {
    "address_light": 0.7,  # 70% of requests
    "address_heavy": 0.1,  # 10% of requests
}
```
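Locust groups its statistics by the `name=` argument passed to the HTTP client, which is how the per-category breakdown shown earlier (e.g. `/account/balance [heavy]`) is produced. A sketch of the naming convention (the `tagged_name` helper is hypothetical; only the `name [category]` format comes from this README):

```python
def tagged_name(endpoint: str, category: str) -> str:
    """Build the stats-group name Locust reports under,
    e.g. '/account/balance [heavy]'."""
    return f"{endpoint} [{category}]"

# In a locustfile this string would be passed as the `name=` argument,
# e.g. self.client.post(endpoint, json=payload, name=tagged_name(endpoint, cat)),
# so all requests for a category aggregate into one stats row.
print(tagged_name("/account/balance", "heavy"))
```

Without the `name=` override, Locust would aggregate every `/account/balance` request into a single row and the light-vs-heavy distinction would be lost.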
## 🔧 Next Steps (TODOs)

- [ ] **Populate test_data.py** with actual preprod addresses/blocks/transactions
  - Query preprod Rosetta to find addresses with varying UTXO counts
  - Identify light vs heavy blocks
  - Categorize transactions by size
- [ ] **Run comparison test**: ab vs Locust with identical vs varied data
- [ ] **Document findings**: metrics differences, insights, recommendations
- [ ] **Decide**: Full migration? Hybrid approach? Keep current ab tests?

## 🎯 Success Criteria

This spike is successful if:

1. ✅ Locust can vary data per request
2. ✅ Metrics reveal performance degradation patterns by category
3. ✅ p95/p99 metrics match or exceed ab capabilities
4. ✅ CI/CD integration path is clear
5. ⏳ Comparison shows meaningful differences vs ab

## 🚫 Out of Scope (for this spike)

- Full implementation (prototype only)
- Grafana/monitoring integration
- Automated CSV generation
- Full endpoint coverage (7 endpoints is enough for the spike)

## 📚 Resources

- [Locust Documentation](https://docs.locust.io/)
- [Task #638: Replace ab with Locust](https://github.com/cardano-foundation/cardano-rosetta-java/issues/638)
- Existing ab tests: `../../load-tests/stability_test.py`
