@@ -171,9 +171,9 @@ Users can also install several drivers at once:
 Usage: `data-diff DB1_URI TABLE1_NAME DB2_URI TABLE2_NAME [OPTIONS]`
 
 See the [example command](#example-command-and-output) and the [sample
-connection strings](#supported-databases).
+connection strings](#supported-databases).
 
-Note that for some databases, the arguments that you enter in the command line
+Note that for some databases, the arguments that you enter in the command line
 may be case-sensitive. This is the case for the Snowflake schema and table names.
 
 Options:
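As a sketch of the case-sensitivity note above: Snowflake stores unquoted identifiers in uppercase, so the schema and table names on the command line must match Snowflake's stored spelling exactly. The URI shape, account, and names below are placeholders for illustration, not values from this document:

```shell-session
$ # Hypothetical example: "EVENTS" and "events" are different identifiers to
$ # Snowflake, so pass the name exactly as Snowflake stores it (usually uppercase).
$ data-diff \
    "snowflake://user:password@account/DATABASE/ANALYTICS?warehouse=WH&role=ROLE" EVENTS \
    postgresql://user:password@localhost/db events
```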
@@ -423,11 +423,16 @@ $ docker-compose up -d mysql postgres # run mysql and postgres dbs in background
 
 **3. Run Unit Tests**
 
+There are more than 1000 tests for all the different type and database
+combinations, so we recommend using `unittest-parallel`, which is installed as a
+development dependency.
+
 ```shell-session
-$ poetry run python3 -m unittest
+$ poetry run unittest-parallel -j 16         # run all tests
+$ poetry run python -m unittest -k <test>    # run an individual test
 ```
 
-**4. Seed the Database(s)**
+**4. Seed the Database(s) (optional)**
 
 First, download the CSVs of seeding data:
 
@@ -451,7 +456,7 @@ $ poetry run preql -f dev/prepare_db.pql mssql://<uri>
 $ poetry run preql -f dev/prepare_db.pql bigquery:///<project>
 ```
 
-**5. Run **data-diff** against seeded database**
+**5. Run **data-diff** against seeded database (optional)**
 
 ```bash
 poetry run python3 -m data_diff postgresql://postgres:Password1@localhost/postgres rating postgresql://postgres:Password1@localhost/postgres rating_del1 --verbose
@@ -460,7 +465,14 @@ poetry run python3 -m data_diff postgresql://postgres:Password1@localhost/postgr
 **6. Run benchmarks (optional)**
 
 ```shell-session
-$ dev/benchmark.sh
+$ dev/benchmark.sh                  # runs benchmarks and puts results in benchmark_<sha>.csv
+$ poetry run python3 dev/graph.py   # create graphs from the benchmark_*.csv files
+```
+
+You can adjust how many rows are benchmarked by passing `N_SAMPLES` to `dev/benchmark.sh`:
+
+```shell-session
+$ N_SAMPLES=100000000 dev/benchmark.sh    # 100M rows, our canonical target
 ```
 
 