1 parent 8bffb5d commit 7526a90
benchmarks/summary.md
@@ -22,6 +22,8 @@ Date | Device | dtype | batch size | cache length |max input length |max output
----| ------- | ------ |---------- | -------------|-----------------|------------------|----------------------
2024-05-14 | TPU v5e-8 | bfloat16 | 512 | 2048 | 1024 | 1024 | 8700
2024-05-14 | TPU v5e-8 | int8 | 1024 | 2048 | 1024 | 1024 | 8746
+2024-06-13 | TPU v5e-1 | bfloat16 | 1024 | 2048 | 1024 | 1024 | 4249
+
** NOTE: ** Gemma 2B uses `--shard_on_batch` flag so it's data parallel instead
of model parallel.
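
To illustrate the distinction the note draws, here is a minimal pure-Python sketch. It is not the benchmark's code and does not show how `--shard_on_batch` is implemented. The helper names (`matmul`, `data_parallel`, `model_parallel`) and the notion of a "device" as a list slice are assumptions made purely for illustration. Data parallel splits the batch rows across devices, each holding the full weights. Model parallel splits the weight columns across devices, each seeing the whole batch.

```python
def matmul(x, w):
    # Plain-Python matrix product: x is (rows x inner), w is (inner x cols).
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*w)] for row in x]


def data_parallel(x, w, n_devices):
    # Hypothetical devices: each holds a full copy of w and one slice of the batch.
    # Assumes len(x) is divisible by n_devices.
    step = len(x) // n_devices
    out = []
    for i in range(n_devices):
        batch_shard = x[i * step:(i + 1) * step]
        out.extend(matmul(batch_shard, w))  # concatenate along the batch axis
    return out


def model_parallel(x, w, n_devices):
    # Hypothetical devices: each holds one slice of w's columns and sees the whole batch.
    # Assumes len(w[0]) is divisible by n_devices.
    step = len(w[0]) // n_devices
    parts = []
    for i in range(n_devices):
        w_shard = [row[i * step:(i + 1) * step] for row in w]
        parts.append(matmul(x, w_shard))
    # Concatenate each row's partial outputs along the feature axis.
    return [sum((p[r] for p in parts), []) for r in range(len(x))]


x = [[1, 2], [3, 4], [5, 6], [7, 8]]   # batch of 4 examples
w = [[1, 0, 2, 1], [0, 1, 1, 3]]       # 2 x 4 weight matrix
full = matmul(x, w)
dp = data_parallel(x, w, n_devices=2)
mp = model_parallel(x, w, n_devices=2)
```

Both strategies reproduce the unsharded result in this toy case. They differ in what each device must store (all weights versus a slice of them) and in which axis is gathered afterwards.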