entries/abouchez/README.md
13 additions & 7 deletions
@@ -115,16 +115,22 @@ So we first need to find out which options leverage at best the hardware it runs
The https://github.com/gcarreno/1brc-ObjectPascal challenge hardware is a Ryzen 9 5950X with 16 cores / 32 threads and 64MB of L3 cache. Since each thread uses around 2.5MB of its own data, we should try several options with 16, 24 or 32 threads, for instance:
```
-./abouchez measurements.txt -v -t=8
-./abouchez measurements.txt -v -t=16
-./abouchez measurements.txt -v -t=24
-./abouchez measurements.txt -v -t=32
-./abouchez measurements.txt -v -t=16 -a
-./abouchez measurements.txt -v -t=24 -a
-./abouchez measurements.txt -v -t=32 -a
+time ./abouchez measurements.txt -v -t=8
+time ./abouchez measurements.txt -v -t=16
+time ./abouchez measurements.txt -v -t=24
+time ./abouchez measurements.txt -v -t=32
+time ./abouchez measurements.txt -v -t=16 -a
+time ./abouchez measurements.txt -v -t=24 -a
+time ./abouchez measurements.txt -v -t=32 -a
125
```
Please run these command lines to find out which parameters should be used for the benchmark, i.e. which ones give the best results on the actual benchmark PC with its Ryzen 9 CPU. We will see whether core affinity makes a difference here.
+Then we could run:
+```
+time ./abouchez measurements.txt -v -t=1
+```
+This `-t=1` run is for fun: it runs the process in a single thread. It helps to gauge how optimized (and lock-free) our parsing code is, and to validate the multi-core abilities of the CPU. In a perfect world, the `real` time of the other `-t=##` runs should be the single-thread `real` time divided by the number of working threads, and the `user` value reported by `time` should remain almost the same as we add threads, up to the number of CPU cores.
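The scaling check described above can be turned into numbers. What follows is only a minimal sketch and not part of the `abouchez` tool: `speedup` is a hypothetical helper, and its arguments are the `real` values reported by `time` on your own machine, converted to seconds by hand.

```shell
# Hypothetical helper (not part of abouchez): compare two `real` timings.
#   $1 = `real` seconds of the -t=1 run
#   $2 = `real` seconds of the -t=N run
#   $3 = N, the number of working threads
speedup() {
  awk -v t1="$1" -v tn="$2" -v n="$3" 'BEGIN {
    s = t1 / tn                      # how many times faster than one thread
    printf "speedup=%.2f efficiency=%.0f%%\n", s, 100 * s / n
  }'
}
```

With made-up timings, `speedup 32 2 16` prints `speedup=16.00 efficiency=100%`, which is the perfect-world case. An efficiency well below 100% means the `real` time is not dividing evenly across the threads.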
+
## Feedback Needed
Here we will put some additional information, once our proposal has been run on the benchmark hardware.