entries/ghatem-fpc/README.md (14 additions, 0 deletions)

@@ -247,3 +247,17 @@ Another trial with various hash functions, a simple modulus vs. a slightly more

Can be tested with the `HASHMULT` build option.

Finally, it seems that choosing a dictionary size that is a prime number is also recommended: it shaves 1 second out of 20 on my PC.
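
As a rough illustration of that choice (this helper is not part of the entry, and the target capacity of 50000 is arbitrary), a small routine can round a desired capacity up to the nearest prime before the dictionary is allocated.

```pascal
program NextPrimeSketch;
{$mode objfpc}

{ trial division is enough for a one-off size computation at startup }
function IsPrime(N: Cardinal): Boolean;
var
  D: Cardinal;
begin
  if N < 2 then Exit(False);
  D := 2;
  while D * D <= N do
  begin
    if N mod D = 0 then Exit(False);
    Inc(D);
  end;
  Result := True;
end;

{ smallest prime >= N, usable as a dictionary size }
function NextPrime(N: Cardinal): Cardinal;
begin
  Result := N;
  while not IsPrime(Result) do
    Inc(Result);
end;

begin
  WriteLn(NextPrime(50000));  { prints the smallest prime at or above 50000 }
end.
```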
## v.6 (2024-05-04)

As of the latest results run by Paweld, there are two main bottlenecks throttling the entire implementation, according to Callgrind and KCachegrind:
- the function `ExtractLineData`, at 23% of total cost, of which 9% is due to `fpc_stackcheck`
- the hash lookup function, at 40% of total cost

Currently, the hash lookup is done on an array of records. Increasing the array size causes slowness, and reducing it causes more collisions.
Will try to see how to reduce collisions (i.e. increase the array size) while minimizing the cost of cache misses.
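
A rough sketch of that kind of structure (not this entry's actual code; the record layout, hash function, and table size below are made up for illustration): one flat array of records probed linearly, with a prime size as noted earlier. Growing `TableSize` trades collisions for cache misses, which is exactly the tension described above.

```pascal
program HashSketch;
{$mode objfpc}

const
  TableSize = 65537;  { a prime, per the earlier note; value chosen arbitrarily }

type
  TEntry = record
    Key: AnsiString;
    Count: Integer;
  end;

var
  { one contiguous array of records: probing walks neighbouring slots,
    so a short probe sequence usually stays in already-loaded cache lines }
  Table: array[0..TableSize - 1] of TEntry;

{ simple multiplicative string hash, for illustration only }
function HashOf(const S: AnsiString): Cardinal;
var
  I: Integer;
begin
  Result := 0;
  for I := 1 to Length(S) do
    Result := Result * 31 + Ord(S[I]);
end;

{ linear probing: walk forward from the hashed slot until the key is
  found or an empty slot marks the insertion point
  (a completely full table is not handled in this sketch) }
function Lookup(const Key: AnsiString): Integer;
var
  Idx: Cardinal;
begin
  Idx := HashOf(Key) mod TableSize;
  while (Table[Idx].Key <> '') and (Table[Idx].Key <> Key) do
    Idx := (Idx + 1) mod TableSize;
  if Table[Idx].Key = '' then
    Table[Idx].Key := Key;
  Result := Idx;
end;

begin
  Inc(Table[Lookup('Hamburg')].Count);
  Inc(Table[Lookup('Hamburg')].Count);
  WriteLn('Hamburg seen ', Table[Lookup('Hamburg')].Count, ' times');  { prints 2 }
end.
```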
For `ExtractLineData`, three ideas to try implementing (see the sketch after this list):
- avoid using a function, to get rid of the cost of stack checking
- reduce branching
- unroll the loop (although I had tried this in the past, it did not show any improvements)
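
A rough sketch of the first two ideas, assuming input lines of the form `StationName;-12.3` (the names and layout are invented; this is not the entry's `ExtractLineData`): the parsing runs inline in the caller's loop, so there is no per-line function call and no stack-check prologue, and the temperature is decoded with a single branch on the position of the decimal point.

```pascal
program ParseSketch;
{$mode objfpc}
var
  Line: AnsiString;
  SemiPos, P, Neg, Temp: Integer;  { Temp holds tenths of a degree }
begin
  Line := 'Hamburg;-12.3';

  { parsing happens inline in the caller's loop: no function call per line,
    hence no fpc_stackcheck prologue for it }
  SemiPos := Pos(';', Line);

  { sign handled arithmetically instead of with a separate code path }
  P := SemiPos + 1;
  Neg := Ord(Line[P] = '-');   { 1 if negative, 0 otherwise }
  Inc(P, Neg);

  { values are assumed to be d.d or dd.d, so one branch on the
    position of the '.' is enough to pick the right digit layout }
  if Line[P + 1] = '.' then
    Temp := (Ord(Line[P]) - Ord('0')) * 10
            + (Ord(Line[P + 2]) - Ord('0'))
  else
    Temp := (Ord(Line[P]) - Ord('0')) * 100
            + (Ord(Line[P + 1]) - Ord('0')) * 10
            + (Ord(Line[P + 3]) - Ord('0'));
  if Neg = 1 then
    Temp := -Temp;

  WriteLn(Copy(Line, 1, SemiPos - 1), ': ', Temp, ' (tenths of a degree)');
end.
```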