Commit e6d3619

Author: Arnaud Bouchez

now I hope we are OK with the README format

1 parent da737f4 commit e6d3619

1 file changed

+7
-5
lines changed

entries/abouchez/README.md

Lines changed: 7 additions & 5 deletions
@@ -1,12 +1,14 @@
-# mORMot version of The One Billion Row Challenge by Arnaud Bouchez
+# Arnaud Bouchez
+
+**mORMot entry to The One Billion Row Challenge in Object Pascal.**
 
 ## mORMot 2 is Required
 
 This entry requires the **mORMot 2** package to compile.
 
 Download it from https://github.com/synopse/mORMot2
 
-It is better to fork the current state of the mORMot 2 repository, or get the latest release.
+It is better to fork the current state of the *mORMot 2* repository, or get the latest release.
 
 ## Licence Terms
 
@@ -29,15 +31,15 @@ Here are the main ideas behind this implementation proposal:
 - Parse temperatures with a dedicated code (expects single decimal input values);
 - No memory allocation (e.g. no transient `string` or `TBytes`) nor any syscall is done during the parsing process to reduce contention and ensure the process is only CPU-bound and RAM-bound (we checked this with `strace` on Linux);
 - Pascal code was tuned to generate the best possible asm output on FPC x86_64 (which is our target);
-- Some dedicated x86_64 asm has been written to replace mORMot `crc32c` and `MemCmp` general-purpose functions and gain a last few percents (nice to have);
+- Some dedicated x86_64 asm has been written to replace *mORMot* `crc32c` and `MemCmp` general-purpose functions and gain a last few percents (nice to have);
 - Can optionally output timing statistics and hash value on the console to debug and refine settings (with the `-v` command line switch);
 - Can optionally set each thread affinity to a single core (with the `-a` command line switch).
 
 The "64 bytes cache line" trick is quite unique among all implementations of the "1brc" I have seen in any language - and it does make a noticeable difference in performance. The L1 cache is well known to be the main bottleneck for any efficient in-memory process. We are very lucky the station names are just big enough to fill no more than 64 bytes, with min/max values reduced as 16-bit smallint - resulting in temperature range of -3276.7..+3276.8 which seems fair on our planet according to the IPCC. ;)
 
 ## Usage
 
-If you execute the `abouchez` executable without any parameter, it will give you some hints about its usage (using mORMot `TCommandLine` abilities):
+If you execute the `abouchez` executable without any parameter, it will give you some hints about its usage (using *mORMot* `TCommandLine` abilities):
 
 ```
 ab@dev:~/dev/github/1brc-ObjectPascal/bin$ ./abouchez
@@ -139,6 +141,6 @@ Stay tuned!
 
 ## Ending Note
 
-There is a "pure mORMot" name lookup version available if you undefine the `CUSTOMHASH` conditional, which is around 40% slower, because it needs to copy the name into the stack before using `TDynArrayHashed`, and has a little more overhead.
+There is a "*pure mORMot*" name lookup version available if you undefine the `CUSTOMHASH` conditional, which is around 40% slower, because it needs to copy the name into the stack before using `TDynArrayHashed`, and has a little more overhead.
 
 Arnaud :D
