Commit caf5f8b

Merge branch 'gcarreno:main' into main
2 parents 78c47db + 47c0d1c commit caf5f8b

1 file changed

+53
-38
lines changed

README.md

Lines changed: 53 additions & 38 deletions
@@ -5,7 +5,9 @@ This is the repository that will coordinate the 1 Billion Row Challenge for Obje
55
The One Billion Row Challenge (1BRC) is a fun exploration of how far modern Object Pascal can be pushed for aggregating one billion rows from a text file.
66
Grab all your threads, reach out to SIMD, or pull any other trick, and create the fastest implementation for solving this task!
77

8+
<p align="center">
89
<img src="img/1brc.png" alt="1BRC" style="display: block; margin-left: auto; margin-right: auto; margin-bottom:1em; width: 50%;">
10+
</p>
911

1012
The text file contains temperature values for a range of weather stations. Each row is one measurement in the format `<string: station name>;<double: measurement>`, with the measurement value having exactly one fractional digit.
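Because every measurement has exactly one fractional digit, a parser can treat it as an integer number of tenths and avoid floating-point parsing altogether. The sketch below illustrates the idea in Python purely for readability (entries themselves must be Object Pascal); the function names and the sample row are made up for this example.

```python
def split_row(line: str) -> tuple[str, str]:
    """Split one `<station name>;<measurement>` row into its two text fields."""
    # Split on the last semicolon so only the final field is treated as the number.
    station, _, measurement = line.rstrip("\n").rpartition(";")
    return station, measurement


def parse_tenths(measurement: str) -> int:
    """Parse a measurement with exactly one fractional digit as a count of tenths."""
    negative = measurement.startswith("-")
    whole, _, fraction = measurement.lstrip("-").partition(".")
    tenths = int(whole) * 10 + int(fraction)
    return -tenths if negative else tenths


# Illustrative row, not taken from the real data set.
station, measurement = split_row("Abha;-23.7\n")
print(station, parse_tenths(measurement))
```

Keeping the values as integers until the final division is one possible design choice; it is not something the challenge requires.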
1113
The following shows ten rows as an example:
@@ -29,33 +31,16 @@ The task is to write an Object Pascal program which reads the file, calculates t
2931
{Abha=-23.0/18.0/59.2, Abidjan=-16.2/26.0/67.3, Abéché=-10.0/29.4/69.0, Accra=-10.1/26.4/66.4, Addis Ababa=-23.7/16.0/67.0, Adelaide=-27.8/17.3/58.5, ...}
3032
```
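As a rough illustration of how such an output line could be produced, the sketch below aggregates rows per station and prints them in the `name=a/b/c` shape shown above. It is written in Python only for illustration and is not a valid entry. Reading the three values as min/mean/max, sorting station names by code point, and using Python's default rounding are all assumptions, since the task statement is cut off in this view.

```python
from typing import Iterable


def aggregate(lines: Iterable[str]) -> dict[str, list[float]]:
    """Collect [min, max, sum, count] per station from `<station>;<value>` rows."""
    stats: dict[str, list[float]] = {}
    for line in lines:
        station, _, raw = line.rstrip("\n").rpartition(";")
        value = float(raw)
        entry = stats.get(station)
        if entry is None:
            stats[station] = [value, value, value, 1]
        else:
            entry[0] = min(entry[0], value)
            entry[1] = max(entry[1], value)
            entry[2] += value
            entry[3] += 1
    return stats


def format_stats(stats: dict[str, list[float]]) -> str:
    """Render the statistics as `{name=min/mean/max, ...}`, sorted by station name."""
    parts = []
    for station in sorted(stats):
        low, high, total, count = stats[station]
        parts.append(f"{station}={low:.1f}/{total / count:.1f}/{high:.1f}")
    return "{" + ", ".join(parts) + "}"


# Made-up rows, used only to exercise the two functions.
rows = ["Abha;10.0", "Accra;-5.5", "Abha;20.0"]
print(format_stats(aggregate(rows)))
```

A single pass with one small record per station keeps memory use independent of the number of rows, which matters at one billion lines.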
3133

32-
## Honour Mentions
33-
34-
I'd like to thank [@paweld](https://github.com/paweld) for taking us from my miserable 20m attempt, to a whopping ~25s, beating the [Python script](https://github.com/gunnarmorling/1brc/blob/main/src/main/python/create_measurements.py) by about 4 and a half minutes.
35-
36-
I'd like to thank [@mobius](https://github.com/mobius1qwe) for taking the time to provide the Delphi version of the generator.
37-
38-
## Links
39-
40-
The original repository: https://github.com/gunnarmorling/1brc
41-
42-
I found out about it by watching this video about an attempt in Go: https://www.youtube.com/watch?v=cYng524S-MA
43-
44-
The blog post in question: https://www.bytesizego.com/blog/one-billion-row-challenge-go
45-
4634
## Entering The Challenge
47-
48-
Submissions will be via a `PR` (Pull Request) to this repository.
49-
35+
Submissions will be via a `PR` (Pull Request) to this repository. \
5036
The challenge will run from the 10th of March until the 10th of May, 2024.
5137

5238
When creating your entry, please do as follows:
5339
1. Create a folder under `entries` with your first initial and last name, e.g., for Gustavo Carreno: `entries/gcarreno`.
54-
- If you're worried about anonymity, because the Internet stinks, feel free to use a fictional one:
55-
Bruce Wayne, Clark Kent, James Logan, Peter Parker, Diana of Themyscira. Your pick!
56-
2. Create a `README.md` with some content about your approach, e.g., `entries/gcarreno/README.md`.
57-
3. Put all your code under `entries/<your name>/src`, e.g., `entries/gcarreno/src`.
58-
4. If you need to provide a custom `.gitignore` for something not present in the main one, please do.
40+
2. If you're worried about anonymity, because the Internet stinks, feel free to use a fictional name instead: Bruce Wayne, Clark Kent, James Logan, Peter Parker, Diana of Themyscira. Your pick!
41+
3. Create a `README.md` with some content about your approach, e.g., `entries/gcarreno/README.md`.
42+
4. Put all your code under `entries/<your name>/src`, e.g., `entries/gcarreno/src`.
43+
5. If you need to provide a custom `.gitignore` for something not present in the main one, please do.
5944

6045
This challenge is mainly to allow us to learn something new. This means that copying code from others will be allowed, under these conditions:
6146
1. You can only use pure Object Pascal with no calls to any operating system's `API` or external `C/C++` libraries.
@@ -64,39 +49,62 @@ This challenge is mainly to allow us to learn something new. This means that cop
6449
4. It adds something of value, not just a different code formatting.
6550
5. All code should be formatted with the `IDE`'s default formatting tool.
6651

67-
In order to produce the One Billion Rows of text, we are providing the source code for the official generator, so we all have the same entry data.
52+
Submit your implementation and become part of the leaderboard!
6853

69-
> **NOTE**
70-
>
54+
## Generating the measurements.txt
55+
> **NOTE** \
7156
> We now have both a Lazarus version and a Delphi version of the generator for both 32-bit and 64-bit.
7257
73-
Submit your implementation and become part of the leaderboard!
58+
In order to produce the One Billion Rows of text, we are providing the [source code](./generator) for the official generator, so we all have the same entry data.
59+
60+
| Parameter | Description |
61+
|:----------|:------------|
62+
| -h or --help | Writes this help message and exits |
63+
| -v or --version | Writes the version and exits |
64+
| -i or --input-file <filename> | The file containing the Weather Stations |
65+
| -o or --output-file <filename> | The file that will contain the generated lines |
66+
| -n or --line-count <number> | The number of lines to be generated (can use 1_000_000_000) |
7467

75-
## Results
7668

69+
### Verify
70+
You can verify the generated `measurements.txt` with a `SHA256` utility:
71+
72+
**Linux**
73+
```sh
74+
$ sha256sum ./data/measurements.txt
75+
```
76+
**Windows (PowerShell)**
77+
```powershell
78+
Get-FileHash .\data\measurements.txt -Algorithm SHA256
79+
```
80+
Expected `SHA256` hash:
81+
`ebad17b266ee9f5cb3d118531f197e6f68c9ab988abc5cb9506e6257e1a52ce6`
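Where neither utility is at hand, the same check can be sketched in Python with the standard `hashlib` module. The function names and the chunk size are illustrative choices; the expected hash is the one quoted above.

```python
import hashlib

# Expected hash as quoted in this README.
EXPECTED_SHA256 = "ebad17b266ee9f5cb3d118531f197e6f68c9ab988abc5cb9506e6257e1a52ce6"


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in fixed-size chunks so a multi-gigabyte file never sits in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        while chunk := handle.read(chunk_size):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path: str) -> bool:
    """Return True when the file's SHA256 equals the expected value."""
    return sha256_of_file(path) == EXPECTED_SHA256
```

Calling `matches_expected("./data/measurements.txt")` would then report whether the generated file matches.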
82+
83+
> **NOTE**
84+
>
85+
> I'm still being lazy and I need to do the baseline in order for us to have the same `SHA256` value for an official output.
86+
87+
## Results
7788
These are the results from running all entries into the challenge on my personal computer:
7889
- Ubuntu 23.10 64-bit
7990
- Ryzen 9 5950X 16 cores
8091
- 32GB RAM
8192
- 250GB SSD
8293
- 1TB HDD
8394

84-
| # | Result (m:s.ms): SSD | Result (m:s.ms): HDD | Submitter | Notes | Certificates |
85-
|---|----------------------|----------------------|---------------|-----------|--------------|
86-
| 1 | 0:29.212 | 2:2.504 | Székely Balázs | Using 16 threads | |
95+
| # | Result (m:s.ms): SSD | Result (m:s.ms): HDD | Compiler | Submitter | Notes | Certificates |
96+
|--:|---------------------:|---------------------:|:---------|:--------------|:----------|:-------------|
97+
| 1 | 0:29.212 | 2:2.504 | lazarus-3.0, fpc-3.2.2 | Székely Balázs | Using 16 threads | |
8798

8899
## Evaluating Results
89-
90-
Each contender is run 10 times in a row for both `SSD` and `HDD`, using `hyperfine` to measure the time.
91-
The mean value of the 10 runs is the result for that contender and will be added to the results table above.
100+
Each contender is run 10 times in a row for both `SSD` and `HDD`, using `hyperfine` to measure the time. \
101+
The mean value of the 10 runs is the result for that contender and will be added to the results table above. \
92102
The exact same `measurements.txt` file is used for evaluating all contenders.
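To make the arithmetic concrete, the sketch below averages a set of run times and formats the mean in the `m:s.ms` style of the results table. It is written in Python for illustration only; the run times are hypothetical, and zero-padding the seconds field is a choice made here.

```python
from statistics import mean


def format_result(seconds: float) -> str:
    """Format a duration in seconds as m:ss.mmm."""
    total_ms = round(seconds * 1000)
    minutes, remaining_ms = divmod(total_ms, 60_000)
    whole_seconds, milliseconds = divmod(remaining_ms, 1000)
    return f"{minutes}:{whole_seconds:02d}.{milliseconds:03d}"


# Hypothetical wall-clock times, in seconds, for the 10 runs of one contender.
run_times = [29.0, 29.5, 29.0, 29.5, 29.0, 29.5, 29.0, 29.5, 29.0, 29.5]
print(format_result(mean(run_times)))
```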
93103

94104
## Prize
95-
96105
This is being run for bragging rights only and the fun of such a challenge.
97106

98107
## FAQ
99-
100108
_Q: Can I copy code from other submissions?_\
101109
A: Yes, you can. The primary focus of the challenge is about learning something new, rather than "winning". When you do so, please give credit to the relevant source submissions. Please don't re-submit other entries with no or only trivial improvements.
102110

@@ -106,11 +114,18 @@ A: The file is encoded with UTF-8.
106114
_Q: Which operating system is used for evaluation?_\
107115
A: Ubuntu 23.10.
108116

109-
## License
117+
## Honour Mentions
118+
I'd like to thank [@paweld](https://github.com/paweld) for taking us from my miserable 20m attempt, to a whopping ~25s, beating the [Python script](https://github.com/gunnarmorling/1brc/blob/main/src/main/python/create_measurements.py) by about 4 and a half minutes.\
119+
I'd like to thank [@mobius](https://github.com/mobius1qwe) for taking the time to provide the Delphi version of the generator.
110120

121+
## Links
122+
The original repository: https://github.com/gunnarmorling/1brc \
123+
I found out about it by watching this video about an attempt in Go: https://www.youtube.com/watch?v=cYng524S-MA \
124+
The blog post in question: https://www.bytesizego.com/blog/one-billion-row-challenge-go
125+
126+
## License
111127
This code base is available under the MIT License.
112128

113129
## Code of Conduct
114-
115-
Be excellent to each other!
130+
Be excellent to each other!\
116131
More than winning, the purpose of this challenge is to have fun and learn something new.
