Commit bfbfe92

Merge pull request #2291 from Jamesbarford/chore/update-schema.md
Update schema.md
2 parents c8b6fc1 + b0a4241 commit bfbfe92

1 file changed

+76
-104
lines changed

database/schema.md

Lines changed: 76 additions & 104 deletions
@@ -50,33 +50,22 @@ For runtime benchmarks the schema very similar, but there are different table na
5050

5151
A description of a rustc compiler artifact being benchmarked.
5252

53-
This description includes:
54-
* name: usually a commit sha or a tag like "1.51.0" but is free-form text so can be anything.
55-
* date: the date associated with this compiler artifact (usually only when the name is a commit)
56-
* type: currently one of "master" (i.e., we're testing a merge commit), "try" (someone is testing a PR), and "release" (usually a release candidate - though local compilers also get labeled like this).
53+
Columns:
5754

58-
```
59-
sqlite> select * from artifact limit 1;
60-
id name date type
61-
---------- ---------- ---------- -------
62-
1 LOCAL_TEST release
63-
```
55+
* **name** (`text`): Usually a commit sha such as "fefce3cecd69cebf2d7c9aa3dd90a84379f4201a" or a tag like "1.51.0", but it is free-form text so could be anything.
56+
* **date** (`timestamptz`): The date associated with this compiler artifact (usually only when the name is a commit)
57+
* **type** (`text`): Currently one of "master" (i.e., we're testing a merge commit), "try" (someone is testing a PR), and "release" (usually a release candidate - though local compilers also get labeled like this).
6458

6559
### artifact size
6660

6761
Records the size of individual components (like `librustc_driver.so` or `libLLVM.so`) of a single
6862
artifact.
6963

70-
This description includes:
71-
* component: normalized name of the component (hashes are removed)
72-
* size: size of the component in bytes
64+
Columns:
7365

74-
```
75-
sqlite> select * from artifact_size limit 1;
76-
aid component size
77-
---------- ---------- ----------
78-
1 libLLVM.so 177892352
79-
```
66+
* **aid** (`integer`): Artifact id, references the id in the artifact table
67+
* **component** (`text`): Normalized name of the component (hashes are removed)
68+
* **size** (`integer`): Size of the component in bytes
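
As a rough illustration of how `aid` ties `artifact_size` back to `artifact`, the sketch below builds simplified in-memory SQLite versions of the two tables and sums the component sizes per artifact. The table definitions are assumptions inferred from the column lists above, not the project's actual DDL; the SQLite types stand in for the listed ones, and the `libstd.so` row is made up for the example.

```python
import sqlite3

# Simplified stand-ins for the real tables (assumed, not the actual DDL).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE artifact (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        date INTEGER,
        type TEXT NOT NULL
    );
    CREATE TABLE artifact_size (
        aid       INTEGER NOT NULL REFERENCES artifact(id),
        component TEXT NOT NULL,
        size      INTEGER NOT NULL
    );
    """
)

conn.execute(
    "INSERT INTO artifact (id, name, date, type) VALUES (1, 'LOCAL_TEST', NULL, 'release')"
)
conn.executemany(
    "INSERT INTO artifact_size (aid, component, size) VALUES (?, ?, ?)",
    [
        (1, "libLLVM.so", 177892352),
        (1, "libstd.so", 1000),  # hypothetical second component
    ],
)

# Total size of all recorded components for each artifact.
total_sizes = conn.execute(
    """
    SELECT a.name, SUM(s.size)
    FROM artifact AS a
    JOIN artifact_size AS s ON s.aid = a.id
    GROUP BY a.id, a.name
    """
).fetchall()
print(total_sizes)
```
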
8069

8170
### collection
8271

@@ -88,34 +77,31 @@ This is a way to collect statistics together signifying that they belong to the
8877

8978
Currently, the collection also marks the git sha of the currently running collector binary.
9079

91-
```
92-
sqlite> select * from collection limit 1;
93-
id perf_commit
94-
---------- ----------------------------------------
95-
1 d9fd96f409a15429757030f225b082744a72516c
96-
```
80+
Columns:
81+
82+
* **id** (`integer`): Unique identifier
83+
* **perf_commit** (`text`): Commit sha / tag
9784

9885
### collector_progress
9986

10087
Keeps track of the collector's start and finish time as well as which step it's currently on.
10188

102-
```
103-
sqlite> select * from collector_progress limit 1;
104-
aid step start end
105-
---------- ---------- ---------- ----------
106-
1 helloworld 1625829961 1625829965
107-
```
89+
Columns:
90+
91+
* **aid** (`integer`): Artifact id, references the id in the artifact table
92+
* **step** (`text`): The step the collector is currently benchmarking
93+
* **start** (`timestamptz`): When the collector started
94+
* **end** (`timestamptz`): When the collector finished
10895

10996
### artifact_collection_duration
11097

11198
Records how long benchmarking takes in seconds.
11299

113-
```
114-
sqlite> select * from artifact_collection_duration limit 1;
115-
aid date_recorded duration
116-
---------- ------------- ----------
117-
1 1625829965 4
118-
```
100+
Columns:
101+
102+
* **aid** (`integer`): Artifact id, references the id in the artifact table
103+
* **date_recorded** (`timestamptz`): When this was recorded
104+
* **duration** (`integer`): How long the benchmarking took in seconds
119105

120106
### benchmark
121107

@@ -127,35 +113,28 @@ and its category. The benchmark name is used as a foreign key in many of the oth
127113
Category is either `primary` (real-world benchmark) or `secondary` (stress test).
128114
Stable benchmarks have `category` set to `primary` and `stabilized` set to `1`.
129115

130-
```
131-
sqlite> select * from benchmark limit 1;
132-
name stabilized category
133-
---------- ---------- ----------
134-
helloworld 0 primary
135-
```
116+
Columns:
117+
118+
* **name** (`text`): Name of the benchmark
119+
* **stabilized** (`boolean`): Whether the benchmark supports stable
120+
* **category** (`category`): `primary` if this is a 'real-world' benchmark or `secondary` if a 'stress test'
136121

137122
### pstat_series
138123

139124
Describes the parametrization of a compile-time benchmark. Contains a unique combination
140125
of a crate, profile, scenario and the metric being collected.
141126

142-
* crate (aka `benchmark`): the benchmarked crate which might be a crate from crates.io or a crate made specifically to stress some part of the compiler.
143-
* profile: what type of compilation is happening - check build, optimized build (a.k.a. release build), debug build, or doc build.
144-
* scenario: describes how much of the incremental cache is full. An empty incremental cache means that the compiler must do a full build.
145-
* backend: codegen backend used for compilation.
146-
* metric: the type of metric being collected.
127+
Columns:
147128

148-
This corresponds to a [`statistic description`](../docs/glossary.md).
129+
* **crate** (`text`) (aka `benchmark`): The benchmarked crate which might be a crate from crates.io or a crate made specifically to stress some part of the compiler.
130+
* **profile** (`text`): What type of compilation is happening - check build, optimized build (a.k.a. release build), debug build, or doc build.
131+
* **scenario** (`text`): Describes how much of the incremental cache is full. An empty incremental cache means that the compiler must do a full build.
132+
* **backend** (`text`): Codegen backend used for compilation, for example 'llvm'
133+
* **metric** (`text`): The type of metric being collected.
149134

150-
There is a separate table for this collection to avoid duplicating crates, profiles, scenarios etc.
151-
many times in the `pstat` table.
135+
This corresponds to a [`statistic description`](../docs/glossary.md).
152136

153-
```
154-
sqlite> select * from pstat_series limit 1;
155-
id crate profile scenario backend target metric
156-
---------- ---------- ---------- ---------- ------- ------------ ------------
157-
1 helloworld check full llvm x86_64-linux-unknown-gnu task-clock:u
158-
```
137+
There is a separate table for this collection to avoid duplicating crates, profiles, scenarios etc. many times in the `pstat` table.
159138

160139
### pstat
161140

@@ -164,12 +143,12 @@ A measured value of a compile-time metric that is unique to a `pstat_series`, `a
164143
Each measured combination of a collection, rustc artifact, benchmarked crate, profile, scenario and a metric
165144
has its own unique entry in this table.
166145

167-
```
168-
sqlite> select * from pstat limit 1;
169-
series aid cid value
170-
---------- ---------- ---------- ----------
171-
1 1 1 24.93
172-
```
146+
Columns:
147+
148+
* **series** (`integer`): References `pstat_series` id
149+
* **aid** (`integer`): Artifact id, references the id in the artifact table
150+
* **cid** (`integer`): Collection id, references the id in the collection table
151+
* **value** (`double precision`): The value of the metric that has been measured, for example time
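
To make the relationship between `pstat`, `pstat_series`, `artifact` and `collection` more concrete, here is a minimal sketch using in-memory SQLite. The table definitions are simplified assumptions based on the column lists in this document (several columns are omitted), and the `UNIQUE` constraints are a guess at how "its own unique entry" might be enforced, not a copy of the real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Simplified, assumed table definitions (not the project's actual DDL).
    CREATE TABLE artifact (id INTEGER PRIMARY KEY, name TEXT, type TEXT);
    CREATE TABLE collection (id INTEGER PRIMARY KEY, perf_commit TEXT);
    CREATE TABLE pstat_series (
        id INTEGER PRIMARY KEY,
        crate TEXT, profile TEXT, scenario TEXT, backend TEXT, metric TEXT,
        UNIQUE (crate, profile, scenario, backend, metric)
    );
    CREATE TABLE pstat (
        series INTEGER REFERENCES pstat_series(id),
        aid    INTEGER REFERENCES artifact(id),
        cid    INTEGER REFERENCES collection(id),
        value  REAL,
        UNIQUE (series, aid, cid)
    );

    INSERT INTO artifact VALUES (1, 'LOCAL_TEST', 'release');
    INSERT INTO collection VALUES (1, 'd9fd96f409a15429757030f225b082744a72516c');
    INSERT INTO pstat_series VALUES (1, 'helloworld', 'check', 'full', 'llvm', 'task-clock:u');
    INSERT INTO pstat VALUES (1, 1, 1, 24.93);
    """
)

# Resolve a measurement back to the artifact and series it belongs to.
rows = conn.execute(
    """
    SELECT a.name, s.crate, s.profile, s.scenario, s.metric, p.value
    FROM pstat AS p
    JOIN pstat_series AS s ON s.id = p.series
    JOIN artifact AS a ON a.id = p.aid
    JOIN collection AS c ON c.id = p.cid
    """
).fetchall()

# A second value for the same (series, aid, cid) is rejected by the assumed constraint.
try:
    conn.execute("INSERT INTO pstat VALUES (1, 1, 1, 30.0)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```

Keeping the crate, profile, scenario, backend and metric in `pstat_series` means each `pstat` row only stores three integer keys and one value.
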
173152

174153
### runtime_pstat_series
175154

@@ -178,12 +157,11 @@ of a benchmark and the metric being collected.
178157

179158
This table exists to avoid duplicating crates, profiles, scenarios etc. many times in the `runtime_pstat` table.
180159

181-
```
182-
sqlite> select * from runtime_pstat_series limit 1;
183-
id benchmark metric
184-
---------- --------- --------------
185-
1 nbody-10k instructions:u
186-
```
160+
Columns:
161+
162+
* **id** (`integer`): Unique identifier
163+
* **benchmark** (`text`): The name of the benchmark
164+
* **metric** (`text`): The metric that was measured
187165

188166
### runtime_pstat
189167

@@ -192,36 +170,37 @@ A measured value of a runtime metric that is unique to a `runtime_pstat_series`,
192170
Each measured combination of a collection, rustc artifact, benchmark and a metric
193171
has its own unique entry in this table.
194172

195-
```
196-
sqlite> select * from runtime_pstat limit 1;
197-
series aid cid value
198-
---------- ---------- ---------- ----------
199-
1 1 1 24.93
200-
```
173+
Columns:
174+
175+
* **series** (`integer`): References `runtime_pstat_series` id
176+
* **aid** (`integer`): Artifact id, references the id in the artifact table
177+
* **cid** (`integer`): Collection id, references the id in the collection table
178+
* **value** (`double precision`): The value of the metric that has been measured, for example time
201179

202180
### rustc_compilation
203181

204182
Records the duration of compiling a `rustc` crate for a given artifact and collection.
205183

206-
```
207-
sqlite> select * from rustc_compilation limit 1;
208-
aid cid crate duration
209-
--- --- ---------- --------
210-
1 42 rustc_mir_transform 28.096
211-
```
184+
Columns:
185+
186+
* **aid** (`integer`): Artifact id, references the id in the artifact table
187+
* **cid** (`integer`): Collection id, references the id in the collection table
188+
* **crate** (`text`): The name of the rustc crate
189+
* **duration** (`bigint`): How long compiling the rustc crate took
212190

213191
### raw_self_profile
214192

215193
Records that a given combination of artifact, collection, benchmark, profile and scenario
216194
has a self profile archive available. This profile is then downloaded through an endpoint -
217195
it is not stored in the database directly.
218196

219-
```
220-
sqlite> select * from raw_self_profile limit 1;
221-
aid cid crate profile cache
222-
--- --- ----- ------- -----
223-
1 42 hello-world debug full
224-
```
197+
Columns:
198+
199+
* **aid** (`integer`): Artifact id, references the id in the artifact table
200+
* **cid** (`integer`): Collection id, references the id in the collection table
201+
* **crate** (`text`): The name of the crate
202+
* **profile** (`text`): What type of compilation is happening - check build, optimized build (a.k.a. release build), debug build, or doc build.
203+
* **cache** (`text`): The incremental scenario used for the benchmark
225204

226205
### pull_request_build
227206

@@ -230,23 +209,16 @@ Records a pull request commit that is waiting in a queue to be benchmarked.
230209
First a merge commit is queued, then its artifacts are built by bors, and once the commit
231210
is attached to the entry in this table, it can be benchmarked.
232211

233-
* bors_sha: SHA of the commit that should be benchmarked
234-
* pr: number of the PR
235-
* parent_sha: SHA of the parent commit, to which will the PR be compared
236-
* complete: bool specifying whether this commit has been already benchmarked or not
237-
* requested: when was the commit queued
238-
* include: which benchmarks should be included (corresponds to the `--include` benchmark parameter)
239-
* exclude: which benchmarks should be excluded (corresponds to the `--exclude` benchmark parameter)
240-
* runs: how many iterations should be used by default for the benchmark run
241-
* commit_date: when was the commit created
242-
* backends: the codegen backends to use for the benchmarks (corresponds to the `--backends` benchmark parameter)
243-
244-
```
245-
sqlite> select * from pull_request_build limit 1;
246-
bors_sha pr parent_sha complete requested include exclude runs commit_date backends
247-
---------- -- ---------- -------- --------- ------- ------- ---- ----------- --------
248-
1w0p83... 42 fq24xq... true <timestamp> 3 <timestamp>
249-
```
212+
* **bors_sha** (`text`): SHA of the commit that should be benchmarked
213+
* **pr** (`integer`): Number of the PR
214+
* **parent_sha** (`text`): SHA of the parent commit, to which the PR will be compared
215+
* **complete** (`boolean`): Specifies whether this commit has been already benchmarked or not
216+
* **requested** (`timestamptz`): When the commit was queued
217+
* **include** (`text`): Which benchmarks should be included (corresponds to the `--include` benchmark parameter), comma separated strings
218+
* **exclude** (`text`): Which benchmarks should be excluded (corresponds to the `--exclude` benchmark parameter), comma separated strings
219+
* **runs** (`integer`): How many iterations should be used by default for the benchmark run
220+
* **commit_date** (`timestamptz`): When the commit was created
221+
* **backends** (`text`): The codegen backends to use for the benchmarks (corresponds to the `--backends` benchmark parameter)
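
Since `include` and `exclude` are stored as comma separated strings, a consumer has to split them before use. The helper below is a hypothetical sketch of that step (the function name is not from the project), and it assumes that a missing or empty value means no filter was requested.

```python
def split_benchmark_filter(raw):
    """Split a comma separated `include`/`exclude` value into benchmark names.

    Hypothetical helper: treats None or an empty string as "no filter".
    """
    if raw is None:
        return []
    # Drop surrounding whitespace and skip empty pieces such as a trailing comma.
    return [part.strip() for part in raw.split(",") if part.strip()]


print(split_benchmark_filter("helloworld, regex"))
```
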
250222

251223
### error
252224
