Merge pull request #2291 from Jamesbarford/chore/update-schema.md

Kobzol · web-flow · commit bfbfe9283d7b · 2025-10-21T10:20:11.000Z
Update schema.md
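
The diff below replaces the sample `sqlite>` output with per-column listings. As a rough illustration of how the documented columns relate, here is a minimal sketch of `artifact`, `collection`, `pstat_series` and `pstat` in an in-memory SQLite database. The column names come from the updated text and the sample values from the rows this PR removes. The SQLite types, constraints, and the helper names `build_example_db` and `measurements` are assumptions made for the example, not the project's actual DDL or API.

```python
import sqlite3

# Illustrative DDL only: column names follow the schema.md text in this PR.
# The SQLite types and constraints are assumptions, not the real migrations.
SCHEMA = """
CREATE TABLE artifact (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    date INTEGER,              -- documented as timestamptz on Postgres
    type TEXT NOT NULL
);
CREATE TABLE collection (
    id          INTEGER PRIMARY KEY,
    perf_commit TEXT
);
CREATE TABLE pstat_series (
    id       INTEGER PRIMARY KEY,
    crate    TEXT NOT NULL,
    profile  TEXT NOT NULL,
    scenario TEXT NOT NULL,
    backend  TEXT NOT NULL,
    metric   TEXT NOT NULL
);
CREATE TABLE pstat (
    series INTEGER NOT NULL REFERENCES pstat_series(id),
    aid    INTEGER NOT NULL REFERENCES artifact(id),
    cid    INTEGER NOT NULL REFERENCES collection(id),
    value  REAL NOT NULL,      -- documented as double precision
    PRIMARY KEY (series, aid, cid)
);
"""


def build_example_db():
    """Create the sketch schema and load one row per table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute(
        "INSERT INTO artifact (id, name, date, type) VALUES (?, ?, ?, ?)",
        (1, "LOCAL_TEST", None, "release"),
    )
    conn.execute(
        "INSERT INTO collection (id, perf_commit) VALUES (?, ?)",
        (1, "d9fd96f409a15429757030f225b082744a72516c"),
    )
    conn.execute(
        "INSERT INTO pstat_series (id, crate, profile, scenario, backend, metric) "
        "VALUES (?, ?, ?, ?, ?, ?)",
        (1, "helloworld", "check", "full", "llvm", "task-clock:u"),
    )
    conn.execute(
        "INSERT INTO pstat (series, aid, cid, value) VALUES (?, ?, ?, ?)",
        (1, 1, 1, 24.93),
    )
    return conn


def measurements(conn):
    """Resolve each pstat row to its artifact name and series description."""
    return conn.execute(
        """
        SELECT a.name, s.crate, s.profile, s.scenario, s.metric, p.value
        FROM pstat AS p
        JOIN pstat_series AS s ON s.id = p.series
        JOIN artifact AS a ON a.id = p.aid
        JOIN collection AS c ON c.id = p.cid
        """
    ).fetchall()
```

Calling `measurements(build_example_db())` returns one tuple per measured value, with the `series`, `aid` and `cid` references resolved through the joins.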
diff --git a/database/schema.md b/database/schema.md
@@ -50,33 +50,22 @@ For runtime benchmarks the schema very similar, but there are different table na
 
 A description of a rustc compiler artifact being benchmarked.
 
-This description includes:
-* name: usually a commit sha or a tag like "1.51.0" but is free-form text so can be anything.
-* date: the date associated with this compiler artifact (usually only when the name is a commit)
-* type: currently one of "master" (i.e., we're testing a merge commit), "try" (someone is testing a PR), and "release" (usually a release candidate - though local compilers also get labeled like this).
+Columns:
 
-```
-sqlite> select * from artifact limit 1;
-id          name        date        type   
-----------  ----------  ----------  -------
-1           LOCAL_TEST              release
-```
+* **name** (`text`): Usually a commit sha such as "fefce3cecd69cebf2d7c9aa3dd90a84379f4201a" or a tag like "1.51.0", but this is free-form text so it could be anything.
+* **date** (`timestamptz`): The date associated with this compiler artifact (usually only when the name is a commit).
+* **type** (`text`): Currently one of "master" (i.e., we're testing a merge commit), "try" (someone is testing a PR), and "release" (usually a release candidate - though local compilers also get labeled like this).
 
 ### artifact size
 
 Records the size of individual components (like `librustc_driver.so` or `libLLVM.so`) of a single
 artifact.
 
-This description includes:
-* component: normalized name of the component (hashes are removed)
-* size: size of the component in bytes
+Columns:
 
-```
-sqlite> select * from artifact_size limit 1;
-aid         component   size
-----------  ----------  ----------
-1           libLLVM.so  177892352
-```
+* **aid** (`integer`): Artifact id, references the id in the artifact table
+* **component** (`text`): Normalized name of the component (hashes are removed)
+* **size** (`integer`): Size of the component in bytes
 
 ### collection
 
@@ -88,34 +77,31 @@ This is a way to collect statistics together signifying that they belong to the
 
 Currently, the collection also marks the git sha of the currently running collector binary.
 
-```
-sqlite> select * from collection limit 1;
-id          perf_commit 
-----------  ----------------------------------------
-1           d9fd96f409a15429757030f225b082744a72516c
-```
+Columns:
+
+* **id** (`integer`): Unique identifier
+* **perf_commit** (`text`): Commit sha or tag of the collector binary that was running
 
 ### collector_progress
 
 Keeps track of the collector's start and finish time as well as which step it's currently on.
 
-```
-sqlite> select * from collector_progress limit 1;
-aid         step        start       end
-----------  ----------  ----------  ----------
-1           helloworld  1625829961  1625829965
-```
+Columns:
+
+* **aid** (`integer`): Artifact id, references the id in the artifact table
+* **step** (`text`): The step the collector is currently benchmarking
+* **start** (`timestamptz`): When the collector started
+* **end** (`timestamptz`): When the collector finished
 
 ### artifact_collection_duration
 
 Records how long benchmarking takes in seconds.
 
-```
-sqlite> select * from artifact_collection_duration limit 1;
-aid         date_recorded  duration
-----------  -------------  ----------
-1           1625829965     4
-```
+Columns:
+
+* **aid** (`integer`): Artifact id, references the id in the artifact table
+* **date_recorded** (`timestamptz`): When this was recorded
+* **duration** (`integer`): How long the benchmarking took in seconds
 
 ### benchmark
 
@@ -127,35 +113,28 @@ and its category. The benchmark name is used as a foreign key in many of the oth
 Category is either `primary` (real-world benchmark) or `secondary` (stress test).
 Stable benchmarks have `category` set to `primary` and `stabilized` set to `1`.
 
-```
-sqlite> select * from benchmark limit 1;
-name        stabilized  category
-----------  ----------  ----------
-helloworld  0           primary
-```
+Columns:
+
+* **name** (`text`): Name of the benchmark
+* **stabilized** (`boolean`): Whether the benchmark supports stable
+* **category** (`category`): `primary` if this is a 'real-world' benchmark or `secondary` if a 'stress test'
 
 ### pstat_series
 
 Describes the parametrization of a compile-time benchmark. Contains a unique combination
 of a crate, profile, scenario and the metric being collected.
 
-* crate (aka `benchmark`): the benchmarked crate which might be a crate from crates.io or a crate made specifically to stress some part of the compiler.
-* profile: what type of compilation is happening - check build, optimized build (a.k.a. release build), debug build, or doc build.
-* scenario: describes how much of the incremental cache is full. An empty incremental cache means that the compiler must do a full build.
-* backend: codegen backend used for compilation.
-* metric: the type of metric being collected.
+Columns:
 
-This corresponds to a [`statistic description`](../docs/glossary.md).
+* **crate** (`text`) (aka `benchmark`): The benchmarked crate which might be a crate from crates.io or a crate made specifically to stress some part of the compiler.
+* **profile** (`text`): What type of compilation is happening - check build, optimized build (a.k.a. release build), debug build, or doc build.
+* **scenario** (`text`): Describes how much of the incremental cache is full. An empty incremental cache means that the compiler must do a full build.
+* **backend** (`text`): Codegen backend used for compilation, for example 'llvm'.
+* **metric** (`text`): The type of metric being collected.
 
-There is a separate table for this collection to avoid duplicating crates, profiles, scenarios etc.
-many times in the `pstat` table.
+This corresponds to a [`statistic description`](../docs/glossary.md).
 
-```
-sqlite> select * from pstat_series limit 1;
-id          crate       profile     scenario    backend  target                    metric      
-----------  ----------  ----------  ----------  -------  ------------              ------------
-1           helloworld  check       full        llvm     x86_64-linux-unknown-gnu  task-clock:u
-```
+There is a separate table for this collection to avoid duplicating crates, profiles, scenarios etc. many times in the `pstat` table.
 
 ### pstat
 
@@ -164,12 +143,12 @@ A measured value of a compile-time metric that is unique to a `pstat_series`, `a
 Each measured combination of a collection, rustc artifact, benchmarked crate, profile, scenario and a metric
 has its own unique entry in this table.
 
-```
-sqlite> select * from pstat limit 1;
-series      aid         cid         value
-----------  ----------  ----------  ----------
-1           1           1           24.93
-```
+Columns:
+
+* **series** (`integer`): References `pstat_series` id
+* **aid** (`integer`): Artifact id, references the id in the artifact table
+* **cid** (`integer`): Collection id, references the id in the collection table
+* **value** (`double precision`): The value of the metric that has been measured, for example time
 
 ### runtime_pstat_series
 
@@ -178,12 +157,11 @@ of a benchmark and the metric being collected.
 
 This table exists to avoid duplicating crates, profiles, scenarios etc. many times in the `runtime_pstat` table.
 
-```
-sqlite> select * from runtime_pstat_series limit 1;
-id          benchmark  metric
-----------  ---------  --------------
-1           nbody-10k  instructions:u
-```
+Columns:
+
+* **id** (`integer`): Unique identifier
+* **benchmark** (`text`): The name of the benchmark
+* **metric** (`text`): The metric that was measured
 
 ### runtime_pstat
 
@@ -192,36 +170,37 @@ A measured value of a runtime metric that is unique to a `runtime_pstat_series`,
 Each measured combination of a collection, rustc artifact, benchmark and a metric
 has its own unique entry in this table.
 
-```
-sqlite> select * from runtime_pstat limit 1;
-series      aid         cid         value
-----------  ----------  ----------  ----------
-1           1           1           24.93
-```
+Columns:
+
+* **series** (`integer`): References `runtime_pstat_series` id
+* **aid** (`integer`): Artifact id, references the id in the artifact table
+* **cid** (`integer`): Collection id, references the id in the collection table
+* **value** (`double precision`): The value of the metric that has been measured, for example time
 
 ### rustc_compilation
 
 Records the duration of compiling a `rustc` crate for a given artifact and collection.
 
-```
-sqlite> select * from rustc_compilation limit 1;
-aid  cid  crate                duration
----  ---  ----------           --------
-1    42   rustc_mir_transform  28.096
-```
+Columns:
+
+* **aid** (`integer`): Artifact id, references the id in the artifact table
+* **cid** (`integer`): Collection id, references the id in the collection table
+* **crate** (`text`): The name of the rustc crate
+* **duration** (`bigint`): How long compiling the rustc crate took
 
 ### raw_self_profile
 
 Records that a given combination of artifact, collection, benchmark, profile and scenario
 has a self profile archive available. This profile is then downloaded through an endpoint -
 it is not stored in the database directly.
 
-```
-sqlite> select * from raw_self_profile limit 1;
-aid  cid  crate        profile  cache
----  ---  -----        -------  -----
-1    42   hello-world  debug    full
-```
+Columns:
+
+* **aid** (`integer`): Artifact id, references the id in the artifact table
+* **cid** (`integer`): Collection id, references the id in the collection table
+* **crate** (`text`): The name of the crate
+* **profile** (`text`): What type of compilation is happening - check build, optimized build (a.k.a. release build), debug build, or doc build.
+* **cache** (`text`): The incremental scenario used for the benchmark
 
 ### pull_request_build
 
@@ -230,23 +209,16 @@ Records a pull request commit that is waiting in a queue to be benchmarked.
 First a merge commit is queued, then its artifacts are built by bors, and once the commit
 is attached to the entry in this table, it can be benchmarked.
 
-* bors_sha: SHA of the commit that should be benchmarked
-* pr: number of the PR
-* parent_sha: SHA of the parent commit, to which will the PR be compared
-* complete: bool specifying whether this commit has been already benchmarked or not
-* requested: when was the commit queued
-* include: which benchmarks should be included (corresponds to the `--include` benchmark parameter)
-* exclude: which benchmarks should be excluded (corresponds to the `--exclude` benchmark parameter)
-* runs: how many iterations should be used by default for the benchmark run
-* commit_date: when was the commit created
-* backends: the codegen backends to use for the benchmarks (corresponds to the `--backends` benchmark parameter)
-
-```
-sqlite> select * from pull_request_build limit 1;
-bors_sha    pr  parent_sha  complete  requested    include  exclude  runs  commit_date  backends
-----------  --  ----------  --------  ---------    -------  -------  ----  -----------  --------
-1w0p83...   42  fq24xq...   true      <timestamp>                    3     <timestamp>
-```
+* **bors_sha** (`text`): SHA of the commit that should be benchmarked
+* **pr** (`integer`): Number of the PR
+* **parent_sha** (`text`): SHA of the parent commit, to which the PR will be compared
+* **complete** (`boolean`): Specifies whether this commit has already been benchmarked
+* **requested** (`timestamptz`): When the commit was queued
+* **include** (`text`): Which benchmarks should be included (corresponds to the `--include` benchmark parameter), as a comma-separated list
+* **exclude** (`text`): Which benchmarks should be excluded (corresponds to the `--exclude` benchmark parameter), as a comma-separated list
+* **runs** (`integer`): How many iterations should be used by default for the benchmark run
+* **commit_date** (`timestamptz`): When the commit was created
+* **backends** (`text`): The codegen backends to use for the benchmarks (corresponds to the `--backends` benchmark parameter)
 
 ### error