Commit b77d217

Let people know what they click on.
1 parent 60561ae commit b77d217

1 file changed: +9 -7 lines changed

docs/concepts/why-clickhouse-is-so-fast.md

Lines changed: 9 additions & 7 deletions
@@ -21,15 +21,17 @@ To avoid that too many parts accumulate, ClickHouse runs a [merge](/merges) oper
This approach has several advantages: All data processing can be [offloaded to background part merges](/concepts/why-clickhouse-is-so-fast#storage-layer-merge-time-computation), keeping data writes lightweight and highly efficient. Individual inserts are "local" in the sense that they do not need to update global, i.e. per-table data structures. As a result, multiple simultaneous inserts need no mutual synchronization or synchronization with existing table data, and thus inserts can be performed almost at the speed of disk I/O.

-🤿 Deep dive into this [here](/docs/academic_overview#3-1-on-disk-format).
+the holistic performance optimization section of the VLDB paper.
+
+🤿 Deep dive into this in the [On-Disk Format](/docs/academic_overview#3-1-on-disk-format) section of the web version of our VLDB 2024 paper.

## Storage Layer: Concurrent inserts and selects are isolated {#storage-layer-concurrent-inserts-and-selects-are-isolated}

<iframe width="768" height="432" src="https://www.youtube.com/embed/dvGlPh2bJFo?si=F3MSALPpe0gAoq5k" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" referrerpolicy="strict-origin-when-cross-origin" allowfullscreen></iframe>

Inserts are fully isolated from SELECT queries, and merging inserted data parts happens in the background without affecting concurrent queries.

-🤿 Deep dive into this [here](/docs/academic_overview#3-7-acid-compliance).
+🤿 Deep dive into this in the [Storage Layer](/docs/academic_overview#3-storage-layer) section of the web version of our VLDB 2024 paper.

## Storage Layer: Merge-time computation {#storage-layer-merge-time-computation}

@@ -49,7 +51,7 @@ On the one hand, user queries may become significantly faster, sometimes by 1000
On the other hand, the majority of the runtime of merges is consumed by loading the input parts and saving the output part. The additional effort to transform the data during merge does usually not impact the runtime of merges too much. All of this magic is completely transparent and does not affect the result of queries (besides their performance).

-🤿 Deep dive into this [here](/docs/academic_overview#3-3-merge-time-data-transformation).
+🤿 Deep dive into this in the [Merge-time Data Transformation](/docs/academic_overview#3-3-merge-time-data-transformation) section of the web version of our VLDB 2024 paper.

## Storage Layer: Data pruning {#storage-layer-data-pruning}

@@ -65,7 +67,7 @@ In practice, many queries are repetitive, i.e., run unchanged or only with sligh
All three techniques aim to skip as many rows during full-column reads as possible because the fastest way to read data is to not read it at all.

-🤿 Deep dive into this [here](/docs/academic_overview#3-2-data-pruning).
+🤿 Deep dive into this in the [Data Pruning](/docs/academic_overview#3-2-data-pruning) section of the web version of our VLDB 2024 paper.

## Storage Layer: Data compression {#storage-layer-data-compression}

@@ -79,7 +81,7 @@ Users can [specify](https://clickhouse.com/blog/optimize-clickhouse-codecs-compr
Data compression not only reduces the storage size of the database tables, but in many cases, it also improves query performance as local disks and network I/O are often constrained by low throughput.

-🤿 Deep dive into this [here](/docs/academic_overview#3-1-on-disk-format).
+🤿 Deep dive into this in the [On-Disk Format](/docs/academic_overview#3-1-on-disk-format) section of the web version of our VLDB 2024 paper.

## State-of-the-art query processing layer {#state-of-the-art-query-processing-layer}

@@ -91,7 +93,7 @@ Modern systems have dozens of CPU cores. To utilize all cores, ClickHouse unfold
If a single node becomes too small to hold the table data, further nodes can be added to form a cluster. Tables can be split ("sharded") and distributed across the nodes. ClickHouse will run queries on all nodes that store table data and thereby scale "horizontally" with the number of available nodes.

-🤿 Deep dive into this [here](/academic_overview#4-query-processing-layer).
+🤿 Deep dive into this in the [Query Processing Layer](/academic_overview#4-query-processing-layer) section of the web version of our VLDB 2024 paper.

## Meticulous attention to detail {#meticulous-attention-to-detail}

@@ -121,7 +123,7 @@ The [hash table implementation in ClickHouse](https://clickhouse.com/blog/hash-t
Algorithms that rely on data characteristics often perform better than their generic counterparts. If the data characteristics are not known in advance, the system can try various implementations and choose the one that works best at runtime. For an example, see the [article on how LZ4 decompression is implemented in ClickHouse](https://habr.com/en/company/yandex/blog/457612/).

-🤿 Deep dive into this [here](/academic_overview#4-4-holistic-performance-optimization).
+🤿 Deep dive into this in the [Holistic Performance Optimization](/academic_overview#4-4-holistic-performance-optimization) section of the web version of our VLDB 2024 paper.

## VLDB 2024 paper {#vldb-2024-paper}
