Commit fb23006

more images
1 parent 8b5233a commit fb23006

8 files changed, with 18 additions and 26 deletions

docs/data-modeling/schema-design.md

Lines changed: 4 additions & 5 deletions
Original file line number | Diff line number | Diff line change
@@ -8,6 +8,7 @@ keywords: ['schema', 'schema design', 'query optimization']
88
import stackOverflowSchema from '@site/static/images/data-modeling/stackoverflow-schema.png';
99
import schemaDesignTypes from '@site/static/images/data-modeling/schema-design-types.png';
1010
import schemaDesignIndices from '@site/static/images/data-modeling/schema-design-indices.png';
11+
import Image from '@theme/IdealImage';
1112

1213
Understanding effective schema design is key to optimizing ClickHouse performance and includes choices that often involve trade-offs, with the optimal approach depending on the queries being served as well as factors such as data update frequency, latency requirements, and data volume. This guide provides an overview of schema design best practices and data modeling techniques for optimizing ClickHouse performance.
1314

@@ -17,7 +18,7 @@ For the examples in this guide, we use a subset of the Stack Overflow dataset. T
1718

1819
> The primary keys and relationships indicated are not enforced through constraints (Parquet is file not table format) and purely indicate how the data is related and the unique keys it possesses.
1920
20-
<img src={stackOverflowSchema} class="image" alt="Stack Overflow Schema" style={{width: '800px', background: 'none'}} />
21+
<Image img={stackOverflowSchema} size="lg" alt="Stack Overflow Schema"/>
2122

2223
<br />
2324

@@ -150,7 +151,7 @@ FixedString for special cases - Strings which have a fixed length can be encoded
150151
151152
By applying these simple rules to our posts table, we can identify an optimal type for each column:
152153

153-
<img src={schemaDesignTypes} class="image" alt="Schema Design - Optimized Types" style={{width: '1000px', background: 'none'}} />
154+
<Image img={schemaDesignTypes} size="lg" alt="Schema Design - Optimized Types"/>
154155

155156
<br />
156157

@@ -203,9 +204,7 @@ Users coming from OLTP databases often look for the equivalent concept in ClickH
203204

204205
At the scale at which ClickHouse is often used, memory and disk efficiency are paramount. Data is written to ClickHouse tables in chunks known as parts, with rules applied for merging the parts in the background. In ClickHouse, each part has its own primary index. When parts are merged, then the merged part's primary indexes are also merged. The primary index for a part has one index entry per group of rows - this technique is called sparse indexing.
205206

206-
<img src={schemaDesignIndices} class="image" alt="Sparse Indexing in ClickHouse" style={{width: '600px', background: 'none'}} />
207-
208-
<br />
207+
<Image img={schemaDesignIndices} size="md" alt="Sparse Indexing in ClickHouse"/>
209208

210209
The selected key in ClickHouse will determine not only the index, but also order in which data is written on disk. Because of this, it can dramatically impact compression levels which can in turn affect query performance. An ordering key which causes the values of most columns to be written in contiguous order will allow the selected compression algorithm (and codecs) to compress the data more effectively.
211210

docs/dictionary/index.md

Lines changed: 3 additions & 8 deletions
@@ -7,6 +7,7 @@ description: 'A dictionary provides a key-value representation of data for fast
77

88
import dictionaryUseCases from '@site/static/images/dictionary/dictionary-use-cases.png';
99
import dictionaryLeftAnyJoin from '@site/static/images/dictionary/dictionary-left-any-join.png';
10+
import Image from '@theme/IdealImage';
1011

1112
# Dictionary
1213

@@ -16,19 +17,13 @@ Dictionaries are useful for:
1617
- Improving the performance of queries, especially when used with `JOIN`s
1718
- Enriching ingested data on the fly without slowing down the ingestion process
1819

19-
<img src={dictionaryUseCases}
20-
class="image"
21-
alt="Use cases for Dictionary in ClickHouse"
22-
style={{width: '100%', background: 'none'}} />
20+
<Image img={dictionaryUseCases} size="lg" alt="Use cases for Dictionary in ClickHouse"/>
2321

2422
## Speeding up joins using a Dictionary {#speeding-up-joins-using-a-dictionary}
2523

2624
Dictionaries can be used to speed up a specific type of `JOIN`: the [`LEFT ANY` type](/sql-reference/statements/select/join#supported-types-of-join) where the join key needs to match the key attribute of the underlying key-value storage.
2725

28-
<img src={dictionaryLeftAnyJoin}
29-
class="image"
30-
alt="Using Dictionary with LEFT ANY JOIN"
31-
style={{width: '300px', background: 'none'}} />
26+
<Image img={dictionaryLeftAnyJoin} size="sm" alt="Using Dictionary with LEFT ANY JOIN"/>
3227

3328
If this is the case, ClickHouse can exploit the dictionary to perform a [Direct Join](https://clickhouse.com/blog/clickhouse-fully-supports-joins-direct-join-part4#direct-join). This is ClickHouse's fastest join algorithm and is applicable when the underlying [table engine](/engines/table-engines) for the right-hand side table supports low-latency key-value requests. ClickHouse has three table engines providing this: [Join](/engines/table-engines/special/join) (that is basically a pre-calculated hash table), [EmbeddedRocksDB](/engines/table-engines/integrations/embedded-rocksdb) and [Dictionary](/engines/table-engines/special/dictionary). We will describe the dictionary-based approach, but the mechanics are the same for all three engines.
3429

docs/managing-data/drop_partition.md

Lines changed: 2 additions & 2 deletions
@@ -29,7 +29,7 @@ PARTITION BY toYear(CreationDate)
2929

3030
Read about setting the partition expression in a section [How to set the partition expression](/sql-reference/statements/alter/partition/#how-to-set-partition-expression).
3131

32-
In ClickHouse, users should principally consider partitioning to be a data management feature, not a query optimization technique. By separating data logically based on a key, each partition can be operated on independently e.g. deleted. This allows users to move partitions, and thus subsets, between [storage tiers](/integrations/s3#storage-tiers) efficiently on time or [expire data/efficiently delete from the cluster](/sql-reference/statements/alter/partition).
32+
In ClickHouse, users should principally consider partitioning to be a data management feature, not a query optimization technique. By separating data logically based on a key, each partition can be operated on independently e.g. deleted. This allows users to move partitions, and thus subsets, between [storage tiers](/integrations/s3#storage-tiers) efficiently on time or [expire data/efficiently delete from the cluster](/sql-reference/statements/alter/partition).
3333

3434
## Drop Partitions {#drop-partitions}
3535

@@ -69,7 +69,7 @@ WHERE `table` = 'posts'
6969
└───────────┘
7070

7171
17 rows in set. Elapsed: 0.002 sec.
72-
72+
7373
ALTER TABLE posts
7474
(DROP PARTITION '2008')
7575

docs/materialized-view/incremental-materialized-view.md

Lines changed: 2 additions & 4 deletions
@@ -7,6 +7,7 @@ score: 10000
77
---
88

99
import materializedViewDiagram from '@site/static/images/materialized-view/materialized-view-diagram.png';
10+
import Image from '@theme/IdealImage';
1011

1112
# Incremental Materialized Views
1213

@@ -18,10 +19,7 @@ The principal motivation for materialized views is that the results inserted int
1819

1920
Materialized views in ClickHouse are updated in real time as data flows into the table they are based on, functioning more like continually updating indexes. This is in contrast to other databases where materialized views are typically static snapshots of a query that must be refreshed (similar to ClickHouse [refreshable materialized views](/sql-reference/statements/create/view#refreshable-materialized-view)).
2021

21-
<img src={materializedViewDiagram}
22-
class="image"
23-
alt="Materialized view diagram"
24-
style={{width: '500px'}} />
22+
<Image img={materializedViewDiagram} size="md" alt="Materialized view diagram"/>
2523

2624
## Example {#example}
2725

docs/materialized-view/index.md

Lines changed: 1 addition & 1 deletion
@@ -11,4 +11,4 @@ keywords: ['materialized views', 'speed up queries', 'query optimization', 'refr
1111
| [Refreshable Materialized View](/materialized-view/refreshable-materialized-view) | Conceptually similar to incremental materialized views but require the periodic execution of the query over the full dataset - the results of which are stored in a target table for querying. |
1212

1313

14-
<iframe width="560" height="315" src="https://www.youtube.com/embed/-A3EtQgDn_0?si=TBiN_E80BKZ0DPpd" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" referrerpolicy="strict-origin-when-cross-origin" allowfullscreen></iframe>
14+
<iframe width="1024" height="576" src="https://www.youtube.com/embed/-A3EtQgDn_0?si=TBiN_E80BKZ0DPpd" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" referrerpolicy="strict-origin-when-cross-origin" allowfullscreen></iframe>

docs/materialized-view/refreshable-materialized-view.md

Lines changed: 3 additions & 4 deletions
@@ -6,15 +6,14 @@ keywords: ['refreshable materialized view', 'refresh', 'materialized views', 'sp
66
---
77

88
import refreshableMaterializedViewDiagram from '@site/static/images/materialized-view/refreshable-materialized-view-diagram.png';
9+
import Image from '@theme/IdealImage';
910

1011
[Refreshable materialized views](/sql-reference/statements/create/view#refreshable-materialized-view) are conceptually similar to materialized views in traditional OLTP databases, storing the result of a specified query for quick retrieval and reducing the need to repeatedly execute resource-intensive queries. Unlike ClickHouse’s [incremental materialized views](/materialized-view/incremental-materialized-view), this requires the periodic execution of the query over the full dataset - the results of which are stored in a target table for querying. This result set should, in theory, be smaller than the original dataset, allowing the subsequent query to execute faster.
1112

1213
The diagram explains how Refreshable Materialized Views work:
1314

14-
<img src={refreshableMaterializedViewDiagram}
15-
class="image"
16-
alt="Refreshable materialized view diagram"
17-
style={{width: '100%', background: 'none'}} />
15+
<Image img={refreshableMaterializedViewDiagram} size="lg" alt="Refreshable materialized view diagram"/>
16+
1817

1918
You can also see the following video:
2019

docs/migrations/postgres/replacing-merge-tree.md

Lines changed: 2 additions & 1 deletion
@@ -6,6 +6,7 @@ keywords: ['replacingmergetree', 'inserts', 'deduplication']
66
---
77

88
import postgres_replacingmergetree from '@site/static/images/migrations/postgres-replacingmergetree.png';
9+
import Image from '@theme/IdealImage';
910

1011
While transactional databases are optimized for transactional update and delete workloads, OLAP databases offer reduced guarantees for such operations. Instead, they optimize for immutable data inserted in batches for the benefit of significantly faster analytical queries. While ClickHouse offers update operations through mutations, as well as a lightweight means of deleting rows, its column-orientated structure means these operations should be scheduled with care, as described above. These operations are handled asynchronously, processed with a single thread, and require (in the case of updates) data to be rewritten on disk. They should thus not be used for high numbers of small changes.
1112
In order to process a stream of update and delete rows while avoiding the above usage patterns, we can use the ClickHouse table engine ReplacingMergeTree.
@@ -28,7 +29,7 @@ As a result of this merge process, we have four rows representing the final stat
2829

2930
<br />
3031

31-
<img src={postgres_replacingmergetree} class="image" alt="ReplacingMergeTree process" style={{width: '800px', background: 'none'}} />
32+
<Image img={postgres_replacingmergetree} size="md" alt="ReplacingMergeTree process"/>
3233

3334
<br />
3435

src/theme/IdealImage/index.tsx

Lines changed: 1 addition & 1 deletion
@@ -181,7 +181,7 @@ export default function IdealImage(
181181
<ControlledZoom
182182
isZoomed={isZoomed}
183183
onZoomChange={handleZoomChange}
184-
classDialog={`${styles.customZoom} ${styles.customWhiteZoom}`}
184+
classDialog={`${styles.customZoom} ${background == "white" ? styles.customWhiteZoom : ""}`}
185185
>
186186
{isLoaded && (
187187
<img
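
The `classDialog` change in this hunk makes the white zoom style conditional on a `background` value. A minimal sketch of that class-composition logic, pulled out into a standalone helper, might look like the following. The helper name `zoomDialogClass` and the `ZoomStyles` type are hypothetical and not part of the commit. Unlike the template literal in the diff, this sketch drops the empty entry so no trailing space is left, and it uses strict equality.

```typescript
// Hypothetical stand-in for the CSS-module object imported as `styles`
// in src/theme/IdealImage/index.tsx; only the two keys used in the hunk are modeled.
type ZoomStyles = { customZoom: string; customWhiteZoom: string };

// Always applies `customZoom`; adds `customWhiteZoom` only when the
// background is exactly "white". Empty entries are filtered out before joining.
function zoomDialogClass(styles: ZoomStyles, background?: string): string {
  const classes: string[] = [
    styles.customZoom,
    background === "white" ? styles.customWhiteZoom : "",
  ];
  return classes.filter((name) => name.length > 0).join(" ");
}
```

Under these assumptions, the component could pass `zoomDialogClass(styles, background)` to `classDialog` in place of the inline template literal.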
