Commit 74608d6

Update storage-efficiency.md
1 parent a72c655 commit 74608d6

1 file changed: +8 -8 lines changed

docs/use-cases/time-series/04_storage-efficiency.md

Lines changed: 8 additions & 8 deletions
@@ -10,12 +10,12 @@ doc_type: 'guide'

 # Time-series storage efficiency

-After exploring how to query our Wikipedia statistics dataset, let's focus on optimizing its storage efficiency in ClickHouse.
+After exploring how to query our Wikipedia statistics dataset, let's focus on optimizing its storage efficiency in ClickHouse.
 This section demonstrates practical techniques to reduce storage requirements while maintaining query performance.

 ## Type optimization {#time-series-type-optimization}

-The general approach to optimizing storage efficiency is using optimal data types.
+The general approach to optimizing storage efficiency is using optimal data types.
 Let's take the `project` and `subproject` columns. These columns are of type String, but have a relatively small amount of unique values:

 ```sql
@@ -39,7 +39,7 @@ MODIFY COLUMN `project` LowCardinality(String),
 MODIFY COLUMN `subproject` LowCardinality(String)
 ```

-We've also used UInt64 type for the hits column, which takes 8 bytes, but has a relatively small max value:
+We've also used a UInt64 type for the `hits` column, which takes 8 bytes, but has a relatively small max value:

 ```sql
 SELECT max(hits)
@@ -59,23 +59,23 @@ ALTER TABLE wikistat
 MODIFY COLUMN `hits` UInt32;
 ```

-This will reduce the size of this column in memory by at least 2 times. Note that the size on disk will remain unchanged due to compression. But be careful, pick data types that are not too small!
+This will reduce the size of this column in memory by at least a factor of two. Note that the size on disk will remain unchanged due to compression. But be careful, pick data types that are not too small!

 ## Specialized codecs {#time-series-specialized-codecs}

-When we deal with sequential data, like time-series, we can further improve storage efficiency by using special codecs.
+When we deal with sequential data, like time-series, we can further improve storage efficiency by using special codecs.
 The general idea is to store changes between values instead of absolute values themselves, which results in much less space needed when dealing with slowly changing data:

 ```sql
 ALTER TABLE wikistat
 MODIFY COLUMN `time` CODEC(Delta, ZSTD);
 ```

-We've used the Delta codec for time column, which is a good fit for time series data.
+We've used the Delta codec for the `time` column, which is a good fit for time-series data.

-The right ordering key can also save disk space.
+The right ordering key can also save disk space.
 Since we usually want to filter by a path, we will add `path` to the sorting key.
-This requires recreation of the table.
+This requires recreation of the table.

 Below we can see the `CREATE` command for our initial table and the optimized table:
