Commit 7f71b9f

committed
restructure backups
1 parent 241b436 commit 7f71b9f

11 files changed

+1050
-2
lines changed

docs/operations_/backup_restore/00_overview.md

Lines changed: 307 additions & 0 deletions
Large diffs are not rendered by default.
Lines changed: 336 additions & 0 deletions
@@ -0,0 +1,336 @@
1+
---
2+
description: 'Details backup/restore to or from a local disk'
3+
sidebar_label: 'Local disk / S3 disk'
4+
slug: /operations/backup/disk
5+
title: 'Backup and Restore in ClickHouse'
6+
---
7+
8+
import GenericSettings from '@site/docs/operations_/backup_restore/_snippets/_generic_settings.md';
9+
import S3Settings from '@site/docs/operations_/backup_restore/_snippets/_s3_settings.md';
10+
import ExampleSetup from '@site/docs/operations_/backup_restore/_snippets/_example_setup.md';
11+
import Syntax from '@site/docs/operations_/backup_restore/_snippets/_syntax.md';
12+
13+
# BACKUP / RESTORE to disk {#backup-to-a-local-disk}
14+
15+
## Syntax {#syntax}
16+
17+
<Syntax/>
18+
19+
## Configure backup destinations for disk {#configure-backup-destinations-for-disk}
20+
21+
### Configure a backup destination for local disk {#configure-a-backup-destination}
22+
23+
In the examples below you will see the backup destination specified as `Disk('backups', '1.zip')`.
24+
To use the `Disk` backup engine it is necessary to first add a file specifying
25+
the backup destination at the path below:
26+
27+
```text
28+
/etc/clickhouse-server/config.d/backup_disk.xml
29+
```
30+
31+
For example, the configuration below defines a disk named `backups` and then adds that disk to
32+
the **allowed_disk** list in the **backups** section:
33+
34+
```xml
35+
<clickhouse>
36+
    <storage_configuration>
37+
        <disks>
38+
            <!--highlight-next-line -->
39+
            <backups>
40+
                <type>local</type>
41+
                <path>/backups/</path>
42+
            </backups>
43+
        </disks>
44+
    </storage_configuration>
45+
    <!--highlight-start -->
46+
    <backups>
47+
        <allowed_disk>backups</allowed_disk>
48+
        <allowed_path>/backups/</allowed_path>
49+
    </backups>
50+
    <!--highlight-end -->
51+
</clickhouse>
52+
```
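
Once the server has loaded this configuration, one way to confirm that the disk is visible is to query the `system.disks` table. This is a minimal sketch that assumes the disk name `backups` from the configuration above; depending on the server version, a restart may be needed before the new disk appears.

```sql
-- Check that the 'backups' disk from backup_disk.xml is registered.
-- An empty result suggests the configuration has not been loaded yet.
SELECT name, path, type
FROM system.disks
WHERE name = 'backups';
```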
53+
54+
### Configure a backup destination for S3 disk {#backuprestore-using-an-s3-disk}
55+
56+
It is also possible to `BACKUP`/`RESTORE` to S3 by configuring an S3 disk in the
57+
ClickHouse storage configuration. Configure the disk like this by adding a file to
58+
`/etc/clickhouse-server/config.d` as was done above for the local disk.
59+
60+
```xml
61+
<clickhouse>
62+
    <storage_configuration>
63+
        <disks>
64+
            <s3_plain>
65+
                <type>s3_plain</type>
66+
                <endpoint></endpoint>
67+
                <access_key_id></access_key_id>
68+
                <secret_access_key></secret_access_key>
69+
            </s3_plain>
70+
        </disks>
71+
        <policies>
72+
            <s3>
73+
                <volumes>
74+
                    <main>
75+
                        <disk>s3_plain</disk>
76+
                    </main>
77+
                </volumes>
78+
            </s3>
79+
        </policies>
80+
    </storage_configuration>
81+
82+
    <backups>
83+
        <allowed_disk>s3_plain</allowed_disk>
84+
    </backups>
85+
</clickhouse>
86+
```
87+
88+
`BACKUP`/`RESTORE` for S3 disk is done in the same way as for local disk:
89+
90+
```sql
91+
BACKUP TABLE data TO Disk('s3_plain', 'cloud_backup');
92+
RESTORE TABLE data AS data_restored FROM Disk('s3_plain', 'cloud_backup');
93+
```
94+
95+
:::note
96+
- This disk should not be used for `MergeTree` itself, only for `BACKUP`/`RESTORE`
97+
- If your tables are backed by S3 storage and the types of the disks are different,
98+
ClickHouse doesn't use `CopyObject` calls to copy parts to the destination bucket. Instead,
99+
it downloads and uploads them, which is very inefficient. In this case prefer using
100+
the `BACKUP ... TO S3(<endpoint>)` syntax.
101+
:::
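
As a rough sketch of the `S3` form mentioned in the note, the same table could be backed up directly to a bucket. The values in angle brackets are placeholders, and the `cloud_backup` path suffix is only an assumption chosen to mirror the disk example above.

```sql
-- Placeholders in angle brackets must be replaced with your own values.
BACKUP TABLE data
    TO S3('<endpoint>/cloud_backup', '<access_key_id>', '<secret_access_key>');

RESTORE TABLE data AS data_restored
    FROM S3('<endpoint>/cloud_backup', '<access_key_id>', '<secret_access_key>');
```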
102+
103+
## Usage examples of backup/restore to local disk {#usage-examples}
104+
105+
### Backup and restore a table {#backup-and-restore-a-table}
106+
107+
<ExampleSetup/>
108+
109+
To back up the table, run:
110+
111+
```sql title="Query"
112+
BACKUP TABLE test_db.test_table TO Disk('backups', '1.zip')
113+
```
114+
115+
```response title="Response"
116+
┌─id───────────────────────────────────┬─status─────────┐
117+
1. │ 065a8baf-9db7-4393-9c3f-ba04d1e76bcd │ BACKUP_CREATED │
118+
└──────────────────────────────────────┴────────────────┘
119+
```
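
The `id` and `status` values in the response can also be looked up later in the `system.backups` table. The query below is a minimal sketch for checking recent operations; the `LIMIT 5` is an arbitrary choice for illustration.

```sql
-- List the most recent backup and restore operations with their status.
-- The error column is empty for operations that succeeded.
SELECT id, name, status, error, start_time, end_time
FROM system.backups
ORDER BY start_time DESC
LIMIT 5;
```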
120+
121+
The table can be restored from the backup using the following command if the table is empty:
122+
123+
```sql title="Query"
124+
RESTORE TABLE test_db.test_table FROM Disk('backups', '1.zip')
125+
```
126+
127+
```response title="Response"
128+
┌─id───────────────────────────────────┬─status───┐
129+
1. │ f29c753f-a7f2-4118-898e-0e4600cd2797 │ RESTORED │
130+
└──────────────────────────────────────┴──────────┘
131+
```
132+
133+
:::note
134+
The above `RESTORE` would fail if the table `test_db.test_table` contains data.
135+
The setting `allow_non_empty_tables=true` allows `RESTORE TABLE` to insert data
136+
into non-empty tables. This will mix earlier data in the table with the data extracted from the backup.
137+
This setting can therefore cause data duplication in the table, and should be used with caution.
138+
:::
139+
140+
To restore the table with data already in it, run:
141+
142+
```sql
143+
RESTORE TABLE test_db.test_table FROM Disk('backups', '1.zip')
144+
SETTINGS allow_non_empty_tables=true
145+
```
146+
147+
Tables can be restored, or backed up, with new names:
148+
149+
```sql
150+
RESTORE TABLE test_db.test_table AS test_db.test_table_renamed FROM Disk('backups', '1.zip')
151+
```
152+
153+
The backup archive for this backup has the following structure:
154+
155+
```text
156+
├── .backup
157+
└── metadata
158+
    └── test_db
159+
        └── test_table.sql
160+
```
161+
162+
<!-- TO DO:
163+
Explanation here about the backup format. See Issue 24a
164+
https://github.com/ClickHouse/clickhouse-docs/issues/3968
165+
-->
166+
167+
Formats other than zip can be used. See ["Backups as tar archives"](#backups-as-tar-archives)
168+
below for further details.
169+
170+
### Incremental backups to disk {#incremental-backups}
171+
172+
A base backup in ClickHouse is the initial, full backup from which the following
173+
incremental backups are created. Incremental backups only store the changes
174+
made since the base backup, so the base backup must be kept available to
175+
restore from any incremental backup. The location of the base backup is specified with the
176+
`base_backup` setting.
177+
178+
:::note
179+
Incremental backups depend on the base backup. The base backup must be kept available
180+
to be able to restore from an incremental backup.
181+
:::
182+
183+
To make an incremental backup of a table, first make a base backup:
184+
185+
```sql
186+
BACKUP TABLE test_db.test_table TO Disk('backups', 'd.zip')
187+
```

Next, make an incremental backup that points to the base backup through the `base_backup` setting:
188+
189+
```sql
190+
BACKUP TABLE test_db.test_table TO Disk('backups', 'incremental-a.zip')
191+
SETTINGS base_backup = Disk('backups', 'd.zip')
192+
```
193+
194+
All data from the incremental backup and the base backup can be restored into a
195+
new table `test_db.test_table2` with command:
196+
197+
```sql
198+
RESTORE TABLE test_db.test_table AS test_db.test_table2
199+
FROM Disk('backups', 'incremental-a.zip');
200+
```
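
A simple sanity check after the restore is to compare row counts between the source table and the restored copy. This sketch assumes nothing has been written to `test_db.test_table` since the incremental backup was taken; otherwise the counts would be expected to differ.

```sql
-- Compare the row count of the source table with the restored copy.
SELECT
    (SELECT count() FROM test_db.test_table)  AS source_rows,
    (SELECT count() FROM test_db.test_table2) AS restored_rows,
    source_rows = restored_rows               AS counts_match;
```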
201+
202+
### Securing a backup {#assign-a-password-to-the-backup}
203+
204+
Backups written to disk can have a password applied to the file.
205+
The password can be specified using the `password` setting:
206+
207+
```sql
208+
BACKUP TABLE test_db.test_table
209+
TO Disk('backups', 'password-protected.zip')
210+
SETTINGS password='qwerty'
211+
```
212+
213+
To restore a password-protected backup, the password must again
214+
be specified using the `password` setting:
215+
216+
```sql
217+
RESTORE TABLE test_db.test_table
218+
FROM Disk('backups', 'password-protected.zip')
219+
SETTINGS password='qwerty'
220+
```
221+
222+
### Backups as tar archives {#backups-as-tar-archives}
223+
224+
Backups can be stored not only as zip archives, but also as tar archives.
225+
The functionality is the same as for zip, except that password protection is not
226+
supported for tar archives. Additionally, tar archives support a variety of
227+
compression methods.
228+
229+
To make a backup of a table as a tar:
230+
231+
```sql
232+
BACKUP TABLE test_db.test_table TO Disk('backups', '1.tar')
233+
```
234+
235+
To restore from a tar archive:
236+
237+
```sql
238+
RESTORE TABLE test_db.test_table FROM Disk('backups', '1.tar')
239+
```
240+
241+
To change the compression method, the correct file suffix should be appended to
242+
the backup name. For example, to compress the tar archive using gzip run:
243+
244+
```sql
245+
BACKUP TABLE test_db.test_table TO Disk('backups', '1.tar.gz')
246+
```
247+
248+
The supported compression file suffixes are:
249+
- `.tar.gz`
250+
- `.tgz`
251+
- `.tar.bz2`
252+
- `.tar.lzma`
253+
- `.tar.zst`
254+
- `.tzst`
255+
- `.tar.xz`
256+
257+
### Compression settings {#compression-settings}
258+
259+
The compression method and level of compression can be specified using
260+
the `compression_method` and `compression_level` settings, respectively.
261+
262+
<!-- TO DO:
263+
More information needed on these settings and why you would want to do this
264+
-->
265+
266+
```sql
267+
BACKUP TABLE test_db.test_table
268+
TO Disk('backups', 'filename.zip')
269+
SETTINGS compression_method='lzma', compression_level=3
270+
```
271+
272+
### Restore specific partitions {#restore-specific-partitions}
273+
274+
If specific partitions associated with a table need to be restored, these can be specified.
275+
276+
Let's create a simple table with four partitions, insert some data into it, and then
277+
take a backup of only the first and fourth partitions:
278+
279+
<details>
280+
281+
<summary>Setup</summary>
282+
283+
```sql
284+
CREATE DATABASE IF NOT EXISTS test_db;
285+
286+
-- Create a partitioned table
287+
CREATE TABLE test_db.partitioned (
288+
    id UInt32,
289+
    data String,
290+
    partition_key UInt8
291+
) ENGINE = MergeTree()
292+
PARTITION BY partition_key
293+
ORDER BY id;
294+
295+
INSERT INTO test_db.partitioned VALUES
296+
(1, 'data1', 1),
297+
(2, 'data2', 2),
298+
(3, 'data3', 3),
299+
(4, 'data4', 4);
300+
301+
SELECT count() FROM test_db.partitioned;
302+
303+
SELECT partition_key, count()
304+
FROM test_db.partitioned
305+
GROUP BY partition_key
306+
ORDER BY partition_key;
307+
```
308+
309+
```response
310+
┌─count()─┐
311+
1. │ 4 │
312+
└─────────┘
313+
┌─partition_key─┬─count()─┐
314+
1. │ 1 │ 1 │
315+
2. │ 2 │ 1 │
316+
3. │ 3 │ 1 │
317+
4. │ 4 │ 1 │
318+
└───────────────┴─────────┘
319+
```
320+
321+
</details>
322+
323+
Run the following command to back up partitions 1 and 4:
324+
325+
```sql
326+
BACKUP TABLE test_db.partitioned PARTITIONS '1', '4'
327+
TO Disk('backups', 'partitioned.zip')
328+
```
329+
330+
Run the following command to restore partitions 1 and 4:
331+
332+
```sql
333+
RESTORE TABLE test_db.partitioned PARTITIONS '1', '4'
334+
FROM Disk('backups', 'partitioned.zip')
335+
SETTINGS allow_non_empty_tables=true
336+
```
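
To see the effect of the restore, the active parts per partition can be inspected through `system.parts`. This is a minimal sketch; if the table still held its original rows when the restore ran, partitions 1 and 4 would be expected to show duplicated rows, in line with the caution about `allow_non_empty_tables` above.

```sql
-- Count rows per partition across the active parts of the table.
SELECT partition, sum(rows) AS row_count
FROM system.parts
WHERE database = 'test_db' AND table = 'partitioned' AND active
GROUP BY partition
ORDER BY partition;
```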
