docs/best-practices/json_type.md (+1 -1)
@@ -9,7 +9,7 @@ show_related_blogs: true
doc_type: 'reference'
---
- ClickHouse now offers a native JSON column type designed for semi-structured and dynamic data. It's important to clarify that **this is a column type, not a data format**—you can insert JSON into ClickHouse as a string or via supported formats like [JSONEachRow](/docs/interfaces/formats/JSONEachRow), but that does not imply using the JSON column type. Users should only use the JSON type when the structure of their data is dynamic, not when they simply happen to store JSON.
+ ClickHouse now offers a native JSON column type designed for semi-structured and dynamic data. It's important to clarify that **this is a column type, not a data format**—you can insert JSON into ClickHouse as a string or via supported formats like [JSONEachRow](/interfaces/formats/JSONEachRow), but that does not imply using the JSON column type. Users should only use the JSON type when the structure of their data is dynamic, not when they simply happen to store JSON.
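For readers skimming the diff, a minimal sketch of what using the column type (rather than a JSON text format) looks like; the table and column names are hypothetical, and depending on your ClickHouse version the JSON type may need to be enabled explicitly:

```sql
-- Hypothetical table: the payload column itself is typed JSON
CREATE TABLE events
(
    id UInt64,
    payload JSON
)
ENGINE = MergeTree
ORDER BY id;
```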
## When to use the JSON type {#when-to-use-the-json-type}
docs/chdb/guides/querying-s3-bucket.md (+1 -1)
@@ -49,7 +49,7 @@ To do this, we can use the [`s3` table function](/sql-reference/table-functions/
If you pass just the bucket name, it will throw an exception.
:::
- We're also going to use the [`One`](/interfaces/formats#data-format-one) input format so that the file isn't parsed; instead, a single row is returned per file, and we can access the file via the `_file` virtual column and the path via the `_path` virtual column.
+ We're also going to use the [`One`](/interfaces/formats/One) input format so that the file isn't parsed; instead, a single row is returned per file, and we can access the file via the `_file` virtual column and the path via the `_path` virtual column.
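A minimal sketch of the pattern this hunk documents, assuming a hypothetical bucket URL with a glob:

```sql
-- Each matched file yields exactly one row; file contents are not parsed
SELECT _file, _path
FROM s3('https://mybucket.s3.amazonaws.com/data/**', 'One');
```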
docs/integrations/data-ingestion/data-formats/arrow-avro-orc.md (+5 -5)
@@ -15,7 +15,7 @@ Apache has released multiple data formats actively used in analytics environment
ClickHouse supports reading and writing [Apache Avro](https://avro.apache.org/) data files, which are widely used in Hadoop systems.
- To import from an [avro file](assets/data.avro), we should use the [Avro](/interfaces/formats.md/#data-format-avro) format in the `INSERT` statement:
+ To import from an [avro file](assets/data.avro), we should use the [Avro](/interfaces/formats/Avro) format in the `INSERT` statement:
```sql
INSERT INTO sometable
FROM INFILE 'data.avro'
FORMAT Avro
```
@@ -70,7 +70,7 @@ LIMIT 3;
### Avro messages in Kafka {#avro-messages-in-kafka}
- When Kafka messages use Avro format, ClickHouse can read such streams using the [AvroConfluent](/interfaces/formats.md/#data-format-avro-confluent) format and the [Kafka](/engines/table-engines/integrations/kafka.md) engine:
+ When Kafka messages use Avro format, ClickHouse can read such streams using the [AvroConfluent](/interfaces/formats/AvroConfluent) format and the [Kafka](/engines/table-engines/integrations/kafka.md) engine:
```sql
CREATE TABLE some_topic_stream
```
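The table definition above is truncated in this view; a hedged sketch of how such a definition might continue (columns, broker, topic, group, and registry URL are all illustrative), with `format_avro_schema_registry_url` pointing at a Confluent Schema Registry:

```sql
CREATE TABLE some_topic_stream
(
    field1 UInt32,
    field2 String
)
ENGINE = Kafka()
SETTINGS kafka_broker_list = 'localhost:9092',
         kafka_topic_list = 'some_topic',
         kafka_group_name = 'some_group',
         kafka_format = 'AvroConfluent',
         format_avro_schema_registry_url = 'http://localhost:8081';
```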
@@ -87,7 +87,7 @@ kafka_format = 'AvroConfluent';
## Working with Arrow format {#working-with-arrow-format}
- Another columnar format is [Apache Arrow](https://arrow.apache.org/), also supported by ClickHouse for import and export. To import data from an [Arrow file](assets/data.arrow), we use the [Arrow](/interfaces/formats.md/#data-format-arrow) format:
+ Another columnar format is [Apache Arrow](https://arrow.apache.org/), also supported by ClickHouse for import and export. To import data from an [Arrow file](assets/data.arrow), we use the [Arrow](/interfaces/formats/Arrow) format:
- The [ArrowStream](/interfaces/formats.md/#data-format-arrow-stream) format can be used to work with Arrow streaming (used for in-memory processing). ClickHouse can read and write Arrow streams.
+ The [ArrowStream](/interfaces/formats/ArrowStream) format can be used to work with Arrow streaming (used for in-memory processing). ClickHouse can read and write Arrow streams.
To demonstrate how ClickHouse can stream Arrow data, let's pipe it to the following Python script (it reads an input stream in the Arrow streaming format and outputs the result as a Pandas table):
@@ -140,7 +140,7 @@ We've used `arrow-stream` as a possible source of Arrow streaming data.
## Importing and exporting ORC data {#importing-and-exporting-orc-data}
- [Apache ORC](https://orc.apache.org/) format is a columnar storage format typically used for Hadoop. ClickHouse supports importing as well as exporting [Orc data](assets/data.orc) using the [ORC format](/interfaces/formats.md/#data-format-orc):
+ [Apache ORC](https://orc.apache.org/) format is a columnar storage format typically used for Hadoop. ClickHouse supports importing as well as exporting [Orc data](assets/data.orc) using the [ORC format](/interfaces/formats/ORC):
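A minimal sketch of the ORC round trip this hunk touches (file name illustrative):

```sql
-- Export to ORC...
SELECT * FROM sometable
INTO OUTFILE 'data.orc'
FORMAT ORC;

-- ...and load it back:
-- INSERT INTO sometable FROM INFILE 'data.orc' FORMAT ORC
```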
docs/integrations/data-ingestion/data-formats/binary.md (+9 -9)
@@ -16,7 +16,7 @@ We're going to use some_data [table](assets/some_data.sql) and [data](assets/som
## Exporting in a Native ClickHouse format {#exporting-in-a-native-clickhouse-format}
- The most efficient data format to export and import data between ClickHouse nodes is the [Native](/interfaces/formats.md/#native) format. Exporting is done using the `INTO OUTFILE` clause:
+ The most efficient data format to export and import data between ClickHouse nodes is the [Native](/interfaces/formats/Native) format. Exporting is done using the `INTO OUTFILE` clause:
```sql
SELECT * FROM some_data
```
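The export statement above is truncated in this view; it presumably continues along these lines (file name illustrative):

```sql
SELECT * FROM some_data
INTO OUTFILE 'export.native'
FORMAT Native;
```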
@@ -74,7 +74,7 @@ FORMAT Native
## Exporting to RowBinary {#exporting-to-rowbinary}
- Another binary format supported is [RowBinary](/interfaces/formats.md/#rowbinary), which allows importing and exporting data in binary-represented rows:
+ Another binary format supported is [RowBinary](/interfaces/formats/RowBinary), which allows importing and exporting data in binary-represented rows:
- Consider using [RowBinaryWithNames](/interfaces/formats.md/#rowbinarywithnames), which also adds a header row with the column list. [RowBinaryWithNamesAndTypes](/interfaces/formats.md/#rowbinarywithnamesandtypes) will also add an additional header row with column types.
+ Consider using [RowBinaryWithNames](/interfaces/formats/RowBinaryWithNames), which also adds a header row with the column list. [RowBinaryWithNamesAndTypes](/interfaces/formats/RowBinaryWithNamesAndTypes) will also add an additional header row with column types.
### Importing from RowBinary files {#importing-from-rowbinary-files}
To load data from a RowBinary file, we can use a `FROM INFILE` clause:
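The example that follows this sentence is collapsed in the diff; a hedged sketch of such an import (file name illustrative):

```sql
INSERT INTO some_data
FROM INFILE 'data.binary'
FORMAT RowBinary;
```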
@@ -115,7 +115,7 @@ FORMAT RowBinary
## Importing single binary value using RawBLOB {#importing-single-binary-value-using-rawblob}
Suppose we want to read an entire binary file and save it into a field in a table.
- This is the case when the [RawBLOB format](/interfaces/formats.md/#rawblob) can be used. This format can only be used directly with a single-column table:
+ This is the case when the [RawBLOB format](/interfaces/formats/RawBLOB) can be used. This format can only be used directly with a single-column table:
```sql
CREATE TABLE images(data String) ENGINE = Memory
```
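As a hedged usage sketch (output file name illustrative): exporting a stored blob back out needs `LIMIT 1`, matching the note referenced in the next hunk, since RawBLOB handles a single value only:

```sql
SELECT * FROM images LIMIT 1
INTO OUTFILE 'out.bin'
FORMAT RawBLOB;
```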
@@ -152,7 +152,7 @@ Note that we had to use `LIMIT 1` because exporting more than a single value wil
## MessagePack {#messagepack}
- ClickHouse supports importing and exporting to [MessagePack](https://msgpack.org/) using the [MsgPack](/interfaces/formats.md/#msgpack) format. To export to MessagePack format:
+ ClickHouse supports importing and exporting to [MessagePack](https://msgpack.org/) using the [MsgPack](/interfaces/formats/MsgPack) format. To export to MessagePack format:
```sql
SELECT *
```
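The export above is truncated in this view; a hedged sketch of the full round trip (file name illustrative):

```sql
SELECT * FROM some_data
INTO OUTFILE 'data.msgpack'
FORMAT MsgPack;

-- and back:
-- INSERT INTO some_data FROM INFILE 'data.msgpack' FORMAT MsgPack
```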
@@ -173,7 +173,7 @@ FORMAT MsgPack
<CloudNotSupportedBadge/>
- To work with [Protocol Buffers](/interfaces/formats.md/#protobuf), we first need to define a [schema file](assets/schema.proto):
+ To work with [Protocol Buffers](/interfaces/formats/Protobuf), we first need to define a [schema file](assets/schema.proto):
```protobuf
syntax = "proto3";
```
@@ -185,7 +185,7 @@ message MessageType {
- The path to this schema file (`schema.proto` in our case) is set via the `format_schema` settings option for the [Protobuf](/interfaces/formats.md/#protobuf) format:
+ The path to this schema file (`schema.proto` in our case) is set via the `format_schema` settings option for the [Protobuf](/interfaces/formats/Protobuf) format:
```sql
SELECT * FROM some_data
INTO OUTFILE 'proto.bin'
FORMAT Protobuf
SETTINGS format_schema = 'schema:MessageType'
```
- This saves data to the [proto.bin](assets/proto.bin) file. ClickHouse also supports importing Protobuf data as well as nested messages. Consider using [ProtobufSingle](/interfaces/formats.md/#protobufsingle) to work with a single Protocol Buffer message (length delimiters will be omitted in this case).
+ This saves data to the [proto.bin](assets/proto.bin) file. ClickHouse also supports importing Protobuf data as well as nested messages. Consider using [ProtobufSingle](/interfaces/formats/ProtobufSingle) to work with a single Protocol Buffer message (length delimiters will be omitted in this case).
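A hedged sketch of the `ProtobufSingle` variant mentioned above, reusing the same schema (output file name illustrative):

```sql
-- One message, no length delimiter
SELECT * FROM some_data LIMIT 1
INTO OUTFILE 'message.bin'
FORMAT ProtobufSingle
SETTINGS format_schema = 'schema:MessageType';
```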
## Cap'n Proto {#capn-proto}
@@ -212,7 +212,7 @@ struct PathStats {
- Now we can import and export using the [CapnProto](/interfaces/formats.md/#capnproto) format and this schema:
+ Now we can import and export using the [CapnProto](/interfaces/formats/CapnProto) format and this schema:
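A hedged sketch mirroring the Protobuf examples, assuming the schema file is named `schema.capnp` and using the `PathStats` struct from the hunk header above:

```sql
SELECT * FROM some_data
INTO OUTFILE 'data.capnp'
FORMAT CapnProto
SETTINGS format_schema = 'schema:PathStats';
```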
docs/integrations/data-ingestion/data-formats/csv-tsv.md (+9 -9)
@@ -31,7 +31,7 @@ To import data from the [CSV file](assets/data_small.csv) to the `sometable` tab
```bash
clickhouse-client -q "INSERT INTO sometable FORMAT CSV" < data_small.csv
```
- Note that we use [FORMAT CSV](/interfaces/formats.md/#csv) to let ClickHouse know we're ingesting CSV-formatted data. Alternatively, we can load data from a local file using the [FROM INFILE](/sql-reference/statements/insert-into.md/#inserting-data-from-a-file) clause:
+ Note that we use [FORMAT CSV](/interfaces/formats/CSV) to let ClickHouse know we're ingesting CSV-formatted data. Alternatively, we can load data from a local file using the [FROM INFILE](/sql-reference/statements/insert-into.md/#inserting-data-from-a-file) clause:
```sql
INSERT INTO sometable
FROM INFILE 'data_small.csv'
FORMAT CSV
```
@@ -59,7 +59,7 @@ head data-small-headers.csv
```
"Aegithina_tiphia","2018-02-01",34
```
- To import data from this file, we can use the [CSVWithNames](/interfaces/formats.md/#csvwithnames) format:
+ To import data from this file, we can use the [CSVWithNames](/interfaces/formats/CSVWithNames) format:
```bash
clickhouse-client -q "INSERT INTO sometable FORMAT CSVWithNames" < data_small_headers.csv
```
@@ -153,17 +153,17 @@ SELECT * FROM file('nulls.csv')
- The tab-separated data format is widely used as a data interchange format. To load data from a [TSV file](assets/data_small.tsv) to ClickHouse, the [TabSeparated](/interfaces/formats.md/#tabseparated) format is used:
+ The tab-separated data format is widely used as a data interchange format. To load data from a [TSV file](assets/data_small.tsv) to ClickHouse, the [TabSeparated](/interfaces/formats/TabSeparated) format is used:
```bash
clickhouse-client -q "INSERT INTO sometable FORMAT TabSeparated" < data_small.tsv
```
- There's also a [TabSeparatedWithNames](/interfaces/formats.md/#tabseparatedwithnames) format to allow working with TSV files that have headers. And, as with CSV, we can skip the first N lines using the [input_format_tsv_skip_first_lines](/operations/settings/settings-formats.md/#input_format_tsv_skip_first_lines) option.
+ There's also a [TabSeparatedWithNames](/interfaces/formats/TabSeparatedWithNames) format to allow working with TSV files that have headers. And, as with CSV, we can skip the first N lines using the [input_format_tsv_skip_first_lines](/operations/settings/settings-formats.md/#input_format_tsv_skip_first_lines) option.
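A hedged sketch of the skip-lines option in action (the value 10 is illustrative):

```sql
INSERT INTO sometable
FROM INFILE 'data_small.tsv'
SETTINGS input_format_tsv_skip_first_lines = 10
FORMAT TabSeparated
```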
### Raw TSV {#raw-tsv}
- Sometimes, TSV files are saved without escaping tabs and line breaks. We should use [TabSeparatedRaw](/interfaces/formats.md/#tabseparatedraw) to handle such files.
+ Sometimes, TSV files are saved without escaping tabs and line breaks. We should use [TabSeparatedRaw](/interfaces/formats/TabSeparatedRaw) to handle such files.
- To add a header to the CSV file, we use the [CSVWithNames](/interfaces/formats.md/#csvwithnames) format:
+ To add a header to the CSV file, we use the [CSVWithNames](/interfaces/formats/CSVWithNames) format:
```sql
SELECT *
```
@@ -273,7 +273,7 @@ All column types will be treated as a `String` in this case.
### Exporting and importing CSV with explicit column types {#exporting-and-importing-csv-with-explicit-column-types}
- ClickHouse also allows explicitly setting column types when exporting data using [CSVWithNamesAndTypes](/interfaces/formats.md/#csvwithnamesandtypes) (and other formats in the *WithNames family):
+ ClickHouse also allows explicitly setting column types when exporting data using [CSVWithNamesAndTypes](/interfaces/formats/CSVWithNamesAndTypes) (and other formats in the *WithNames family):
```sql
SELECT *
```
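The export statement above is truncated in this view; a hedged completion (file name illustrative):

```sql
SELECT * FROM sometable
INTO OUTFILE 'data.csv'
FORMAT CSVWithNamesAndTypes
```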
@@ -308,7 +308,7 @@ Now ClickHouse identifies column types based on a (second) header row instead of
## Custom delimiters, separators, and escaping rules {#custom-delimiters-separators-and-escaping-rules}
- In sophisticated cases, text data can be formatted in a highly custom manner but still have a structure. ClickHouse has a special [CustomSeparated](/interfaces/formats.md/#format-customseparated) format for such cases, which allows setting custom escaping rules, delimiters, line separators, and starting/ending symbols.
+ In sophisticated cases, text data can be formatted in a highly custom manner but still have a structure. ClickHouse has a special [CustomSeparated](/interfaces/formats/CustomSeparated) format for such cases, which allows setting custom escaping rules, delimiters, line separators, and starting/ending symbols.
Suppose we have the following data in the file:
@@ -341,7 +341,7 @@ LIMIT 3
- We can also use [CustomSeparatedWithNames](/interfaces/formats.md/#customseparatedwithnames) to get headers exported and imported correctly. Explore [regex and template](templates-regex.md) formats to deal with even more complex cases.
+ We can also use [CustomSeparatedWithNames](/interfaces/formats/CustomSeparatedWithNames) to get headers exported and imported correctly. Explore [regex and template](templates-regex.md) formats to deal with even more complex cases.
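A hedged sketch of the kind of settings involved (file name, delimiter, and escaping rule are illustrative):

```sql
SELECT * FROM file('custom.txt', 'CustomSeparated')
SETTINGS format_custom_escaping_rule = 'Quoted',
         format_custom_field_delimiter = '|';
```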
## Working with large CSV files {#working-with-large-csv-files}
- Or we can use [`JSONCompactEachRow`](/interfaces/formats#jsoncompacteachrow) to save disk space by skipping column names:
+ Or we can use [`JSONCompactEachRow`](/interfaces/formats/JSONCompactEachRow) to save disk space by skipping column names:
```sql
SELECT * FROM sometable FORMAT JSONCompactEachRow
```
@@ -32,7 +32,7 @@ SELECT * FROM sometable FORMAT JSONCompactEachRow
## Overriding data types as strings {#overriding-data-types-as-strings}
- ClickHouse respects data types and will export JSON according to standards. But in cases where we need to have all values encoded as strings, we can use the [JSONStringsEachRow](/interfaces/formats.md/#jsonstringseachrow) format:
+ ClickHouse respects data types and will export JSON according to standards. But in cases where we need to have all values encoded as strings, we can use the [JSONStringsEachRow](/interfaces/formats/JSONStringsEachRow) format:
```sql
SELECT * FROM sometable FORMAT JSONStringsEachRow
```
@@ -56,7 +56,7 @@ SELECT * FROM sometable FORMAT JSONCompactStringsEachRow
## Exporting metadata together with data {#exporting-metadata-together-with-data}
- The general [JSON](/interfaces/formats.md/#json) format, which is popular in apps, will export not only the resulting data but also column types and query stats:
+ The general [JSON](/interfaces/formats/JSON) format, which is popular in apps, will export not only the resulting data but also column types and query stats:
```sql
SELECT * FROM sometable FORMAT JSON
```
@@ -93,7 +93,7 @@ SELECT * FROM sometable FORMAT JSON
- The [JSONCompact](/interfaces/formats.md/#jsoncompact) format will print the same metadata but use a compacted form for the data itself:
+ The [JSONCompact](/interfaces/formats/JSONCompact) format will print the same metadata but use a compacted form for the data itself:
```sql
SELECT * FROM sometable FORMAT JSONCompact
```
@@ -127,11 +127,11 @@ SELECT * FROM sometable FORMAT JSONCompact
- Consider the [`JSONStrings`](/interfaces/formats.md/#jsonstrings) or [`JSONCompactStrings`](/interfaces/formats.md/#jsoncompactstrings) variants to encode all values as strings.
+ Consider the [`JSONStrings`](/interfaces/formats/JSONStrings) or [`JSONCompactStrings`](/interfaces/formats/JSONCompactStrings) variants to encode all values as strings.
## Compact way to export JSON data and structure {#compact-way-to-export-json-data-and-structure}
- A more efficient way to export data, as well as its structure, is to use the [`JSONCompactEachRowWithNamesAndTypes`](/interfaces/formats.md/#jsoncompacteachrowwithnamesandtypes) format:
+ A more efficient way to export data, as well as its structure, is to use the [`JSONCompactEachRowWithNamesAndTypes`](/interfaces/formats/JSONCompactEachRowWithNamesAndTypes) format:
```sql
SELECT * FROM sometable FORMAT JSONCompactEachRowWithNamesAndTypes
```
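A hedged sketch of reading such an export back, letting the two header rows supply column names and types (file name illustrative):

```sql
SELECT * FROM file('data.json', 'JSONCompactEachRowWithNamesAndTypes');
```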