
Commit 81ec9d1

Merge pull request #4633 from Blargian/remove_unneeded_sections
Formats improvement: remove duplicate format listing
2 parents 578a3c2 + 15603c7 commit 81ec9d1

19 files changed: +141 -141 lines changed

docs/best-practices/json_type.md

Lines changed: 1 addition & 1 deletion
````diff
@@ -9,7 +9,7 @@ show_related_blogs: true
 doc_type: 'reference'
 ---
 
-ClickHouse now offers a native JSON column type designed for semi-structured and dynamic data. It's important to clarify that **this is a column type, not a data format**—you can insert JSON into ClickHouse as a string or via supported formats like [JSONEachRow](/docs/interfaces/formats/JSONEachRow), but that does not imply using the JSON column type. Users should only use the JSON type when the structure of their data is dynamic, not when they simply happen to store JSON.
+ClickHouse now offers a native JSON column type designed for semi-structured and dynamic data. It's important to clarify that **this is a column type, not a data format**—you can insert JSON into ClickHouse as a string or via supported formats like [JSONEachRow](/interfaces/formats/JSONEachRow), but that does not imply using the JSON column type. Users should only use the JSON type when the structure of their data is dynamic, not when they simply happen to store JSON.
 
 ## When to use the JSON type {#when-to-use-the-json-type}
 
````
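To make the distinction drawn in that paragraph concrete, here is a minimal sketch (table and column names are illustrative assumptions, not part of this commit): the first table receives JSON-formatted rows into fixed, typed columns, while the second actually declares a JSON column for dynamic payloads.

```sql
-- JSONEachRow is only an input format here; the columns themselves are strongly typed.
CREATE TABLE events_typed (user_id UInt64, action String) ENGINE = MergeTree ORDER BY user_id;
INSERT INTO events_typed FORMAT JSONEachRow {"user_id": 1, "action": "click"}

-- The JSON *type* is for genuinely dynamic structure
-- (older ClickHouse versions may require enabling the JSON type setting).
CREATE TABLE events_dynamic (payload JSON) ENGINE = MergeTree ORDER BY tuple();
INSERT INTO events_dynamic FORMAT JSONEachRow {"payload": {"user": {"id": 1}, "tags": ["a", "b"]}}
```
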
docs/chdb/guides/querying-s3-bucket.md

Lines changed: 1 addition & 1 deletion
````diff
@@ -49,7 +49,7 @@ To do this, we can use the [`s3` table function](/sql-reference/table-functions/
 If you pass just the bucket name it will throw an exception.
 :::
 
-We're also going to use the [`One`](/interfaces/formats#data-format-one) input format so that the file isn't parsed, instead a single row is returned per file and we can access the file via the `_file` virtual column and the path via the `_path` virtual column.
+We're also going to use the [`One`](/interfaces/formats/One) input format so that the file isn't parsed, instead a single row is returned per file and we can access the file via the `_file` virtual column and the path via the `_path` virtual column.
 
 ```python
 import chdb
````
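In SQL terms, the query chdb runs for this pattern looks roughly like the sketch below (the bucket URL is a placeholder, not taken from the guide):

```sql
-- The One format yields one row per matching file without parsing its contents,
-- so only the _file and _path virtual columns carry information.
SELECT _file, _path
FROM s3('https://example-bucket.s3.amazonaws.com/data/**', NOSIGN, 'One');
```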

docs/integrations/data-ingestion/clickpipes/kinesis.md

Lines changed: 1 addition & 1 deletion
````diff
@@ -86,7 +86,7 @@ You have familiarized yourself with the [ClickPipes intro](./index.md) and setup
 ## Supported data formats {#supported-data-formats}
 
 The supported formats are:
-- [JSON](../../../interfaces/formats.md/#json)
+- [JSON](/interfaces/formats/JSON)
 
 ## Supported data types {#supported-data-types}
 
````

docs/integrations/data-ingestion/data-formats/arrow-avro-orc.md

Lines changed: 5 additions & 5 deletions
````diff
@@ -15,7 +15,7 @@ Apache has released multiple data formats actively used in analytics environment
 
 ClickHouse supports reading and writing [Apache Avro](https://avro.apache.org/) data files, which are widely used in Hadoop systems.
 
-To import from an [avro file](assets/data.avro), we should use [Avro](/interfaces/formats.md/#data-format-avro) format in the `INSERT` statement:
+To import from an [avro file](assets/data.avro), we should use [Avro](/interfaces/formats/Avro) format in the `INSERT` statement:
 
 ```sql
 INSERT INTO sometable
````
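For reference, a complete statement in the spirit of that snippet (the file path is an assumption) could look like this:

```sql
-- Load an Avro file from the client's local filesystem into an existing table.
INSERT INTO sometable
FROM INFILE 'data.avro'
FORMAT Avro;
```
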
````diff
@@ -70,7 +70,7 @@ LIMIT 3;
 
 ### Avro messages in Kafka {#avro-messages-in-kafka}
 
-When Kafka messages use Avro format, ClickHouse can read such streams using [AvroConfluent](/interfaces/formats.md/#data-format-avro-confluent) format and [Kafka](/engines/table-engines/integrations/kafka.md) engine:
+When Kafka messages use Avro format, ClickHouse can read such streams using [AvroConfluent](/interfaces/formats/AvroConfluent) format and [Kafka](/engines/table-engines/integrations/kafka.md) engine:
 
 ```sql
 CREATE TABLE some_topic_stream
````
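A rough sketch of such a Kafka-engine table (broker address, topic, and columns are assumptions; the Confluent schema registry URL is supplied separately via the `format_avro_schema_registry_url` setting):

```sql
CREATE TABLE some_topic_stream
(
    field1 UInt32,
    field2 String
)
ENGINE = Kafka
SETTINGS
    kafka_broker_list = 'localhost:9092',
    kafka_topic_list = 'some_topic',
    kafka_group_name = 'some_group',
    kafka_format = 'AvroConfluent';
```
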
````diff
@@ -87,7 +87,7 @@ kafka_format = 'AvroConfluent';
 
 ## Working with Arrow format {#working-with-arrow-format}
 
-Another columnar format is [Apache Arrow](https://arrow.apache.org/), also supported by ClickHouse for import and export. To import data from an [Arrow file](assets/data.arrow), we use the [Arrow](/interfaces/formats.md/#data-format-arrow) format:
+Another columnar format is [Apache Arrow](https://arrow.apache.org/), also supported by ClickHouse for import and export. To import data from an [Arrow file](assets/data.arrow), we use the [Arrow](/interfaces/formats/Arrow) format:
 
 ```sql
 INSERT INTO sometable
````
````diff
@@ -107,7 +107,7 @@ Also, check [data types matching](/interfaces/formats/Arrow#data-types-matching)
 
 ### Arrow data streaming {#arrow-data-streaming}
 
-The [ArrowStream](/interfaces/formats.md/#data-format-arrow-stream) format can be used to work with Arrow streaming (used for in-memory processing). ClickHouse can read and write Arrow streams.
+The [ArrowStream](/interfaces/formats/ArrowStream) format can be used to work with Arrow streaming (used for in-memory processing). ClickHouse can read and write Arrow streams.
 
 To demonstrate how ClickHouse can stream Arrow data, let's pipe it to the following python script (it reads input stream in Arrow streaming format and outputs the result as a Pandas table):
 
````
````diff
@@ -140,7 +140,7 @@ We've used `arrow-stream` as a possible source of Arrow streaming data.
 
 ## Importing and exporting ORC data {#importing-and-exporting-orc-data}
 
-[Apache ORC](https://orc.apache.org/) format is a columnar storage format typically used for Hadoop. ClickHouse supports importing as well as exporting [Orc data](assets/data.orc) using [ORC format](/interfaces/formats.md/#data-format-orc):
+[Apache ORC](https://orc.apache.org/) format is a columnar storage format typically used for Hadoop. ClickHouse supports importing as well as exporting [Orc data](assets/data.orc) using [ORC format](/interfaces/formats/ORC):
 
 ```sql
 SELECT *
````
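A rough export/import round trip with the ORC format (table and file names are assumptions) might look like:

```sql
-- Export a table to an ORC file...
SELECT * FROM sometable
INTO OUTFILE 'export.orc'
FORMAT ORC;

-- ...and load it back into another table with the same structure.
INSERT INTO another_table
FROM INFILE 'export.orc'
FORMAT ORC;
```
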

docs/integrations/data-ingestion/data-formats/binary.md

Lines changed: 9 additions & 9 deletions
````diff
@@ -16,7 +16,7 @@ We're going to use some_data [table](assets/some_data.sql) and [data](assets/som
 
 ## Exporting in a Native ClickHouse format {#exporting-in-a-native-clickhouse-format}
 
-The most efficient data format to export and import data between ClickHouse nodes is [Native](/interfaces/formats.md/#native) format. Exporting is done using `INTO OUTFILE` clause:
+The most efficient data format to export and import data between ClickHouse nodes is [Native](/interfaces/formats/Native) format. Exporting is done using `INTO OUTFILE` clause:
 
 ```sql
 SELECT * FROM some_data
````
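For context, an `INTO OUTFILE` export in the Native format along the lines described above (the file name is an assumption):

```sql
-- Native writes column-oriented binary data, so it is cheap to produce and re-ingest.
SELECT * FROM some_data
INTO OUTFILE 'data.native'
FORMAT Native;
```
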
````diff
@@ -74,7 +74,7 @@ FORMAT Native
 
 ## Exporting to RowBinary {#exporting-to-rowbinary}
 
-Another binary format supported is [RowBinary](/interfaces/formats.md/#rowbinary), which allows importing and exporting data in binary-represented rows:
+Another binary format supported is [RowBinary](/interfaces/formats/RowBinary), which allows importing and exporting data in binary-represented rows:
 
 ```sql
 SELECT * FROM some_data
````
````diff
@@ -101,7 +101,7 @@ LIMIT 5
 └────────────────────────────────┴────────────┴──────┘
 ```
 
-Consider using [RowBinaryWithNames](/interfaces/formats.md/#rowbinarywithnames), which also adds a header row with a columns list. [RowBinaryWithNamesAndTypes](/interfaces/formats.md/#rowbinarywithnamesandtypes) will also add an additional header row with column types.
+Consider using [RowBinaryWithNames](/interfaces/formats/RowBinaryWithNames), which also adds a header row with a columns list. [RowBinaryWithNamesAndTypes](/interfaces/formats/RowBinaryWithNamesAndTypes) will also add an additional header row with column types.
 
 ### Importing from RowBinary files {#importing-from-rowbinary-files}
 To load data from a RowBinary file, we can use a `FROM INFILE` clause:
````
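Such a load could look roughly like this (the file name is an assumption):

```sql
-- A minimal FROM INFILE load; swap in RowBinaryWithNamesAndTypes if the file carries header rows.
INSERT INTO some_data
FROM INFILE 'data.binary'
FORMAT RowBinary;
```
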
````diff
@@ -115,7 +115,7 @@ FORMAT RowBinary
 ## Importing single binary value using RawBLOB {#importing-single-binary-value-using-rawblob}
 
 Suppose we want to read an entire binary file and save it into a field in a table.
-This is the case when the [RawBLOB format](/interfaces/formats.md/#rawblob) can be used. This format can be directly used with a single-column table only:
+This is the case when the [RawBLOB format](/interfaces/formats/RawBLOB) can be used. This format can be directly used with a single-column table only:
 
 ```sql
 CREATE TABLE images(data String) ENGINE = Memory
````
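As a loose illustration of using RawBLOB with such a single-column table (the file name is an assumption):

```sql
-- The whole file becomes one value in the single String column.
INSERT INTO images
FROM INFILE 'image.jpg'
FORMAT RawBLOB;
```
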
````diff
@@ -152,7 +152,7 @@ Note that we had to use `LIMIT 1` because exporting more than a single value wil
 
 ## MessagePack {#messagepack}
 
-ClickHouse supports importing and exporting to [MessagePack](https://msgpack.org/) using the [MsgPack](/interfaces/formats.md/#msgpack). To export to MessagePack format:
+ClickHouse supports importing and exporting to [MessagePack](https://msgpack.org/) using the [MsgPack](/interfaces/formats/MsgPack). To export to MessagePack format:
 
 ```sql
 SELECT *
````
````diff
@@ -173,7 +173,7 @@ FORMAT MsgPack
 
 <CloudNotSupportedBadge/>
 
-To work with [Protocol Buffers](/interfaces/formats.md/#protobuf) we first need to define a [schema file](assets/schema.proto):
+To work with [Protocol Buffers](/interfaces/formats/Protobuf) we first need to define a [schema file](assets/schema.proto):
 
 ```protobuf
 syntax = "proto3";
````
````diff
@@ -185,7 +185,7 @@ message MessageType {
 };
 ```
 
-Path to this schema file (`schema.proto` in our case) is set in a `format_schema` settings option for the [Protobuf](/interfaces/formats.md/#protobuf) format:
+Path to this schema file (`schema.proto` in our case) is set in a `format_schema` settings option for the [Protobuf](/interfaces/formats/Protobuf) format:
 
 ```sql
 SELECT * FROM some_data
````
````diff
@@ -194,7 +194,7 @@ FORMAT Protobuf
 SETTINGS format_schema = 'schema:MessageType'
 ```
 
-This saves data to the [proto.bin](assets/proto.bin) file. ClickHouse also supports importing Protobuf data as well as nested messages. Consider using [ProtobufSingle](/interfaces/formats.md/#protobufsingle) to work with a single Protocol Buffer message (length delimiters will be omitted in this case).
+This saves data to the [proto.bin](assets/proto.bin) file. ClickHouse also supports importing Protobuf data as well as nested messages. Consider using [ProtobufSingle](/interfaces/formats/ProtobufSingle) to work with a single Protocol Buffer message (length delimiters will be omitted in this case).
 
 ## Cap'n Proto {#capn-proto}
 
````
````diff
@@ -212,7 +212,7 @@ struct PathStats {
 }
 ```
 
-Now we can import and export using [CapnProto](/interfaces/formats.md/#capnproto) format and this schema:
+Now we can import and export using [CapnProto](/interfaces/formats/CapnProto) format and this schema:
 
 ```sql
 SELECT
````

docs/integrations/data-ingestion/data-formats/csv-tsv.md

Lines changed: 9 additions & 9 deletions
````diff
@@ -31,7 +31,7 @@ To import data from the [CSV file](assets/data_small.csv) to the `sometable` tab
 clickhouse-client -q "INSERT INTO sometable FORMAT CSV" < data_small.csv
 ```
 
-Note that we use [FORMAT CSV](/interfaces/formats.md/#csv) to let ClickHouse know we're ingesting CSV formatted data. Alternatively, we can load data from a local file using the [FROM INFILE](/sql-reference/statements/insert-into.md/#inserting-data-from-a-file) clause:
+Note that we use [FORMAT CSV](/interfaces/formats/CSV) to let ClickHouse know we're ingesting CSV formatted data. Alternatively, we can load data from a local file using the [FROM INFILE](/sql-reference/statements/insert-into.md/#inserting-data-from-a-file) clause:
 
 ```sql
 INSERT INTO sometable
````
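Such a `FROM INFILE` load, in the spirit of that paragraph, reads roughly as follows (the file name mirrors the surrounding guide; treat it as illustrative):

```sql
-- Equivalent to piping the file into clickhouse-client with FORMAT CSV.
INSERT INTO sometable
FROM INFILE 'data_small.csv'
FORMAT CSV;
```
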
````diff
@@ -59,7 +59,7 @@ head data-small-headers.csv
 "Aegithina_tiphia","2018-02-01",34
 ```
 
-To import data from this file, we can use [CSVWithNames](/interfaces/formats.md/#csvwithnames) format:
+To import data from this file, we can use [CSVWithNames](/interfaces/formats/CSVWithNames) format:
 
 ```bash
 clickhouse-client -q "INSERT INTO sometable FORMAT CSVWithNames" < data_small_headers.csv
````
````diff
@@ -153,17 +153,17 @@ SELECT * FROM file('nulls.csv')
 
 ## TSV (tab-separated) files {#tsv-tab-separated-files}
 
-Tab-separated data format is widely used as a data interchange format. To load data from a [TSV file](assets/data_small.tsv) to ClickHouse, the [TabSeparated](/interfaces/formats.md/#tabseparated) format is used:
+Tab-separated data format is widely used as a data interchange format. To load data from a [TSV file](assets/data_small.tsv) to ClickHouse, the [TabSeparated](/interfaces/formats/TabSeparated) format is used:
 
 ```bash
 clickhouse-client -q "INSERT INTO sometable FORMAT TabSeparated" < data_small.tsv
 ```
 
-There's also a [TabSeparatedWithNames](/interfaces/formats.md/#tabseparatedwithnames) format to allow working with TSV files that have headers. And, like for CSV, we can skip the first X lines using the [input_format_tsv_skip_first_lines](/operations/settings/settings-formats.md/#input_format_tsv_skip_first_lines) option.
+There's also a [TabSeparatedWithNames](/interfaces/formats/TabSeparatedWithNames) format to allow working with TSV files that have headers. And, like for CSV, we can skip the first X lines using the [input_format_tsv_skip_first_lines](/operations/settings/settings-formats.md/#input_format_tsv_skip_first_lines) option.
 
 ### Raw TSV {#raw-tsv}
 
-Sometimes, TSV files are saved without escaping tabs and line breaks. We should use [TabSeparatedRaw](/interfaces/formats.md/#tabseparatedraw) to handle such files.
+Sometimes, TSV files are saved without escaping tabs and line breaks. We should use [TabSeparatedRaw](/interfaces/formats/TabSeparatedRaw) to handle such files.
 
 ## Exporting to CSV {#exporting-to-csv}
 
````
````diff
@@ -183,7 +183,7 @@ FORMAT CSV
 "2016_Greater_Western_Sydney_Giants_season","2017-05-01",86
 ```
 
-To add a header to the CSV file, we use the [CSVWithNames](/interfaces/formats.md/#csvwithnames) format:
+To add a header to the CSV file, we use the [CSVWithNames](/interfaces/formats/CSVWithNames) format:
 
 ```sql
 SELECT *
````
````diff
@@ -273,7 +273,7 @@ All column types will be treated as a `String` in this case.
 
 ### Exporting and importing CSV with explicit column types {#exporting-and-importing-csv-with-explicit-column-types}
 
-ClickHouse also allows explicitly setting column types when exporting data using [CSVWithNamesAndTypes](/interfaces/formats.md/#csvwithnamesandtypes) (and other *WithNames formats family):
+ClickHouse also allows explicitly setting column types when exporting data using [CSVWithNamesAndTypes](/interfaces/formats/CSVWithNamesAndTypes) (and other *WithNames formats family):
 
 ```sql
 SELECT *
````
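Roughly, such an export adds two header rows, column names and then column types, ahead of the data; the output sketched in the comments below is illustrative:

```sql
SELECT * FROM sometable
LIMIT 1
FORMAT CSVWithNamesAndTypes;

-- "path","month","hits"        <- column names
-- "String","Date","UInt64"     <- column types
-- "Aegithina_tiphia","2018-02-01",34   <- data rows follow
```
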
````diff
@@ -308,7 +308,7 @@ Now ClickHouse identifies column types based on a (second) header row instead of
 
 ## Custom delimiters, separators, and escaping rules {#custom-delimiters-separators-and-escaping-rules}
 
-In sophisticated cases, text data can be formatted in a highly custom manner but still have a structure. ClickHouse has a special [CustomSeparated](/interfaces/formats.md/#format-customseparated) format for such cases, which allows setting custom escaping rules, delimiters, line separators, and starting/ending symbols.
+In sophisticated cases, text data can be formatted in a highly custom manner but still have a structure. ClickHouse has a special [CustomSeparated](/interfaces/formats/CustomSeparated) format for such cases, which allows setting custom escaping rules, delimiters, line separators, and starting/ending symbols.
 
 Suppose we have the following data in the file:
 
````
````diff
@@ -341,7 +341,7 @@ LIMIT 3
 └───────────────────────────┴────────────┴─────┘
 ```
 
-We can also use [CustomSeparatedWithNames](/interfaces/formats.md/#customseparatedwithnames) to get headers exported and imported correctly. Explore [regex and template](templates-regex.md) formats to deal with even more complex cases.
+We can also use [CustomSeparatedWithNames](/interfaces/formats/CustomSeparatedWithNames) to get headers exported and imported correctly. Explore [regex and template](templates-regex.md) formats to deal with even more complex cases.
 
 ## Working with large CSV files {#working-with-large-csv-files}
 
````
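A hedged sketch of reading such a file with the CustomSeparated family; the delimiters, file name, and escaping rule below are illustrative assumptions, not taken from the guide:

```sql
-- Custom row and field delimiters are supplied via format_custom_* settings.
SELECT * FROM file('custom.txt', 'CustomSeparatedWithNames')
SETTINGS
    format_custom_escaping_rule = 'Quoted',
    format_custom_field_delimiter = ';',
    format_custom_row_before_delimiter = 'row(',
    format_custom_row_after_delimiter = ')\n';
```
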

docs/integrations/data-ingestion/data-formats/json/exporting.md

Lines changed: 7 additions & 7 deletions
````diff
@@ -8,7 +8,7 @@ doc_type: 'guide'
 
 # Exporting JSON
 
-Almost any JSON format used for import can be used for export as well. The most popular is [`JSONEachRow`](/interfaces/formats.md/#jsoneachrow):
+Almost any JSON format used for import can be used for export as well. The most popular is [`JSONEachRow`](/interfaces/formats/JSONEachRow):
 
 ```sql
 SELECT * FROM sometable FORMAT JSONEachRow
````
````diff
@@ -19,7 +19,7 @@ SELECT * FROM sometable FORMAT JSONEachRow
 {"path":"Ahmadabad-e_Kalij-e_Sofla","month":"2017-01-01","hits":3}
 ```
 
-Or we can use [`JSONCompactEachRow`](/interfaces/formats#jsoncompacteachrow) to save disk space by skipping column names:
+Or we can use [`JSONCompactEachRow`](/interfaces/formats/JSONCompactEachRow) to save disk space by skipping column names:
 
 ```sql
 SELECT * FROM sometable FORMAT JSONCompactEachRow
````
````diff
@@ -32,7 +32,7 @@ SELECT * FROM sometable FORMAT JSONCompactEachRow
 
 ## Overriding data types as strings {#overriding-data-types-as-strings}
 
-ClickHouse respects data types and will export JSON accordingly to standards. But in cases where we need to have all values encoded as strings, we can use the [JSONStringsEachRow](/interfaces/formats.md/#jsonstringseachrow) format:
+ClickHouse respects data types and will export JSON accordingly to standards. But in cases where we need to have all values encoded as strings, we can use the [JSONStringsEachRow](/interfaces/formats/JSONStringsEachRow) format:
 
 ```sql
 SELECT * FROM sometable FORMAT JSONStringsEachRow
````
````diff
@@ -56,7 +56,7 @@ SELECT * FROM sometable FORMAT JSONCompactStringsEachRow
 
 ## Exporting metadata together with data {#exporting-metadata-together-with-data}
 
-General [JSON](/interfaces/formats.md/#json) format, which is popular in apps, will export not only resulting data but column types and query stats:
+General [JSON](/interfaces/formats/JSON) format, which is popular in apps, will export not only resulting data but column types and query stats:
 
 ```sql
 SELECT * FROM sometable FORMAT JSON
````
````diff
@@ -93,7 +93,7 @@ SELECT * FROM sometable FORMAT JSON
 }
 ```
 
-The [JSONCompact](/interfaces/formats.md/#jsoncompact) format will print the same metadata but use a compacted form for the data itself:
+The [JSONCompact](/interfaces/formats/JSONCompact) format will print the same metadata but use a compacted form for the data itself:
 
 ```sql
 SELECT * FROM sometable FORMAT JSONCompact
````
````diff
@@ -127,11 +127,11 @@ SELECT * FROM sometable FORMAT JSONCompact
 }
 ```
 
-Consider [`JSONStrings`](/interfaces/formats.md/#jsonstrings) or [`JSONCompactStrings`](/interfaces/formats.md/#jsoncompactstrings) variants to encode all values as strings.
+Consider [`JSONStrings`](/interfaces/formats/JSONStrings) or [`JSONCompactStrings`](/interfaces/formats/JSONCompactStrings) variants to encode all values as strings.
 
 ## Compact way to export JSON data and structure {#compact-way-to-export-json-data-and-structure}
 
-A more efficient way to have data, as well as it's structure, is to use [`JSONCompactEachRowWithNamesAndTypes`](/interfaces/formats.md/#jsoncompacteachrowwithnamesandtypes) format:
+A more efficient way to have data, as well as it's structure, is to use [`JSONCompactEachRowWithNamesAndTypes`](/interfaces/formats/JSONCompactEachRowWithNamesAndTypes) format:
 
 ```sql
 SELECT * FROM sometable FORMAT JSONCompactEachRowWithNamesAndTypes
````
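For a sense of what that compact output looks like, here is a rough sketch; the column types in the comments are illustrative, while the sample values echo the `sometable` examples above:

```sql
SELECT * FROM sometable
LIMIT 1
FORMAT JSONCompactEachRowWithNamesAndTypes;

-- ["path", "month", "hits"]                    <- first row: column names
-- ["String", "Date", "UInt64"]                 <- second row: column types
-- ["Ahmadabad-e_Kalij-e_Sofla", "2017-01-01", 3]   <- data rows follow as compact arrays
```
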
