Commit ab21da2

add beam parameters and link dataflow parameters to it
1 parent a522b2c commit ab21da2

2 files changed, 43 additions and 25 deletions

docs/en/integrations/data-ingestion/etl-tools/apache-beam.md

Lines changed: 38 additions & 25 deletions
@@ -97,31 +97,44 @@ public class Main {

 ## Supported Data Types

-| ClickHouse                           | Apache Beam                  | Is Supported | Notes                                                                                                                                    |
-|--------------------------------------|------------------------------|--------------|----------------------------------------------------------------------------------------------------------------------------------------|
-| `TableSchema.TypeName.FLOAT32`       | `Schema.TypeName#FLOAT`      ||                                                                                                                                          |
-| `TableSchema.TypeName.FLOAT64`       | `Schema.TypeName#DOUBLE`     ||                                                                                                                                          |
-| `TableSchema.TypeName.INT8`          | `Schema.TypeName#BYTE`       ||                                                                                                                                          |
-| `TableSchema.TypeName.INT16`         | `Schema.TypeName#INT16`      ||                                                                                                                                          |
-| `TableSchema.TypeName.INT32`         | `Schema.TypeName#INT32`      ||                                                                                                                                          |
-| `TableSchema.TypeName.INT64`         | `Schema.TypeName#INT64`      ||                                                                                                                                          |
-| `TableSchema.TypeName.STRING`        | `Schema.TypeName#STRING`     ||                                                                                                                                          |
-| `TableSchema.TypeName.UINT8`         | `Schema.TypeName#INT16`      ||                                                                                                                                          |
-| `TableSchema.TypeName.UINT16`        | `Schema.TypeName#INT32`      ||                                                                                                                                          |
-| `TableSchema.TypeName.UINT32`        | `Schema.TypeName#INT64`      ||                                                                                                                                          |
-| `TableSchema.TypeName.UINT64`        | `Schema.TypeName#INT64`      ||                                                                                                                                          |
-| `TableSchema.TypeName.DATE`          | `Schema.TypeName#DATETIME`   ||                                                                                                                                          |
-| `TableSchema.TypeName.DATETIME`      | `Schema.TypeName#DATETIME`   ||                                                                                                                                          |
-| `TableSchema.TypeName.ARRAY`         | `Schema.TypeName#ARRAY`      ||                                                                                                                                          |
-| `TableSchema.TypeName.ENUM8`         | `Schema.TypeName#STRING`     ||                                                                                                                                          |
-| `TableSchema.TypeName.ENUM16`        | `Schema.TypeName#STRING`     ||                                                                                                                                          |
-| `TableSchema.TypeName.BOOL`          | `Schema.TypeName#BOOLEAN`    ||                                                                                                                                          |
-| `TableSchema.TypeName.TUPLE`         | `Schema.TypeName#ROW`        ||                                                                                                                                          |
-| `TableSchema.TypeName.FIXEDSTRING`   | `FixedBytes`                 || `FixedBytes` is a `LogicalType` representing a fixed-length <br/> byte array located at <br/> `org.apache.beam.sdk.schemas.logicaltypes` |
-|                                      | `Schema.TypeName#DECIMAL`    ||                                                                                                                                          |
-|                                      | `Schema.TypeName#MAP`        ||                                                                                                                                          |
-
-
+| ClickHouse                         | Apache Beam                | Is Supported | Notes                                                                                                                                    |
+|------------------------------------|----------------------------|--------------|------------------------------------------------------------------------------------------------------------------------------------------|
+| `TableSchema.TypeName.FLOAT32`     | `Schema.TypeName#FLOAT`    ||                                                                                                                                          |
+| `TableSchema.TypeName.FLOAT64`     | `Schema.TypeName#DOUBLE`   ||                                                                                                                                          |
+| `TableSchema.TypeName.INT8`        | `Schema.TypeName#BYTE`     ||                                                                                                                                          |
+| `TableSchema.TypeName.INT16`       | `Schema.TypeName#INT16`    ||                                                                                                                                          |
+| `TableSchema.TypeName.INT32`       | `Schema.TypeName#INT32`    ||                                                                                                                                          |
+| `TableSchema.TypeName.INT64`       | `Schema.TypeName#INT64`    ||                                                                                                                                          |
+| `TableSchema.TypeName.STRING`      | `Schema.TypeName#STRING`   ||                                                                                                                                          |
+| `TableSchema.TypeName.UINT8`       | `Schema.TypeName#INT16`    ||                                                                                                                                          |
+| `TableSchema.TypeName.UINT16`      | `Schema.TypeName#INT32`    ||                                                                                                                                          |
+| `TableSchema.TypeName.UINT32`      | `Schema.TypeName#INT64`    ||                                                                                                                                          |
+| `TableSchema.TypeName.UINT64`      | `Schema.TypeName#INT64`    ||                                                                                                                                          |
+| `TableSchema.TypeName.DATE`        | `Schema.TypeName#DATETIME` ||                                                                                                                                          |
+| `TableSchema.TypeName.DATETIME`    | `Schema.TypeName#DATETIME` ||                                                                                                                                          |
+| `TableSchema.TypeName.ARRAY`       | `Schema.TypeName#ARRAY`    ||                                                                                                                                          |
+| `TableSchema.TypeName.ENUM8`       | `Schema.TypeName#STRING`   ||                                                                                                                                          |
+| `TableSchema.TypeName.ENUM16`      | `Schema.TypeName#STRING`   ||                                                                                                                                          |
+| `TableSchema.TypeName.BOOL`        | `Schema.TypeName#BOOLEAN`  ||                                                                                                                                          |
+| `TableSchema.TypeName.TUPLE`       | `Schema.TypeName#ROW`      ||                                                                                                                                          |
+| `TableSchema.TypeName.FIXEDSTRING` | `FixedBytes`               || `FixedBytes` is a `LogicalType` representing a fixed-length <br/> byte array located at <br/> `org.apache.beam.sdk.schemas.logicaltypes` |
+|                                    | `Schema.TypeName#DECIMAL`  ||                                                                                                                                          |
+|                                    | `Schema.TypeName#MAP`      ||                                                                                                                                          |
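Review note on the type table: each unsigned ClickHouse type is paired with a signed Beam type. The following is a minimal sketch, assuming standard two's-complement bit widths, of why `UINT8`, `UINT16` and `UINT32` are paired with the next wider signed type, and why the `UINT64` to `INT64` pairing cannot cover the full unsigned range. The class and method names are hypothetical and are not part of the connector.

```java
import java.math.BigInteger;

public class UnsignedMappingSketch {
    /** Largest value of an unsigned integer with the given bit width: 2^bits - 1. */
    public static BigInteger unsignedMax(int bits) {
        return BigInteger.ONE.shiftLeft(bits).subtract(BigInteger.ONE);
    }

    /** Largest value of a signed two's-complement integer: 2^(bits-1) - 1. */
    public static BigInteger signedMax(int bits) {
        return BigInteger.ONE.shiftLeft(bits - 1).subtract(BigInteger.ONE);
    }

    /** True when every value of the unsigned source width fits in the signed target width. */
    public static boolean fits(int unsignedBits, int signedBits) {
        return unsignedMax(unsignedBits).compareTo(signedMax(signedBits)) <= 0;
    }

    public static void main(String[] args) {
        // {ClickHouse unsigned width, Beam signed width}, taken from the table rows.
        int[][] pairs = {{8, 16}, {16, 32}, {32, 64}, {64, 64}};
        for (int[] p : pairs) {
            System.out.printf("UInt%d (max %s) -> INT%d: fits=%b%n",
                    p[0], unsignedMax(p[0]), p[1], fits(p[0], p[1]));
        }
    }
}
```

The first three pairs fit entirely, while `UInt64` values above `Long.MAX_VALUE` have no positive `INT64` representation. How the connector treats such values is not stated in this table.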
+
+## ClickHouseIO.Write Parameters
+
+You can adjust the `ClickHouseIO.Write` configuration with the following setter functions:
+
+| Parameter Setter Function   | Argument Type               | Default Value                 | Description                                                     |
+|-----------------------------|-----------------------------|-------------------------------|-----------------------------------------------------------------|
+| `withMaxInsertBlockSize`    | `(long maxInsertBlockSize)` | `1000000`                     | Maximum size of a block of rows to insert.                      |
+| `withMaxRetries`            | `(int maxRetries)`          | `5`                           | Maximum number of retries for failed inserts.                   |
+| `withMaxCumulativeBackoff`  | `(Duration maxBackoff)`     | `Duration.standardDays(1000)` | Maximum cumulative backoff duration for retries.                |
+| `withInitialBackoff`        | `(Duration initialBackoff)` | `Duration.standardSeconds(5)` | Initial backoff duration before the first retry.                |
+| `withInsertDistributedSync` | `(Boolean sync)`            | `true`                        | If true, synchronizes insert operations for distributed tables. |
+| `withInsertQuorum`          | `(Long quorum)`             | `null`                        | The number of replicas required to confirm an insert operation. |
+| `withInsertDeduplicate`     | `(Boolean deduplicate)`     | `true`                        | If true, deduplication is enabled for insert operations.        |
+| `withTableSchema`           | `(TableSchema schema)`      | `null`                        | Schema of the target ClickHouse table.                          |
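Review note on the retry parameters: the table lists the defaults for `withInitialBackoff`, `withMaxRetries` and `withMaxCumulativeBackoff` but not how they interact. The sketch below is only an illustration of one plausible interaction. It assumes the delay grows by a fixed factor of 1.5 with no jitter, which is an assumption made here and not something the table states; the connector's actual schedule may differ. `BackoffSketch` and `delaysMillis` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

public class BackoffSketch {
    /**
     * Returns the wait, in milliseconds, before each retry.
     * ASSUMPTION: each delay is 1.5x the previous one, with no randomization.
     * Retries stop at maxRetries or once the next delay would exceed the cumulative budget.
     */
    public static List<Long> delaysMillis(long initialMillis, int maxRetries, long maxCumulativeMillis) {
        List<Long> delays = new ArrayList<>();
        long delay = initialMillis;
        long total = 0;
        for (int attempt = 0; attempt < maxRetries; attempt++) {
            if (total + delay > maxCumulativeMillis) {
                break; // cumulative backoff budget exhausted
            }
            delays.add(delay);
            total += delay;
            delay = delay * 3 / 2; // assumed growth factor of 1.5 (integer arithmetic)
        }
        return delays;
    }

    public static void main(String[] args) {
        long initial = 5_000L;                         // Duration.standardSeconds(5)
        long cumulative = 1000L * 24 * 60 * 60 * 1000; // Duration.standardDays(1000)
        System.out.println(delaysMillis(initial, 5, cumulative));
    }
}
```

Under these assumptions, the default 1000-day cumulative limit is never the binding constraint; the retry count is. Lowering `withMaxCumulativeBackoff` is what would cut a retry sequence short.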

 ## Limitations

docs/en/integrations/data-ingestion/google-dataflow/templates/bigquery-to-clickhouse.md

Lines changed: 5 additions & 0 deletions
@@ -44,6 +44,11 @@ The template can either read the entire table or read specific records using a p
 | `queryTempDataset`      | Set an existing dataset to create the temporary table to store the results of the query. For example, `temp_dataset`.                                                                                                                                                                                                                                                                                             |          |                                                                                                                                                                                                                                                                  |
 | `KMSEncryptionKey`      | If reading from BigQuery using the query source, use this Cloud KMS key to encrypt any temporary tables created. For example, `projects/your-project/locations/global/keyRings/your-keyring/cryptoKeys/your-key`.                                                                                                                                                                                                  |          |                                                                                                                                                                                                                                                                  |

+
+:::note
+Default values for all `ClickHouseIO` parameters can be found in the [`ClickHouseIO` Apache Beam Connector](/docs/en/integrations/apache-beam#clickhouseiowrite-parameters) documentation.
+:::
+
 ## Source and Target Tables Schema

 To load the BigQuery dataset into ClickHouse effectively, a column inference process is conducted with the
