
Commit 12c7c46

Paultagoras committed
Add ignorePartitionsWhenBatching property information

1 parent bb9a7c9 commit 12c7c46

1 file changed, +27 -26 lines changed

docs/integrations/data-ingestion/kafka/kafka-clickhouse-connect-sink.md

@@ -89,33 +89,34 @@ To connect the ClickHouse Sink to the ClickHouse server, you need to provide:

The full table of configuration options:

| Property Name | Description | Default Value |
|---------------|-------------|---------------|
| `hostname` (Required) | The hostname or IP address of the server | N/A |
| `port` | The ClickHouse port - default is 8443 (for HTTPS in the cloud), but for HTTP (the default for self-hosted) it should be 8123 | `8443` |
| `ssl` | Enable SSL connection to ClickHouse | `true` |
| `jdbcConnectionProperties` | Connection properties when connecting to ClickHouse. Must start with `?` and be joined by `&` between `param=value` pairs | `""` |
| `username` | ClickHouse database username | `default` |
| `password` (Required) | ClickHouse database password | N/A |
| `database` | ClickHouse database name | `default` |
| `connector.class` (Required) | Connector class (set explicitly and keep as the default value) | `"com.clickhouse.kafka.connect.ClickHouseSinkConnector"` |
| `tasks.max` | The number of connector tasks | `"1"` |
| `errors.retry.timeout` | ClickHouse JDBC retry timeout | `"60"` |
| `exactlyOnce` | Enables exactly-once delivery | `"false"` |
| `topics` (Required) | The Kafka topics to poll - topic names must match table names | `""` |
| `key.converter` (Required* - see description) | Set according to the types of your keys. Required here if you are passing keys (and not defined in worker config). | `"org.apache.kafka.connect.storage.StringConverter"` |
| `value.converter` (Required* - see description) | Set based on the type of data on your topic. Supported: JSON, String, Avro, or Protobuf formats. Required here if not defined in worker config. | `"org.apache.kafka.connect.json.JsonConverter"` |
| `value.converter.schemas.enable` | Connector value converter schema support | `"false"` |
| `errors.tolerance` | Connector error tolerance. Supported: none, all | `"none"` |
| `errors.deadletterqueue.topic.name` | If set (with errors.tolerance=all), a DLQ will be used for failed batches (see [Troubleshooting](#troubleshooting)) | `""` |
| `errors.deadletterqueue.context.headers.enable` | Adds additional headers for the DLQ | `""` |
| `clickhouseSettings` | Comma-separated list of ClickHouse settings (e.g. "insert_quorum=2, etc...") | `""` |
| `topic2TableMap` | Comma-separated list that maps topic names to table names (e.g. "topic1=table1, topic2=table2, etc...") | `""` |
| `tableRefreshInterval` | Time (in seconds) to refresh the table definition cache | `0` |
| `keeperOnCluster` | Allows configuration of the ON CLUSTER parameter for self-hosted instances (e.g. `ON CLUSTER clusterNameInConfigFileDefinition`) for the exactly-once connect_state table (see [Distributed DDL Queries](/sql-reference/distributed-ddl)) | `""` |
| `bypassRowBinary` | Allows disabling use of RowBinary and RowBinaryWithDefaults for schema-based data (Avro, Protobuf, etc.) - should only be used when data will have missing columns and Nullable/Default are unacceptable | `"false"` |
| `dateTimeFormats` | Date time formats for parsing DateTime64 schema fields, separated by `;` (e.g. `someDateField=yyyy-MM-dd HH:mm:ss.SSSSSSSSS;someOtherDateField=yyyy-MM-dd HH:mm:ss`) | `""` |
| `tolerateStateMismatch` | Allows the connector to drop records "earlier" than the current offset stored AFTER_PROCESSING (e.g. if offset 5 is sent, and offset 250 was the last recorded offset) | `"false"` |
| `ignorePartitionsWhenBatching` | Ignores the partition when collecting messages for insert (though only if `exactlyOnce` is `false`). Performance note: the more connector tasks, the fewer Kafka partitions assigned per task - this can mean diminishing returns. | `"false"` |
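
Taken together, a minimal sink configuration exercising several of these options, including the new `ignorePartitionsWhenBatching` flag, might look like the sketch below. This is an illustration only, not an example from this commit; the connector name, hostname, topic, and password are placeholders:

```json
{
  "name": "clickhouse-connect",
  "config": {
    "connector.class": "com.clickhouse.kafka.connect.ClickHouseSinkConnector",
    "tasks.max": "1",
    "topics": "events",
    "hostname": "my-instance.clickhouse.cloud",
    "port": "8443",
    "ssl": "true",
    "username": "default",
    "password": "<password>",
    "database": "default",
    "exactlyOnce": "false",
    "ignorePartitionsWhenBatching": "true",
    "value.converter": "org.apache.kafka.connect.json.JsonConverter",
    "value.converter.schemas.enable": "false"
  }
}
```

Note that, per the table above, `ignorePartitionsWhenBatching` only takes effect here because `exactlyOnce` is left at `"false"`; with exactly-once enabled the setting has no effect.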

### Target Tables {#target-tables}
