Commit 1042362

Enhancement: expand metrics section in kafka-clickhouse-connect-sink.md.
1 parent 7399b3f commit 1042362

1 file changed

+79
-6
lines changed

docs/integrations/data-ingestion/kafka/kafka-clickhouse-connect-sink.md

Lines changed: 79 additions & 6 deletions
@@ -309,20 +309,93 @@ For additional details check out the official [tutorial](https://docs.confluent.
309309

310310
ClickHouse Kafka Connect reports runtime metrics via [Java Management Extensions (JMX)](https://www.oracle.com/technical-resources/articles/javase/jmx.html). JMX is enabled in Kafka Connector by default.
311311

312-
ClickHouse Connect `MBeanName`:
312+
#### ClickHouse-Specific Metrics {#clickhouse-specific-metrics}
313+
314+
The connector exposes custom metrics via the following MBean name:
313315

314316
```java
315317
com.clickhouse:type=ClickHouseKafkaConnector,name=SinkTask{id}
316318
```
317319

318-
ClickHouse Kafka Connect reports the following metrics:
319-
320-
| Name | Type | Description |
321-
|----------------------|------|-----------------------------------------------------------------------------------------|
322-
| `receivedRecords` | long | The total number of records received. |
320+
| Metric Name | Type | Description |
321+
|-----------------------|------|-----------------------------------------------------------------------------------------|
322+
| `receivedRecords` | long | The total number of records received. |
323323
| `recordProcessingTime` | long | Total time in nanoseconds spent grouping and converting records to a unified structure. |
324324
| `taskProcessingTime` | long | Total time in nanoseconds spent processing and inserting data into ClickHouse. |
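
As a rough illustration of how these values might be read, the sketch below queries the platform MBean server for any MBean matching the name above and derives an average per-record cost from a cumulative nanosecond counter. It assumes the code runs inside the Connect worker JVM. The class and helper names are hypothetical, and attribute names are listed from the MBean rather than hard-coded because their exact casing is not stated here.

```java
import java.lang.management.ManagementFactory;
import java.util.Set;
import javax.management.MBeanAttributeInfo;
import javax.management.MBeanServer;
import javax.management.MalformedObjectNameException;
import javax.management.ObjectName;

// Hypothetical helper for illustration; not part of the connector.
final class ClickHouseSinkMetrics {

    // Pattern matching SinkTask0, SinkTask1, ... under the documented domain and type.
    static ObjectName sinkTaskPattern() {
        try {
            return new ObjectName("com.clickhouse:type=ClickHouseKafkaConnector,name=SinkTask*");
        } catch (MalformedObjectNameException e) {
            throw new IllegalStateException(e);
        }
    }

    // Mean nanoseconds per record from a cumulative time counter; 0 if no records yet.
    static long nanosPerRecord(long totalNanos, long receivedRecords) {
        return receivedRecords <= 0 ? 0L : totalNanos / receivedRecords;
    }

    public static void main(String[] args) throws Exception {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        // Empty outside a Connect worker; inside one, each sink task should appear.
        Set<ObjectName> names = server.queryNames(sinkTaskPattern(), null);
        for (ObjectName name : names) {
            for (MBeanAttributeInfo attr : server.getMBeanInfo(name).getAttributes()) {
                Object value = server.getAttribute(name, attr.getName());
                System.out.println(name + " " + attr.getName() + " = " + value);
            }
        }
        // Example arithmetic with made-up counter values.
        System.out.println(nanosPerRecord(5_000_000L, 1_000L) + " ns per record");
    }
}
```

Because `recordProcessingTime` and `taskProcessingTime` are running totals, dividing by `receivedRecords` gives a lifetime average. Comparing two samples taken some time apart would give a more current figure.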
325325

326+
#### Kafka Producer/Consumer Metrics {#kafka-producer-consumer-metrics}
327+
328+
The connector exposes standard Kafka producer and consumer metrics that provide insights into data flow, throughput, and performance.
329+
330+
**Topic-Level Metrics:**
331+
- `records-sent-total`: Total number of records sent to the topic
332+
- `bytes-sent-total`: Total bytes sent to the topic
333+
- `record-send-rate`: Average rate of records sent per second
334+
- `byte-rate`: Average bytes sent per second
335+
- `compression-rate`: Compression ratio achieved
336+
337+
**Partition-Level Metrics:**
338+
- `records-sent-total`: Total records sent to the partition
339+
- `bytes-sent-total`: Total bytes sent to the partition
340+
- `records-lag`: Current lag in the partition
341+
- `records-lead`: Current lead in the partition
342+
- `replica-fetch-lag`: Lag information for replicas
343+
344+
**Node-Level Connection Metrics:**
345+
- `connection-creation-total`: Total connections created to the Kafka node
346+
- `connection-close-total`: Total connections closed
347+
- `request-total`: Total requests sent to the node
348+
- `response-total`: Total responses received from the node
349+
- `request-rate`: Average request rate per second
350+
- `response-rate`: Average response rate per second
351+
352+
These metrics help monitor:
353+
- **Throughput**: Track data ingestion rates
354+
- **Lag**: Identify bottlenecks and processing delays
355+
- **Compression**: Measure data compression efficiency
356+
- **Connection Health**: Monitor network connectivity and stability
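
To make the lag use case concrete, here is a minimal sketch that summarizes per-partition `records-lag` samples once they have been collected. The partition labels and lag values are made up for illustration, and the class name is hypothetical. How the samples are gathered (JMX, Prometheus, or another exporter) is left open.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration; not part of the connector.
final class PartitionLagSummary {

    // Sum of the per-partition records-lag samples.
    static long totalLag(Map<String, Long> lagByPartition) {
        long total = 0L;
        for (long lag : lagByPartition.values()) {
            total += lag;
        }
        return total;
    }

    // Label of the partition with the largest lag, or null for an empty map.
    static String worstPartition(Map<String, Long> lagByPartition) {
        String worst = null;
        long max = -1L;
        for (Map.Entry<String, Long> entry : lagByPartition.entrySet()) {
            if (entry.getValue() > max) {
                max = entry.getValue();
                worst = entry.getKey();
            }
        }
        return worst;
    }

    public static void main(String[] args) {
        // Made-up samples keyed by "topic-partition".
        Map<String, Long> lag = new LinkedHashMap<>();
        lag.put("orders-0", 120L);
        lag.put("orders-1", 4500L);
        lag.put("orders-2", 0L);
        System.out.println("total lag: " + totalLag(lag));
        System.out.println("worst partition: " + worstPartition(lag));
    }
}
```

A single partition that dominates the total often points to a skewed key distribution rather than a slow sink overall.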
357+
358+
#### Kafka Connect Framework Metrics {#kafka-connect-framework-metrics}
359+
360+
The connector integrates with the Kafka Connect framework and exposes metrics for task lifecycle and error tracking.
361+
362+
**Task Status Metrics:**
363+
- `task-count`: Total number of tasks in the connector
364+
- `running-task-count`: Number of tasks currently running
365+
- `paused-task-count`: Number of tasks currently paused
366+
- `failed-task-count`: Number of tasks that have failed
367+
- `destroyed-task-count`: Number of destroyed tasks
368+
- `unassigned-task-count`: Number of unassigned tasks
369+
370+
Task status values include `running`, `paused`, `failed`, `destroyed`, and `unassigned`.
371+
372+
**Error Metrics:**
373+
- `deadletterqueue-produce-failures`: Number of failed DLQ writes
374+
- `deadletterqueue-produce-requests`: Total DLQ write attempts
375+
- `last-error-timestamp`: Timestamp of the last error
376+
- `records-skip-total`: Total number of records skipped due to errors
377+
- `records-retry-total`: Total number of records that were retried
378+
- `errors-total`: Total number of errors encountered
379+
380+
**Performance Metrics:**
381+
- `offset-commit-failures`: Number of failed offset commits
382+
- `offset-commit-avg-time-ms`: Average time for offset commits
383+
- `offset-commit-max-time-ms`: Maximum time for offset commits
384+
- `put-batch-avg-time-ms`: Average time to process a batch
385+
- `put-batch-max-time-ms`: Maximum time to process a batch
386+
- `source-record-poll-total`: Total records polled
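
As one way to use the error metrics, the sketch below computes the share of dead letter queue writes that failed. The MBean pattern `kafka.connect:type=task-error-metrics,*` is an assumption based on upstream Kafka Connect naming and is not stated in this document. The class and helper names are hypothetical.

```java
import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.ObjectName;

// Hypothetical helper for illustration; not part of the connector.
final class DlqStats {

    // Fraction of DLQ write attempts that failed; 0 when nothing was attempted.
    static double failureRatio(double produceFailures, double produceRequests) {
        return produceRequests <= 0 ? 0.0 : produceFailures / produceRequests;
    }

    // Reads a numeric attribute, falling back to 0 if it is missing or unreadable.
    static double readNumber(MBeanServer server, ObjectName name, String attribute) {
        try {
            Object value = server.getAttribute(name, attribute);
            return value instanceof Number ? ((Number) value).doubleValue() : 0.0;
        } catch (Exception e) {
            return 0.0;
        }
    }

    public static void main(String[] args) throws Exception {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        // Assumed MBean pattern; returns nothing outside a Connect worker JVM.
        ObjectName pattern = new ObjectName("kafka.connect:type=task-error-metrics,*");
        for (ObjectName name : server.queryNames(pattern, null)) {
            double failures = readNumber(server, name, "deadletterqueue-produce-failures");
            double requests = readNumber(server, name, "deadletterqueue-produce-requests");
            System.out.println(name + " DLQ failure ratio = " + failureRatio(failures, requests));
        }
        // Example arithmetic with made-up values.
        System.out.println(failureRatio(3, 120));
    }
}
```

A non-zero ratio means some failed records were neither written to ClickHouse nor captured in the DLQ, which usually deserves an alert.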
387+
388+
#### Monitoring Best Practices {#monitoring-best-practices}
389+
390+
1. **Monitor Consumer Lag**: Track `records-lag` per partition to identify processing bottlenecks
391+
2. **Track Error Rates**: Watch `errors-total` and `records-skip-total` to detect data quality issues
392+
3. **Observe Task Health**: Monitor task status metrics to ensure tasks are running properly
393+
4. **Measure Throughput**: Use `record-send-rate` and `byte-rate` to track ingestion performance
394+
5. **Monitor Connection Health**: Check node-level connection metrics for network issues
395+
6. **Track Compression Efficiency**: Use `compression-rate` to optimize data transfer
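
The practices above can be combined into a simple periodic check. The following is a minimal sketch, assuming the metric values have already been sampled. The thresholds, class name, and warning labels are hypothetical and would need tuning for a real deployment.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of the connector.
final class ConnectorHealthCheck {

    // Records per second derived from two samples of a cumulative counter.
    static double ratePerSecond(long previousTotal, long currentTotal, long elapsedMillis) {
        if (elapsedMillis <= 0) {
            return 0.0;
        }
        return (currentTotal - previousTotal) * 1000.0 / elapsedMillis;
    }

    // Returns a warning label for each check that fails.
    static List<String> warnings(long maxRecordsLag, long lagThreshold,
                                 long errorsSinceLastCheck, long failedTaskCount) {
        List<String> result = new ArrayList<>();
        if (maxRecordsLag > lagThreshold) {
            result.add("lag");
        }
        if (errorsSinceLastCheck > 0) {
            result.add("errors");
        }
        if (failedTaskCount > 0) {
            result.add("failed-tasks");
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up samples: 3000 records over 60 seconds, one failed task, high lag.
        System.out.println("records/s: " + ratePerSecond(1_000L, 4_000L, 60_000L));
        System.out.println("warnings: " + warnings(5_000L, 1_000L, 0L, 1L));
    }
}
```

Checking the change in `errors-total` between samples, rather than its absolute value, avoids alerting forever on an error that happened long ago.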
396+
397+
For detailed JMX metric definitions and Prometheus integration, see the [jmx-export-connector.yml](https://github.com/ClickHouse/clickhouse-kafka-connect/blob/main/jmx-export-connector.yml) configuration file.
398+
326399
### Limitations {#limitations}
327400

328401
- Deletes are not supported.
