Kafka 4.x Queue Semantics support #189

Shekharrajak · 2025-09-01T09:16:16Z

Ref https://issues.apache.org/jira/browse/FLINK-38287

This implementation adds Kafka 4.x share group semantics to Flink's Kafka connector while maintaining full backward compatibility with existing code. The changes follow KIP-932 and the FLIP-27 source architecture, and use implicit acknowledgement mode.

This directly addresses use cases where:

Multiple consumers need to process items efficiently in parallel from one or more topics.
Messages need explicit acknowledgment or release (to avoid reprocessing or to allow retries).
Scaling Flink ML/LLM workloads is critical. Shifting Kafka coordination and assignment logic to the broker side would simplify today’s complex Flink source management, making consumption more efficient, scalable, and far less error-prone.

Operational Benefits

Higher Throughput: share groups suit queue-like workloads and maximum-throughput scenarios, with membership handled through ShareGroupHeartbeat. Share groups distribute messages at the record level, not the partition level, so multiple readers can consume from the same topic with Kafka coordinating message distribution.
Better Availability and Flexible Scaling: consumer assignment logic is simpler because it runs on the server side, and rebalancing frequency is minimised.
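
To make the record-level versus partition-level point concrete, here is a toy model in plain Java. It is not Kafka or connector code: the class and method names are hypothetical, and round-robin is only an assumed stand-in for however the broker actually hands records to share group members.

```java
import java.util.Arrays;

/** Toy model only: compares partition-level and record-level distribution. */
class DistributionSketch {

    /** Classic consumer group: a partition is owned by at most one reader. */
    static int busyReadersPartitionLevel(int partitions, int readers) {
        return Math.min(partitions, readers);
    }

    /**
     * Share group, modelled as round-robin over records (an assumption):
     * any reader may receive any record, whatever the partition count.
     */
    static int[] recordsPerReaderRecordLevel(int records, int readers) {
        int[] counts = new int[readers];
        for (int r = 0; r < records; r++) {
            counts[r % readers]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        int partitions = 2;
        int readers = 4;
        int records = 100;
        System.out.println("partition-level busy readers: "
                + busyReadersPartitionLevel(partitions, readers));
        System.out.println("record-level records per reader: "
                + Arrays.toString(recordsPerReaderRecordLevel(records, readers)));
    }
}
```

With 2 partitions and 4 readers, the partition-level model keeps only 2 readers busy, while the record-level model gives every reader a share of the records.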

Note:

Semantic Guarantee
AT-MOST-ONCE: Records are acknowledged immediately after polling. A job failure after acknowledgment but before a checkpoint means permanent data loss.

Ref https://issues.apache.org/jira/browse/KAFKA-19883, implemented to prevent data loss in commit ef83f90.

Good for: Logs, monitoring, non-critical analytics
Not for: Financial transactions, critical data, audit trails

boring-cyborg · 2025-09-01T09:16:19Z

Thanks for opening this pull request! Please check out our contributing guidelines. (https://flink.apache.org/contributing/how-to-contribute.html)

Shekharrajak · 2025-09-01T09:18:37Z

flink-connector-kafka/pom.xml

 			<groupId>org.apache.flink</groupId>
 			<artifactId>flink-streaming-java</artifactId>
-			<scope>provided</scope>
+			<scope>compile</scope>


These changes can be reverted.

Shekharrajak · 2025-09-01T09:20:12Z

.../src/test/java/org/apache/flink/connector/kafka/source/KafkaShareGroupSourceBuilderTest.java

+ * <p>This test validates builder functionality, error handling, and property management
+ * for Kafka share group source construction.
+ */
+@DisplayName("KafkaShareGroupSourceBuilder Tests")


The test cases need improvement; I will review and update them accordingly.

Shekharrajak · 2025-09-01T09:21:00Z

pom.xml

 		<confluent.version>7.9.2</confluent.version>
 		<flink.version>2.0.0</flink.version>
-		<kafka.version>3.9.1</kafka.version>
+		<kafka.version>4.1.0</kafka.version>


This version is not yet released; it is expected to become available for download. In the meantime, testing is done by adding it to the classpath.

Shekharrajak · 2025-09-01T09:26:03Z

...fka/src/main/java/org/apache/flink/connector/kafka/source/reader/ShareGroupBatchManager.java

+ * Manages batches of records from Kafka share consumer for checkpoint persistence.
+ * Controls when new batches can be fetched to work within share consumer's auto-commit constraints.
+ */
+public class ShareGroupBatchManager<K, V> 


ListCheckpointed stores the records that have been polled but not yet processed in Flink's persistent checkpoint state. This ensures that after a failure or crash we still process the records that we have already read and acknowledged.
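
The idea above can be sketched in plain Java without any Flink or Kafka dependency. This is a minimal illustration, not the PR's ShareGroupBatchManager: the class name, the method names, and the single in-memory queue are all assumptions made for the example.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical stand-in for a batch manager; not the connector's API. */
class PendingBatchSketch<T> {
    // Records that were polled (and already acknowledged) but not yet emitted.
    private final Deque<T> pending = new ArrayDeque<>();

    /** Called right after a poll with the records that were returned. */
    void addPolled(List<T> records) {
        pending.addAll(records);
    }

    /** Hands the next buffered record downstream, or null if none is left. */
    T next() {
        return pending.poll();
    }

    /** A new batch may be fetched only once the buffer is drained. */
    boolean canFetch() {
        return pending.isEmpty();
    }

    /** Checkpoint: a copy of everything polled but not yet emitted. */
    List<T> snapshotState() {
        return new ArrayList<>(pending);
    }

    /** Recovery: re-queue the records stored in the last checkpoint. */
    void restoreState(List<T> state) {
        pending.clear();
        pending.addAll(state);
    }

    public static void main(String[] args) {
        PendingBatchSketch<String> manager = new PendingBatchSketch<>();
        manager.addPolled(List.of("r1", "r2", "r3"));
        manager.next(); // r1 is processed before the checkpoint
        List<String> checkpoint = manager.snapshotState();

        // Simulated crash: a fresh instance restores from the checkpoint.
        PendingBatchSketch<String> recovered = new PendingBatchSketch<>();
        recovered.restoreState(checkpoint);
        System.out.println("first record after recovery: " + recovered.next());
    }
}
```

The sketch only covers records that were buffered when the snapshot was taken; records polled after the last completed checkpoint are outside what it protects.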

Shekharrajak · 2025-09-02T05:37:54Z

The source can be validated with Flink SQL:

CREATE TABLE kafka_share_source (
      message STRING
  ) WITH (
      'connector' = 'kafka-sharegroup',
      'bootstrap.servers' = 'localhost:9092',
      'share-group-id' = 'flink-sql-test-group',
      'topic' = 'test-topic',
      'format' = 'raw',
      'source.parallelism' = '4'  -- 4 subtasks regardless of partition count
  );
 
select * from kafka_share_source;

davidradl · 2025-09-05T08:45:12Z

pom.xml

-		<flink.version>2.1.0</flink.version>
-		<kafka.version>4.0.0</kafka.version>
+		<flink.version>2.0.0</flink.version>
+		<kafka.version>4.1.0</kafka.version>


PR 190 also raises the Kafka client version to 4.1. If we do it here, we should amend the NOTICE as per PR 190. FYI @tomncooper

jnh5y · 2025-09-16T14:26:47Z

As a high-level note, since share groups do not use transactions, there will be some possibility for reprocessing messages. Is that ok for your use cases?

Generally, do you have any performance numbers to show that this consumer is faster? (Of course, since transactions are not available, I could imagine it being a little bit faster anyhow...)

Shekharrajak · 2025-11-07T18:55:50Z

Flink-SQL:

   -- Sample message: {"order_id":1001,"customer_id":"C001","quantity":2,"price":25.50}
   CREATE TABLE orders_share_group (
      order_id BIGINT,
      customer_id STRING,
      quantity INT,
      price DECIMAL(10, 2)
  ) WITH (
      'connector' = 'kafka-sharegroup',
      'topic' = 'orders',
      'bootstrap.servers' = 'localhost:9092',
      'share-group-id' = 'flink-orders-sharegroup',
      'format' = 'json'

  );

  select * from orders_share_group;


KafkaEnumerator's state contains only the TopicPartitions, not the offsets, so contrary to the design intent it does not contain the full split state. There are several issues with that approach. It implicitly assumes that splits are fully assigned to readers before the first checkpoint. Otherwise, on recovery from such a checkpoint, the enumerator invokes the offset initializer again, leading to inconsistencies (LATEST may be initialized during the first attempt for some partitions and during the second attempt for others). Through the addSplitBack callback, these scenarios can also occur later for BATCH, which leads to duplicate rows (in case of EARLIEST or SPECIFIC-OFFSETS) or data loss (in case of LATEST). Finally, it is not possible to safely use KafkaSource as part of a HybridSource because the offset initializer cannot even be recreated on recovery. All cases are solved by also retaining the offset in the enumerator state. To that end, this commit merges the async discovery phases to immediately initialize the splits from the partitions. Any subsequent checkpoint will contain the proper start offset.

boring-cyborg bot added the component=Connectors/Kafka label Sep 1, 2025

Shekharrajak commented Sep 1, 2025


Shekharrajak changed the title ~~[WIP] Kafka 4.x Queue Semantics support in Flink Connector Kafka~~ [WIP] Kafka 4.x Queue Semantics support Sep 1, 2025

davidradl reviewed Sep 5, 2025


Shekharrajak changed the title ~~[WIP] Kafka 4.x Queue Semantics support~~ Kafka 4.x Queue Semantics support Nov 7, 2025

Shekharrajak force-pushed the feature/kafka4 branch from 997b91e to 2cdbb46 Compare November 11, 2025 19:36

boring-cyborg bot added component=BuildSystem component=Documentation labels Nov 11, 2025

Shekharrajak force-pushed the feature/kafka4 branch from 7c5d908 to ef83f90 Compare November 15, 2025 08:41

Shekharrajak and others added 7 commits November 19, 2025 09:27

intial updates for kafka 4.1.0

3ea61e6

[FLINK-37583] Upgrade to Kafka 4.0.0 client.

2121597

Note that this update will make the connector incompatible with Kafka clusters running Kafka version 2.0 and older. Signed-off-by: Thomas Cooper <code@tomcooper.dev>

[FLINK-38289] Update to Flink 2.1

5484e74

major update - at most once semantics

860aabd

update with txn acknowledgement KAFKA-19883

8bff769

revert back to existing implementation: ack and commitSync

e25cc95

Shekharrajak force-pushed the feature/kafka4 branch from c381123 to e25cc95 Compare November 19, 2025 03:59
