Skip to content

Commit 1e9bc8d

Browse files
authored
Merge pull request #295 from AxonIQ/enhancement/288
[#288] Add Token Stealing subsection to Tracking Tokens
2 parents 331225d + a34f4d6 commit 1e9bc8d

File tree

1 file changed

+73
-4
lines changed

1 file changed

+73
-4
lines changed

axon-framework/events/event-processors/streaming.md

Lines changed: 73 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -377,13 +377,13 @@ To be able to reopen the stream at a later point, we should keep the progress so
377377
The progress is kept by updating and saving the `TrackingToken` after handling batches of events.
378378
Keeping the progress requires CRUD operation, for which the Streaming Processor uses the [`TokenStore`](#token-store).
379379

380-
For a Streaming Processor to process any events, it needs "a claim" on a `TrackingToken`.
380+
For a Streaming Processor to process any events, it needs ["a claim"](#token-claims) on a `TrackingToken`.
381381
The processor will update this claim every time it has finished handling a batch of events.
382382
This so-called "claim extension" is, just as updating and saving of tokens, delegated to the Token Store.
383383
Hence, the Streaming Processors achieves collaboration among instances/threads through token claims.
384384

385385
In the absence of a claim, a processor will actively try to retrieve one.
386-
If a token claim is not extended for a configurable time window, other processor threads are able to "steal" the claim.
386+
If a token claim is not extended for a configurable amount of time, other processor threads can ["steal"](#token-stealing) the claim.
387387
Token stealing can, for example, happen if event processing is slow or encountered some exceptions.
388388

389389
### Initial Tracking Token
@@ -419,7 +419,7 @@ In those cases, we recommend to validate if any of the above situations occurred
419419
There are a couple of things we can configure when it comes to tokens.
420420
We can separate these options in "initial token" and "token claim" configuration, as described in the following sections:
421421

422-
**Initial Token**
422+
#### Initial Token
423423

424424
The [initial token](#initial-tracking-token) for a `StreamingEventProcessor` is configurable for every processor instance.
425425
When configuring the initial token builder function, the received input parameter is the `StreamableMessageSource`.
@@ -501,7 +501,7 @@ public class AxonConfig {
501501
{% endtab %}
502502
{% endtabs %}
503503

504-
**Token Claims**
504+
#### Token Claims
505505

506506
As described [here](#tracking-tokens), a streaming processor should claim a token before it is allowed to perform any processing work.
507507
There are several scenarios where a processor may keep the claim for too long.
@@ -587,6 +587,75 @@ public class AxonConfig {
587587
{% endtab %}
588588
{% endtabs %}
589589

590+
#### Token Stealing
591+
592+
As described at the [start](#tracking-tokens), streaming processor threads can "steal" tokens from one another.
593+
A token is "stolen" when a thread loses a [token claim](#token-claims).
594+
Situations like this internally result in an `UnableToClaimTokenException,` caught by both streaming event processor implementations and translated into warn- or info-level log statements.
595+
596+
Where the framework uses token claims to ensure that a single thread is processing a sequence of events, it supports token stealing to guarantee event processing is not blocked forever.
597+
In short, the framework uses token stealing to unblock your streaming processor threads when processing takes too long.
598+
Examples may include literal slow processing, blocking exceptional scenarios, and deadlocks.
599+
600+
However, token stealing may occur as a surprise for some applications, making it an unwanted side effect.
601+
As such, it is good to be aware of why tokens get stolen (as described above), but also when this happens and what the consequences are.
602+
603+
##### When is a Token stolen?
604+
605+
In practical terms, a token is stolen whenever the _claim timeout_ is exceeded.
606+
607+
This timeout is met whenever the token's timestamp (e.g., the `timestamp` column of your `token_entry` table) exceeds the `claimTimeout` of the `TokenStore`.
608+
By default, the `claimTimeout` value equals 10 seconds.
609+
To adjust it, you must configure a `TokenStore` instance through its builder, as shown in the [Token Store](#token-store) section.
610+
611+
The token's timestamp is equally crucial in deciding when the timeout is met.
612+
The streaming processor thread holding the claim is in charge of updating the token timestamp.
613+
This timestamp is updated whenever the thread finishes a batch of events or whenever the processor extends the claim.
614+
When to extend a claim differs between the Tracking and Pooled Streaming processor.
615+
You should check out the [token claim](#token-claims) section if you want to know how to configure these values.
616+
617+
To further clarify, a streaming processor's thread needs to be able to update the token claim and, by extension, the timestamp to ensure it won't get stolen.
618+
Hence, a staling processor thread will, one way or another, eventually lose the claim.
619+
620+
Examples of when a thread may get its token stolen are:
621+
- Overall slow event handling
622+
- Too large event batch size
623+
- Blocking operations inside event handlers
624+
- Blocking exceptions inside event handlers
625+
626+
##### What are the consequences of Token stealing?
627+
628+
The consequence of token stealing is that an event may be handled twice (or more).
629+
630+
When a thread steals a token, the original thread was _already_ processing events from the token's position.
631+
To protect against doubling event handling, Axon Framework will combine committing the event handling task with updating the token.
632+
As the token claim is required to update the token, the original thread will fail the update.
633+
Following this, a rollback occurs on the [Unit of Work](/axon-framework/messaging-concepts/unit-of-work.md), resolving most issues arising from token stealing.
634+
635+
The ability to rollback event handling tasks sheds light on the consequences of token stealing.
636+
Most event processors project events into a projection stored within a database.
637+
Furthermore, if you store the projection in the same database as the token, the rollback will ensure the change is not persisted.
638+
Thus, the consequence of token stealing is limited to wasting processor cycles.
639+
This scenario is why we recommend storing tokens and projections in the same database.
640+
641+
If a rollback is out of the question for an event handling task, we strongly recommend making the task idempotent.
642+
You may have this scenario when, for example, the projection and tokens do not reside in the same database.
643+
or when the event handler dispatches an operation (e.g., through the `CommandGateway`).
644+
In making the invoked operation idempotent, you ensure that whenever the thread stealing a token handles an event twice (or more), the outcome will be identical.
645+
646+
Without idempotency, the consequences of token stealing can be manyfold:
647+
- Your projection (stored in a different database than your tokens!) may incorrectly project the state.
648+
- An event handler putting messages on a queue will put a message on the queue again.
649+
- A Saga Event Handler invoking a third-party service will invoke that service again.
650+
- An event handler sending an email will send that email again.
651+
652+
In short, any operation introducing a side effect that isn't handled in an idempotent fashion will occur again when a token is stolen.
653+
654+
Concluding, we can separate the consequence of token stealing into roughly three scenarios:
655+
1. We can rollback the operation. In this case, the only consequence is wasted processor cycles.
656+
2. The operation is idempotent. In this case, the only consequence is wasted processor cycles.
657+
3. When the task cannot be rolled back nor performed in an idempotent fashion, compensating actions may be the way out.
658+
590659
### Token Store
591660

592661
The `TokenStore` provides the CRUD operations for the `StreamingEventProcessor` to interact with `TrackingTokens`.

0 commit comments

Comments
 (0)