-
Notifications
You must be signed in to change notification settings - Fork 54
HTTP-182 HTTP sink retries #181
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
… compatible constants, deprecate old constants, consolidate HTTP Code evaluation logic
| "failed requests: {}, throwing BatchHttpStatusCodeValidationFailedException from sink", | ||
| failedRequests | ||
| ); | ||
| getFatalExceptionCons().accept(new BatchHttpStatusCodeValidationFailedException( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For context, the README reads like an error should trigger a job failure, which is why I threw a fatal exception here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would suggest that you raise an issue for each of the 3 things you would like to put in. It will be far easier to review smaller pieces.
- the retries for sink should be very self contained piece of code.
- for the ratelimitter -I see we have apache/flink#27134 which is not yet merged. Is there a generic way we can do Sink in the Flink code base ?
- fyi - we are porting over changes to https://github.com/apache/flink-connector-http and are hoping to release this in the near future
README.md
Outdated
| You can configure HTTP status code handling for HTTP sink table and enable automatic retries with delivery guarantees. | ||
|
|
||
| #### Retries and delivery guarantee | ||
| HTTP Sink supports automatic retries when `sink.delivery-guarantee` is set to `at-least-once`. Failed requests will be automatically retried based on the configured status codes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sink.delivery-guarantee is interesting and looks like it is a bringing this connector in line with others. When do we think we will get more that one request issued? I assume we would need more processing for exactly once.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am lifting a big part of the code from an earlier PR, which was the source of using a delivery guarantee. However, for my personal use case, I do in fact need to guarantee at least once behavior 🙂. The current implementation has an indefinite number of retries. If we don't have the delivery guarantee, we'll need to have something like number of attempts imo. I'm fine with whatever though! I just really need retries!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Exactly once is pretty challenging given that we are using HTTP requests. Imo, it would require coordination with the receiving service to response with some kind of ack in order to enforce it. I don't believe exactly once can be achieved on just the client side with HTTP requests if that makes sense. However, duplicated success message emissions should be really infrequent
That's great to hear this could make it in the Apache foundation! You probably meant the config changes, the rate limiting, and the retry behavior 🤦 . I can do that |
src/main/java/com/getindata/connectors/http/BatchHttpStatusCodeValidationFailedException.java
Outdated
Show resolved
Hide resolved
src/main/java/com/getindata/connectors/http/internal/SinkHttpClientResponse.java
Outdated
Show resolved
Hide resolved
| @Slf4j | ||
| public class HttpSinkWriter<InputT> extends AsyncSinkWriter<InputT, HttpSinkRequestEntry> { | ||
|
|
||
| private static final int AIMD_RATE_LIMITING_STRATEGY_INCREASE_RATE = 10; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you think it is worth making these parameters configurable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but given #181 (review) concerns. I would like to just remove the retry strategy from the pull request all-together. A retry policy is almost always present if retries are a thing, however, I can understand @davidradl's concerns and I'm hesitant to block retries due to retry policy implementation if that makes sense. Please let me know your thoughts!
… compatible constants, deprecate old constants, consolidate HTTP Code evaluation logic
…e refactor, and switched default to At-Least-Once
Description
Attempting to re-vitalize the http sink updates. Branch ended up getting bigger than I hoped, but please provide any feedback you have!
Resolves
<issue nr here>PR Checklist