From 5cfa68e7a93d9da1c7a8b45a5f81c65a9b3b26a5 Mon Sep 17 00:00:00 2001
From: Anton Rubin
Date: Mon, 10 Nov 2025 17:26:32 +0000
Subject: [PATCH 1/4] adding the new config for aws lambda and defaults

Signed-off-by: Anton Rubin
---
 .../configuration/processors/aws-lambda.md | 35 ++++++++++---------
 1 file changed, 19 insertions(+), 16 deletions(-)

diff --git a/_data-prepper/pipelines/configuration/processors/aws-lambda.md b/_data-prepper/pipelines/configuration/processors/aws-lambda.md
index 65a2f0a1855..63a7b4ed224 100644
--- a/_data-prepper/pipelines/configuration/processors/aws-lambda.md
+++ b/_data-prepper/pipelines/configuration/processors/aws-lambda.md
@@ -20,27 +20,29 @@ You can configure the processor using the following configuration options.
 Field | Type | Required | Description
 -------------------- | ------- | -------- | ----------------------------------------------------------------------------
-`function_name` | String | Required | The name of the AWS Lambda function to invoke.
+`function_name` | String | Required | The name of the AWS Lambda function to invoke. Default is `none`.
 `invocation_type` | String | Required | Specifies the invocation type, either `request-response` or `event`. Default is `request-response`.
-`aws.region` | String | Required | The AWS Region in which the Lambda function is located.
-`aws.sts_role_arn` | String | Optional | The Amazon Resource Name (ARN) of the role to assume before invoking the Lambda function.
+`aws.region` | String | Required | The AWS Region in which the Lambda function is located. Default is `none`.
+`aws.sts_role_arn` | String | Optional | The Amazon Resource Name (ARN) of the role to assume before invoking the Lambda function. Default is `none`.
 `max_retries` | Integer | Optional | The maximum number of retries for failed invocations. Default is `3`.
-`batch` | Object | Optional | The batch settings for the Lambda invocations. Default is `key_name = "events"`. Default threshold is `event_count=100`, `maximum_size="5mb"`, and `event_collect_timeout = 10s`.
-`lambda_when` | String | Optional | A conditional expression that determines when to invoke the Lambda processor.
+`batch` | Object | Optional | The batch settings for the Lambda invocations. Default is `key_name = "events"`, threshold `event_count=100`, `maximum_size="5mb"`, and `event_collect_timeout = 10s`.
+`lambda_when` | String | Optional | A conditional expression that determines when to invoke the Lambda processor. Default is `none`.
 `response_codec` | Object | Optional | A codec configuration for parsing Lambda responses. Default is `json`.
-`tags_on_match_failure` | List | Optional | A list of tags to add to events when Lambda matching fails or encounters an unexpected error.
-`sdk_timeout` | Duration| Optional | Configures the SDK's client connection timeout period. Default is `60s`.
+`tags_on_match_failure` | List | Optional | A list of tags to add to events when Lambda matching fails or encounters an unexpected error. Default is `[]`.
 `response_events_match` | Boolean | Optional | Specifies how Data Prepper interprets and processes Lambda function responses. Default is `false`.
 `client` | Object | Optional | The client configuration.
-`api_call_timeout` | Duration | Optional | The amount of time that the SDK maintains the API call before timing out, in [ISO 8601 format](https://en.wikipedia.org/wiki/ISO_8601#Durations).
-`base_delay` | Duration | Optional | The base delay for exponential backoff, in [ISO 8601 format](https://en.wikipedia.org/wiki/ISO_8601#Durations).
-`connection_timeout` | Duration | Optional | The amount of time that the SDK maintains the connection to the client before timing out, in [ISO 8601 format](https://en.wikipedia.org/wiki/ISO_8601#Durations).
-`max_backoff` | Duration | Optional | The maximum backoff time for exponential backoff, in [ISO 8601 format](https://en.wikipedia.org/wiki/ISO_8601#Durations).
-`max_concurrency` | Integer | Optional | The maximum concurrency defined on the client side.
-`max_retries` | Integer | Optional | The maximum number of retries before failing.
+`api_call_timeout` | Duration | Optional | The amount of time that the SDK maintains the API call before timing out, in [ISO 8601 format](https://en.wikipedia.org/wiki/ISO_8601#Durations). Default is `60s`.
+`base_delay` | Duration | Optional | The base delay for exponential backoff, in [ISO 8601 format](https://en.wikipedia.org/wiki/ISO_8601#Durations). Default is `100ms`.
+`connection_timeout` | Duration | Optional | The amount of time that the SDK maintains the connection to the client before timing out, in [ISO 8601 format](https://en.wikipedia.org/wiki/ISO_8601#Durations). Default is `60s`.
+`max_backoff` | Duration | Optional | The maximum backoff time for exponential backoff, in [ISO 8601 format](https://en.wikipedia.org/wiki/ISO_8601#Durations). Default is `20s`.
+`max_concurrency` | Integer | Optional | The maximum concurrency defined on the client side. Default is `200`.
+`max_retries` | Integer | Optional | The maximum number of retries before failing. Default is `3`.
+`circuit_breaker_retries` | Integer | Optional | The number of retry attempts to perform when the circuit breaker is open before resuming normal processing. Default is `0`.
+`circuit_breaker_wait_interval` | Integer | Optional | The wait interval, in milliseconds, between circuit breaker retries. Default is `1000`.
+## Example configuration
-#### Example configuration
+The following is example configuration:
 ```
 processors:
@@ -61,8 +63,9 @@ processors:
           event_count: 100
           maximum_size: "5mb"
           event_collect_timeout: PT10S
+      circuit_breaker_retries: 30
+      circuit_breaker_wait_interval: 1000
       lambda_when: "event['status'] == 'process'"
-
 ```
 {% include copy.html %}
@@ -101,4 +104,4 @@ Integration tests for this plugin are executed separately from the main Data Pre
 ```
 ./gradlew :data-prepper-plugins:aws-lambda:integrationTest -Dtests.processor.lambda.region="us-east-1" -Dtests.processor.lambda.functionName="lambda_test_function" -Dtests.processor.lambda.sts_role_arn="arn:aws:iam::123456789012:role/dataprepper-role
 ```
-{% include copy.html %}
+{% include copy.html %}
\ No newline at end of file

From db8e3acc2e566080a74c3c8c0e8392e2ea5f2b84 Mon Sep 17 00:00:00 2001
From: Anton Rubin
Date: Mon, 10 Nov 2025 17:39:00 +0000
Subject: [PATCH 2/4] removing invocation_type #9872

Signed-off-by: Anton Rubin
---
 .../pipelines/configuration/processors/aws-lambda.md | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/_data-prepper/pipelines/configuration/processors/aws-lambda.md b/_data-prepper/pipelines/configuration/processors/aws-lambda.md
index 63a7b4ed224..286300bb6b3 100644
--- a/_data-prepper/pipelines/configuration/processors/aws-lambda.md
+++ b/_data-prepper/pipelines/configuration/processors/aws-lambda.md
@@ -20,8 +20,7 @@ You can configure the processor using the following configuration options.
 Field | Type | Required | Description
 -------------------- | ------- | -------- | ----------------------------------------------------------------------------
-`function_name` | String | Required | The name of the AWS Lambda function to invoke. Default is `none`.
-`invocation_type` | String | Required | Specifies the invocation type, either `request-response` or `event`. Default is `request-response`.
+`function_name` | String | Required | The name of the AWS Lambda function to invoke. Default is `none`.
 `aws.region` | String | Required | The AWS Region in which the Lambda function is located. Default is `none`.
 `aws.sts_role_arn` | String | Optional | The Amazon Resource Name (ARN) of the role to assume before invoking the Lambda function. Default is `none`.
 `max_retries` | Integer | Optional | The maximum number of retries for failed invocations. Default is `3`.
@@ -48,7 +47,6 @@ The following is example configuration:
 processors:
   - aws_lambda:
       function_name: "my-lambda-function"
-      invocation_type: "request-response"
       response_events_match: false
       client:
         connection_timeout: PT5M

From 7fd38532aad4a10c7fd27d66260f064ddf6a26fe Mon Sep 17 00:00:00 2001
From: Anton Rubin
Date: Wed, 12 Nov 2025 13:24:41 +0000
Subject: [PATCH 3/4] addressing PR comments

Signed-off-by: Anton Rubin
---
 .../configuration/processors/aws-lambda.md | 20 +++++++++----------
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/_data-prepper/pipelines/configuration/processors/aws-lambda.md b/_data-prepper/pipelines/configuration/processors/aws-lambda.md
index 286300bb6b3..393e52a2df5 100644
--- a/_data-prepper/pipelines/configuration/processors/aws-lambda.md
+++ b/_data-prepper/pipelines/configuration/processors/aws-lambda.md
@@ -20,17 +20,17 @@ You can configure the processor using the following configuration options.
 Field | Type | Required | Description
 -------------------- | ------- | -------- | ----------------------------------------------------------------------------
-`function_name` | String | Required | The name of the AWS Lambda function to invoke. Default is `none`.
-`aws.region` | String | Required | The AWS Region in which the Lambda function is located. Default is `none`.
-`aws.sts_role_arn` | String | Optional | The Amazon Resource Name (ARN) of the role to assume before invoking the Lambda function. Default is `none`.
+`function_name` | String | Required | The name of the AWS Lambda function to invoke.
+`aws.region` | String | Required | The AWS Region in which the Lambda function is located. If no configuration is provided, value in `data-prepper-config.yaml` file is used. If still not found, default AWS credentials on the host are used.
+`aws.sts_role_arn` | String | Optional | The Amazon Resource Name (ARN) of the role to assume before invoking the Lambda function. If no configuration is provided, value in `data-prepper-config.yaml` file is used. If still not found, default AWS credentials on the host are used.
 `max_retries` | Integer | Optional | The maximum number of retries for failed invocations. Default is `3`.
 `batch` | Object | Optional | The batch settings for the Lambda invocations. Default is `key_name = "events"`, threshold `event_count=100`, `maximum_size="5mb"`, and `event_collect_timeout = 10s`.
-`lambda_when` | String | Optional | A conditional expression that determines when to invoke the Lambda processor. Default is `none`.
+`lambda_when` | String | Optional | A conditional expression that determines when to invoke the Lambda processor. By default all events are processed.
 `response_codec` | Object | Optional | A codec configuration for parsing Lambda responses. Default is `json`.
 `tags_on_match_failure` | List | Optional | A list of tags to add to events when Lambda matching fails or encounters an unexpected error. Default is `[]`.
 `response_events_match` | Boolean | Optional | Specifies how Data Prepper interprets and processes Lambda function responses. Default is `false`.
 `client` | Object | Optional | The client configuration.
-`api_call_timeout` | Duration | Optional | The amount of time that the SDK maintains the API call before timing out, in [ISO 8601 format](https://en.wikipedia.org/wiki/ISO_8601#Durations). Default is `60s`.
+`api_call_timeout` | Duration | Optional | The total time that the SDK will attempt connection, including all retries. A Data Prepper duration can be expressed either as full [ISO 8601 format](https://en.wikipedia.org/wiki/ISO_8601#Durations) or as a simplified form in seconds or milliseconds, for example `60s` or `60000ms`. Default is `60s`.
 `base_delay` | Duration | Optional | The base delay for exponential backoff, in [ISO 8601 format](https://en.wikipedia.org/wiki/ISO_8601#Durations). Default is `100ms`.
 `connection_timeout` | Duration | Optional | The amount of time that the SDK maintains the connection to the client before timing out, in [ISO 8601 format](https://en.wikipedia.org/wiki/ISO_8601#Durations). Default is `60s`.
 `max_backoff` | Duration | Optional | The maximum backoff time for exponential backoff, in [ISO 8601 format](https://en.wikipedia.org/wiki/ISO_8601#Durations). Default is `20s`.
@@ -46,20 +46,20 @@ The following is example configuration:
 ```
 processors:
   - aws_lambda:
-      function_name: "my-lambda-function"
+      function_name: my-lambda-function
       response_events_match: false
       client:
         connection_timeout: PT5M
         api_call_timeout: PT5M
       aws:
-        region: "us-east-1"
-        sts_role_arn: "arn:aws:iam::123456789012:role/my-lambda-role"
+        region: us-east-1
+        sts_role_arn: arn:aws:iam::123456789012:role/my-lambda-role
       max_retries: 3
       batch:
-        key_name: "events"
+        key_name: events
         threshold:
           event_count: 100
-          maximum_size: "5mb"
+          maximum_size: 5mb
           event_collect_timeout: PT10S
       circuit_breaker_retries: 30
       circuit_breaker_wait_interval: 1000

From f13a26d315095681603abc9455cb59139f0c2506 Mon Sep 17 00:00:00 2001
From: AntonEliatra
Date: Tue, 18 Nov 2025 11:47:17 +0000
Subject: [PATCH 4/4] Update aws-lambda.md

Signed-off-by: AntonEliatra
---
 .../pipelines/configuration/processors/aws-lambda.md | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/_data-prepper/pipelines/configuration/processors/aws-lambda.md b/_data-prepper/pipelines/configuration/processors/aws-lambda.md
index c4e12ed753b..1d0ea617fdc 100644
--- a/_data-prepper/pipelines/configuration/processors/aws-lambda.md
+++ b/_data-prepper/pipelines/configuration/processors/aws-lambda.md
@@ -20,8 +20,7 @@ You can configure the processor using the following configuration options.
 Field | Type | Required | Description
 -------------------- | ------- | -------- | ----------------------------------------------------------------------------
-`function_name` | String | Required | The name of the AWS Lambda function to invoke. Must be 3--500 characters.
-`invocation_type` | String | Optional | Specifies the invocation type, either `request-response` or `event`. Default is `request-response`.
+`function_name` | String | Required | The name of the AWS Lambda function to invoke. Must be 3--500 characters.
 `aws.region` | String | Required | The AWS Region in which the Lambda function is located.
 `aws.sts_role_arn` | String | Optional | The Amazon Resource Name (ARN) of the role to assume before invoking the Lambda function. Must be 20--2048 characters.
 `aws.sts_external_id` | String | Optional | An external ID for STS role assumption. Must be 2--1224 characters.
@@ -134,4 +133,4 @@ Integration tests for this plugin are executed separately from the main Data Pre
 ```bash
 ./gradlew :data-prepper-plugins:aws-lambda:integrationTest -Dtests.processor.lambda.region="us-east-1" -Dtests.processor.lambda.functionName="lambda_test_function" -Dtests.processor.lambda.sts_role_arn="arn:aws:iam::123456789012:role/dataprepper-role
 ```
-{% include copy.html %}
\ No newline at end of file
+{% include copy.html %}
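
Note on the duration wording introduced in PATCH 3/4: the `api_call_timeout` row says a duration may be written either in ISO 8601 form or in a simplified seconds/milliseconds form such as `60s` or `60000ms`. The Python sketch below only illustrates how those notations relate to each other. `duration_to_millis` is a hypothetical helper written for this note, not Data Prepper code, and it assumes that only whole-number `s`/`ms` suffixes and the hour/minute/second subset of ISO 8601 (`PT...`) need handling.

```python
import re

# Hypothetical helper for illustration; not part of Data Prepper.
# Simplified form: an integer followed by "s" or "ms", e.g. "60s", "100ms".
_SIMPLE = re.compile(r"^(\d+)(ms|s)$")
# ISO 8601 time-only subset: PT[nH][nM][nS], e.g. "PT5M", "PT10S".
_ISO = re.compile(r"^PT(?:(\d+)H)?(?:(\d+)M)?(?:(\d+(?:\.\d+)?)S)?$")


def duration_to_millis(text: str) -> int:
    """Convert a duration string in either notation to milliseconds."""
    text = text.strip()

    simple = _SIMPLE.match(text)
    if simple:
        value, unit = int(simple.group(1)), simple.group(2)
        return value if unit == "ms" else value * 1000

    iso = _ISO.match(text)
    # A bare "PT" matches the pattern but carries no components, so reject it.
    if iso and any(iso.groups()):
        hours = int(iso.group(1) or 0)
        minutes = int(iso.group(2) or 0)
        seconds = float(iso.group(3) or 0)
        return round((hours * 3600 + minutes * 60 + seconds) * 1000)

    raise ValueError(f"unsupported duration: {text!r}")


if __name__ == "__main__":
    # Values taken from the table defaults and the example configuration.
    for sample in ("60s", "60000ms", "100ms", "20s", "PT5M", "PT10S"):
        print(sample, "->", duration_to_millis(sample), "ms")
```

Under these assumptions, the documented default `60s` and the alternative spelling `60000ms` resolve to the same value, and the `PT5M` used in the example configuration is five times that default.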