Commit f186dee

Merge pull request #2481 from sandykumar93/main

New serverless pattern - Lambda response streaming from DynamoDB

2 parents e902537 + 443f04e commit f186dee

File tree

8 files changed: +404 −0 lines changed
Lines changed: 100 additions & 0 deletions
# AWS Lambda response streaming: streaming incremental Amazon DynamoDB query results

This pattern shows how to use Lambda response streaming to incrementally retrieve and stream DynamoDB query/scan results using the `write()` method. Instead of waiting for the entire query/scan operation to complete, the Lambda function streams data in batches by setting a limit on the number of items per query and sending each batch as soon as it is retrieved. This improves time-to-first-byte (TTFB) by streaming results to the client as they become available.

For more information on the Lambda response streaming feature, see the [launch blog post](https://aws.amazon.com/blogs/compute/introducing-aws-lambda-response-streaming/).
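In Node.js, a handler opts into response streaming by being wrapped in `awslambda.streamifyResponse` (a global provided by the Lambda runtime), and each call to `responseStream.write()` flushes data to the client. A minimal sketch of that shape, separate from the full pattern code later in this diff:

```js
// Minimal streaming handler shape; `awslambda` is provided by the
// Lambda Node.js runtime, so no import is needed.
export const handler = awslambda.streamifyResponse(
  async (event, responseStream, context) => {
    responseStream.write("first chunk"); // sent to the client immediately
    responseStream.write("second chunk"); // ...without waiting for the handler to finish
    responseStream.end(); // close the stream when done
  }
);
```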
Important: this application uses various AWS services and there are costs associated with these services after the Free Tier usage - please see the [AWS Pricing page](https://aws.amazon.com/pricing/) for details. You are responsible for any AWS costs incurred. No warranty is implied in this example.
## Requirements

- [Create an AWS account](https://portal.aws.amazon.com/gp/aws/developer/registration/index.html) if you do not already have one and log in. The IAM user that you use must have sufficient permissions to make necessary AWS service calls and manage AWS resources.
- [AWS CLI](https://docs.aws.amazon.com/cli/latest/userguide/install-cliv2.html) installed and configured
- [Git installed](https://git-scm.com/book/en/v2/Getting-Started-Installing-Git)
- [AWS Serverless Application Model](https://docs.aws.amazon.com/serverless-application-model/latest/developerguide/serverless-sam-cli-install.html) (AWS SAM) installed
## Deployment Instructions

1. Create a new directory, navigate to that directory in a terminal, and clone the GitHub repository:

    ```
    git clone https://github.com/aws-samples/serverless-patterns
    ```

1. Change directory to the pattern directory:

    ```
    cd lambda-streaming-ttfb-write-sam-with-dynamodb
    ```

1. From the command line, use AWS SAM to deploy the AWS resources for the pattern as specified in the template.yaml file:

    ```
    sam deploy -g --stack-name lambda-streaming-ttfb-write-sam-with-dynamodb
    ```

1. During the prompts:

    - Enter a stack name
    - Enter the desired AWS Region
    - Allow SAM CLI to create IAM roles with the required permissions.

1. After running `sam deploy --guided` once and saving the arguments to the configuration file `samconfig.toml`, you can use `sam deploy` without parameters in the future to deploy with these defaults. A sketch of such a configuration file follows this list.
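For reference, the guided deploy writes a configuration like the following to `samconfig.toml`. This is an illustrative sketch; the exact keys and values depend on your answers to the prompts:

```toml
# Illustrative samconfig.toml written by `sam deploy --guided` (values are examples)
version = 0.1

[default.deploy.parameters]
stack_name = "lambda-streaming-ttfb-write-sam-with-dynamodb"
resolve_s3 = true
s3_prefix = "lambda-streaming-ttfb-write-sam-with-dynamodb"
region = "us-east-1"
capabilities = "CAPABILITY_IAM"
```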
AWS SAM deploys a Lambda function with streaming support and a function URL.

![AWS SAM deploy -g](https://d2908q01vomqb2.cloudfront.net/1b6453892473a467d07372d45eb05abc2031647a/2023/03/31/AWS-SAM-deploy-g.png)

The AWS SAM output returns a Lambda function URL.

![AWS SAM resources](https://d2908q01vomqb2.cloudfront.net/1b6453892473a467d07372d45eb05abc2031647a/2023/03/31/AWS-SAM-resources.png)
## How it works

The service interaction in this pattern uses AWS Lambda's response streaming capability to stream data in batches from Amazon DynamoDB. Instead of retrieving all query/scan results at once, the Lambda function processes the query incrementally, retrieving and sending results as they become available. Here are the details of the interaction (a condensed code sketch follows the list):

1. Client Request
   A client sends a request to the Lambda function URL, asking for specific data from a DynamoDB table.
2. Lambda Function Initialization
   When the Lambda function is invoked, it initializes a connection to DynamoDB and prepares to run a query operation. The query includes a limit parameter, which restricts the number of items retrieved in each batch.
3. Querying
   The Lambda function begins querying DynamoDB using the defined limit (e.g., 100 items per batch). DynamoDB returns only a limited set of results instead of the entire result set at once.
4. Response Streaming
   As soon as a batch of results is retrieved, the function uses the streaming API's `write()` method to send the data to the client. This happens immediately, without waiting for the entire query operation to complete.
5. Pagination Handling
   If DynamoDB returns a `LastEvaluatedKey` (which indicates that more data is available), the Lambda function continues querying for the next batch of data. Each batch is streamed to the client as it becomes available.
6. Final Response
   The Lambda function repeats this process, retrieving a batch from DynamoDB and streaming it to the client, until all data is fetched. Once DynamoDB returns no `LastEvaluatedKey`, the function sends the final batch and closes the stream.
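A condensed sketch of steps 2-6, mirroring the full handler later in this diff (which additionally unmarshalls items, sets HTTP response metadata, and caps the number of batches):

```js
import { DynamoDBClient, ScanCommand } from "@aws-sdk/client-dynamodb";

const dynamodb = new DynamoDBClient();

export const handler = awslambda.streamifyResponse(async (event, responseStream) => {
  let startKey; // undefined on the first page
  do {
    // Fetch one limited batch; LastEvaluatedKey marks where the next page starts.
    const data = await dynamodb.send(new ScanCommand({
      TableName: process.env.DDB_TABLE_NAME,
      ExclusiveStartKey: startKey,
      Limit: 100,
    }));
    responseStream.write(JSON.stringify(data.Items)); // flush this batch immediately
    startKey = data.LastEvaluatedKey;
  } while (startKey);
  responseStream.end(); // no more pages: close the stream
});
```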
## Testing

1. Run the data dump function to populate the DynamoDB table.

    Use curl with your AWS credentials, as the function URL uses AWS Identity and Access Management (IAM) for authorization. Replace the URL and Region parameters for your deployment.

    ```
    curl --request GET https://<url-of-data-dump-lambda>.lambda-url.<Region>.on.aws/ --user AKIAIOSFODNN7EXAMPLE:wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY --aws-sigv4 'aws:amz:<Region>:lambda'
    ```

2. Run the streaming function to view the streaming response.

    Use curl with your AWS credentials to view the streaming response, as the function URL uses AWS Identity and Access Management (IAM) for authorization. Replace the URL and Region parameters for your deployment.

    ```
    curl --request GET https://<url-of-streaming-lambda>.lambda-url.<Region>.on.aws/ --user AKIAIOSFODNN7EXAMPLE:wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY --aws-sigv4 'aws:amz:<Region>:lambda'
    ```

    You can see the gradual display of the streamed response.
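To watch the batches arrive programmatically rather than through curl, a small Node.js (18+) client can read the response body incrementally. This sketch assumes a URL reachable without request signing (for example, a function URL with auth type NONE); the deployed pattern uses IAM auth, so real requests must be SigV4-signed as in the curl examples above:

```js
// Hypothetical client sketch: prints each streamed batch as it arrives.
// Assumes the function URL does not require SigV4 signing (illustrative only).
const url = "https://<url-of-streaming-lambda>.lambda-url.<Region>.on.aws/";

const res = await fetch(url);
const decoder = new TextDecoder();
for await (const chunk of res.body) {
  // Each chunk corresponds to data the Lambda function has already written.
  process.stdout.write(decoder.decode(chunk, { stream: true }));
}
```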
## Cleanup

1. Delete the stack. Enter `Y` to confirm deleting the stack and folder.

    ```
    sam delete
    ```

---

Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.

SPDX-License-Identifier: MIT-0
Lines changed: 54 additions & 0 deletions
{
  "title": "AWS Lambda Response streaming: Streaming incremental Amazon DynamoDB Query results.",
  "description": "This pattern uses AWS Lambda response streaming to incrementally retrieve and stream Amazon DynamoDB results in batches, reducing time-to-first-byte (TTFB) by sending data as it's retrieved.",
  "language": "nodejs",
  "level": "200",
  "framework": "SAM",
  "introBox": {
    "headline": "How it works",
    "text": [
      "The service interaction in this pattern uses AWS Lambda's response streaming capability to stream data in batches from Amazon DynamoDB. Instead of retrieving all query / scan results at once, the Lambda function processes the query incrementally, retrieving and sending results as they become available."
    ]
  },
  "gitHub": {
    "template": {
      "repoURL": "https://github.com/aws-samples/serverless-patterns/tree/main/lambda-streaming-ttfb-write-sam-with-dynamodb",
      "templateURL": "serverless-patterns/lambda-streaming-ttfb-write-sam-with-dynamodb",
      "projectFolder": "lambda-streaming-ttfb-write-sam-with-dynamodb",
      "templateFile": "lambda-streaming-ttfb-write-sam-with-dynamodb/src/template.yaml"
    }
  },
  "resources": {
    "bullets": [
      {
        "text": "Supercharging User Experience with AWS Lambda Response Streaming",
        "link": "https://aws.amazon.com/blogs/apn/supercharging-user-experience-with-aws-lambda-response-streaming/"
      },
      {
        "text": "Configure Lambda Function Response Streaming",
        "link": "https://docs.aws.amazon.com/lambda/latest/dg/configuration-response-streaming.html"
      },
      {
        "text": "DynamoDB Read and Write Operations",
        "link": "https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/read-write-operations.html#read-operation-consumption"
      }
    ]
  },
  "deploy": {
    "text": ["sam deploy"]
  },
  "testing": {
    "text": ["See the GitHub repo for detailed testing instructions."]
  },
  "cleanup": {
    "text": ["Delete the stack: <code>sam delete</code>."]
  },
  "authors": [
    {
      "name": "Sandeep Kumar P",
      "image": "https://media.licdn.com/dms/image/v2/D5603AQFs4Yt815MOaw/profile-displayphoto-shrink_800_800/profile-displayphoto-shrink_800_800/0/1695457883755?e=1732147200&v=beta&t=C-bNWZXdsiHnCh4n3S377BXlhMQVl1fl-iJKkwkwDpU",
      "bio": "Principal Solutions Architect at AntStack",
      "linkedin": "https://www.linkedin.com/in/sandykumar93/"
    }
  ]
}
Lines changed: 79 additions & 0 deletions
{
  "title": "AWS Lambda Response streaming: Amazon DynamoDB Query results.",
  "description": "This pattern uses AWS Lambda response streaming to stream Amazon DynamoDB results in batches, reducing time-to-first-byte (TTFB).",
  "language": "Node.js",
  "level": "200",
  "framework": "SAM",
  "introBox": {
    "headline": "How it works",
    "text": [
      "The service interaction in this pattern uses AWS Lambda's response streaming capability to stream data in batches from Amazon DynamoDB. Instead of retrieving all results at once, the Lambda function processes the query incrementally, retrieving and sending results as they become available."
    ]
  },
  "gitHub": {
    "template": {
      "repoURL": "https://github.com/aws-samples/serverless-patterns/tree/main/lambda-streaming-ttfb-write-sam-with-dynamodb",
      "templateURL": "serverless-patterns/lambda-streaming-ttfb-write-sam-with-dynamodb",
      "projectFolder": "lambda-streaming-ttfb-write-sam-with-dynamodb",
      "templateFile": "template.yaml"
    }
  },
  "resources": {
    "bullets": [
      {
        "text": "Supercharging User Experience with AWS Lambda Response Streaming",
        "link": "https://aws.amazon.com/blogs/apn/supercharging-user-experience-with-aws-lambda-response-streaming/"
      },
      {
        "text": "Configure Lambda Function Response Streaming",
        "link": "https://docs.aws.amazon.com/lambda/latest/dg/configuration-response-streaming.html"
      },
      {
        "text": "DynamoDB Read and Write Operations",
        "link": "https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/read-write-operations.html#read-operation-consumption"
      }
    ]
  },
  "deploy": {
    "text": [
      "sam deploy"
    ]
  },
  "testing": {
    "text": [
      "See the GitHub repo for detailed testing instructions."
    ]
  },
  "cleanup": {
    "text": [
      "Delete the stack: <code>sam delete</code>."
    ]
  },
  "authors": [
    {
      "name": "Sandeep Kumar P",
      "image": "https://media.licdn.com/dms/image/v2/D5603AQFs4Yt815MOaw/profile-displayphoto-shrink_800_800/profile-displayphoto-shrink_800_800/0/1695457883755?e=1732147200&v=beta&t=C-bNWZXdsiHnCh4n3S377BXlhMQVl1fl-iJKkwkwDpU",
      "bio": "Principal Solutions Architect at AntStack",
      "linkedin": "sandykumar93"
    }
  ],
  "patternArch": {
    "icon1": {
      "x": 30,
      "y": 50,
      "service": "lambda",
      "label": "AWS Lambda"
    },
    "icon2": {
      "x": 60,
      "y": 50,
      "service": "dynamodb",
      "label": "Amazon DynamoDB"
    },
    "line1": {
      "from": "icon1",
      "to": "icon2",
      "label": "Query"
    }
  }
}
Lines changed: 3 additions & 0 deletions
# Data Source Information

The data source is a gzipped JSON file containing a list of items. The file is gzipped from a subset of the [Spotify Dataset](https://www.kaggle.com/datasets/yamaerenay/spotify-dataset-19212020-600k-tracks?select=tracks.csv) containing 2,000 items.
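As a rough illustration of how such a file could be produced (file names here are hypothetical, not part of the pattern):

```js
// Hypothetical sketch: gzip a JSON array of items into data.json.gz.
import fs from "fs";
import zlib from "zlib";

const items = JSON.parse(fs.readFileSync("tracks-subset.json", "utf8")); // e.g. 2,000 Spotify tracks
fs.writeFileSync("data.json.gz", zlib.gzipSync(JSON.stringify(items)));
```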
Binary file not shown.
Lines changed: 56 additions & 0 deletions
import fs from "fs";
import zlib from "zlib";
import process from "process";
import { pipeline } from "stream";
import { promisify } from "util";
import { DynamoDBClient, PutItemCommand } from "@aws-sdk/client-dynamodb";
import { marshall } from "@aws-sdk/util-dynamodb";

const dynamodb = new DynamoDBClient();
const tableName = process.env.DDB_TABLE_NAME;

const pipelineAsync = promisify(pipeline);

// Function to read and unzip the .gz file directly
const readAndUnzipJson = async (gzFilePath) => {
  const gunzip = zlib.createGunzip();
  const input = fs.createReadStream(gzFilePath);
  let data = "";
  await pipelineAsync(input, gunzip, async (source) => {
    for await (const chunk of source) {
      data += chunk;
    }
  });
  return JSON.parse(data);
};

// Function to dump JSON data to DynamoDB
const dumpToDynamoDB = async (jsonData) => {
  const promises = jsonData.map(async (item) => {
    console.log(`Dumping item ${JSON.stringify(item)}`);
    const command = new PutItemCommand({
      TableName: tableName,
      Item: marshall(item),
    });
    return dynamodb.send(command);
  });
  await Promise.all(promises);
};

export const handler = async (event) => {
  try {
    const gzFilePath = "data.json.gz";
    const jsonData = await readAndUnzipJson(gzFilePath);
    await dumpToDynamoDB(jsonData);

    return {
      statusCode: 200,
      body: "Data dump successful",
    };
  } catch (error) {
    console.error("Error:", error);
    return {
      statusCode: 500,
      body: JSON.stringify({ error: "Failed to process the data" }),
    };
  }
};
Lines changed: 52 additions & 0 deletions
import { DynamoDBClient, ScanCommand } from "@aws-sdk/client-dynamodb";
import { unmarshall } from "@aws-sdk/util-dynamodb";

const dynamodb = new DynamoDBClient();
const tableName = process.env.DDB_TABLE_NAME;

export const handler = awslambda.streamifyResponse(
  async (event, responseStream, context) => {
    const httpResponseMetadata = {
      statusCode: 200,
      headers: {
        "Content-Type": "text/html",
      },
    };

    responseStream = awslambda.HttpResponseStream.from(
      responseStream,
      httpResponseMetadata
    );

    let counter = 0;
    await scanDynamoDBTable();

    async function scanDynamoDBTable(startKey = null) {
      // Scan the table with the required parameters
      const scan = new ScanCommand({
        TableName: tableName,
        ExclusiveStartKey: startKey,
        Limit: 200,
      });

      const data = await dynamodb.send(scan);

      // Convert the items from DynamoDB JSON to regular JSON
      data.Items = data.Items.map((item) => {
        return unmarshall(item);
      });

      // Send the batch to the stream; stringify because the response
      // stream accepts strings and buffers, not plain objects
      responseStream.write(JSON.stringify(data.Items));

      counter += 1;

      // If there are more items to scan (and we are under the 10-batch cap),
      // recurse with the last evaluated key to fetch the next batch
      if (data.LastEvaluatedKey && counter < 10) {
        return scanDynamoDBTable(data.LastEvaluatedKey);
      }

      // End the stream
      responseStream.end();
    }
  }
);
