Commit bfa60f2

Uses Athena's INSERT INTO, upgrades to node 12
1 parent 95495f2 commit bfa60f2

4 files changed, with 7 additions and 32 deletions.

README.md

Lines changed: 1 addition & 1 deletion
@@ -19,7 +19,7 @@ The application has two main parts:
 
 ![infrastructure-overview](images/moveAccessLogs.png)
 
-- An hourly scheduled AWS Lambda function `transformPartition` that runs a [Create Table As Select](https://docs.aws.amazon.com/athena/latest/ug/ctas.html) (CTAS) query on a single partition per run, taking one hour of data into account. It writes the content of the partition to the Apache Parquet format into the `<StackName>-cf-access-logs` S3 bucket.
+- An hourly scheduled AWS Lambda function `transformPartition` that runs an [INSERT INTO](https://docs.aws.amazon.com/athena/latest/ug/insert-into.html) query on a single partition per run, taking one hour of data into account. It writes the content of the partition to the Apache Parquet format into the `<StackName>-cf-access-logs` S3 bucket.
 
 ![infrastructure-overview](images/transformPartition.png)
 
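
The per-partition query described in the README change can be sketched as a standalone snippet. This is a minimal illustration, not the committed code: the helper names `partitionFor` and `buildInsertStatement` are hypothetical, and the use of UTC with zero-padded month, day, and hour values is an assumption, since the lines that derive `year`, `month`, `day`, and `hour` are not part of this diff.

```javascript
// Hypothetical helper: partition keys for the hour that started two hours
// before `now` (milliseconds since the epoch).
// Assumption: partitions are keyed in UTC with zero-padded values.
function partitionFor(now) {
  const d = new Date(now - 120 * 60 * 1000);
  const pad = (n) => String(n).padStart(2, '0');
  return {
    year: String(d.getUTCFullYear()),
    month: pad(d.getUTCMonth() + 1), // getUTCMonth() is zero-based
    day: pad(d.getUTCDate()),
    hour: pad(d.getUTCHours()),
  };
}

// Hypothetical helper: builds the single INSERT INTO statement that copies
// one hourly partition from the source table into the target table.
function buildInsertStatement(database, sourceTable, targetTable, p) {
  return [
    `INSERT INTO ${database}.${targetTable}`,
    'SELECT *',
    `FROM ${database}.${sourceTable}`,
    `WHERE year = '${p.year}'`,
    `  AND month = '${p.month}'`,
    `  AND day = '${p.day}'`,
    `  AND hour = '${p.hour}';`,
  ].join('\n');
}
```

Passing the current time as an argument, rather than calling `Date.now()` inside the helper, keeps the partition arithmetic easy to check around day and year boundaries.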

functions/transformPartition.js

Lines changed: 3 additions & 27 deletions
@@ -7,9 +7,6 @@ const sourceTable = process.env.SOURCE_TABLE;
 const targetTable = process.env.TARGET_TABLE;
 const database = process.env.DATABASE;
 
-// s3 URL to write CTAS results to (including trailing slash)
-const athenaCtasResultsLocation = process.env.ATHENA_CTAS_RESULTS_LOCATION;
-
 // get the partition of 2hours ago
 exports.handler = async (event, context, callback) => {
   var partitionHour = new Date(Date.now() - 120 * 60 * 1000);
@@ -20,35 +17,14 @@ exports.handler = async (event, context, callback) => {
 
   console.log('Transforming Partition', { year, month, day, hour });
 
-  var intermediateTable = `ctas_${year}_${month}_${day}_${hour}`;
-
   var ctasStatement = `
-    CREATE TABLE ${database}.${intermediateTable}
-    WITH ( format='PARQUET',
-        external_location='${athenaCtasResultsLocation}year=${year}/month=${month}/day=${day}/hour=${hour}',
-        parquet_compression = 'SNAPPY')
-    AS SELECT *
+    INSERT INTO ${database}.${targetTable}
+    SELECT *
     FROM ${database}.${sourceTable}
     WHERE year = '${year}'
         AND month = '${month}'
         AND day = '${day}'
         AND hour = '${hour}';`;
 
-  var dropTableStatement = `DROP TABLE ${database}.${intermediateTable};`;
-
-  var createNewPartitionStatement = `
-    ALTER TABLE ${database}.${targetTable}
-    ADD IF NOT EXISTS
-    PARTITION (
-        year = '${year}',
-        month = '${month}',
-        day = '${day}',
-        hour = '${hour}' );`;
-
-  await util.runQuery(ctasStatement).then(
-    () => Promise.all([
-      util.runQuery(dropTableStatement),
-      util.runQuery(createNewPartitionStatement)
-    ])
-  );
+  await util.runQuery(ctasStatement);
 }

images/transformPartition.png

56.3 KB

template.yaml

Lines changed: 3 additions & 4 deletions
@@ -45,7 +45,7 @@ Resources:
     Properties:
       CodeUri: functions/
       Handler: transformPartition.handler
-      Runtime: nodejs8.10
+      Runtime: nodejs12.x
       Timeout: 900
       Policies:
         - Version: '2012-10-17'
@@ -83,7 +83,6 @@ Resources:
           TARGET_TABLE: !Ref PartitionedParquetTable
           DATABASE: !Ref CfLogsDatabase
           ATHENA_QUERY_RESULTS_LOCATION: !Sub "s3://${ResourcePrefix}-${AWS::AccountId}-cf-access-logs/athena-query-results"
-          ATHENA_CTAS_RESULTS_LOCATION: !Sub "s3://${ResourcePrefix}-${AWS::AccountId}-cf-access-logs/${ParquetKeyPrefix}"
       Events:
         HourlyEvt:
           Type: Schedule
@@ -94,7 +93,7 @@ Resources:
     Properties:
       CodeUri: functions/
       Handler: createPartitions.handler
-      Runtime: nodejs8.10
+      Runtime: nodejs12.x
       Timeout: 5
       Policies:
         - Version: '2012-10-17'
@@ -135,7 +134,7 @@ Resources:
     Properties:
       CodeUri: functions/
       Handler: moveAccessLogs.handler
-      Runtime: nodejs8.10
+      Runtime: nodejs12.x
       Timeout: 30
       Policies:
         - Version: '2012-10-17'
