Commit 873bbb0

Update test module and guide for SageMaker-Endpoint
1 parent 49c1e10 commit 873bbb0

5 files changed: +7338 -7 lines changed

README.md

Lines changed: 46 additions & 1 deletion
@@ -163,7 +163,7 @@ Download sample data by running the following command:
 sh codes/glue/churn-xgboost/script/download_data.sh
 ```
 
-A sample data will be downloaded in `codes/glue/churn-xgboost/data/input.csv`.
+Sample data will be downloaded to `codes/glue/churn-xgboost/data/input.csv`, and the double quotes in this file will be removed to produce plain `csv`.
 
 ### **Trigger the StateMachine in Step Functions**
 

@@ -231,6 +231,51 @@ AWS Glue ETL Job Result in AWS S3 Bucket
 Amazon SageMaker Training Job Result in AWS S3 Bucket
 ![sagemaker-training-output](docs/asset/sagemaker-training-output.png)
 
+### **How to invoke**
+
+Finally, let's invoke the `SageMaker Endpoint` to make sure it works well.
+
+Before invocation, open the `codes/glue/churn-xgboost/script/test_invoke.py` file, and update `profile name` and `endpoint name` according to your configuration.
+
+```python
+...
+...
+
+os.environ['AWS_PROFILE'] = 'cdk-demo'
+_endpoint_name = 'MLOpsPipelineDemo-churn-xgboost'
+
+...
+...
+```
+
+Invoke the endpoint by executing the following command:
+
+```bash
+python3 codes/glue/churn-xgboost/script/test_invoke.py
+...
+...
+0 Invocation ------------------
+>>input: 106,0,274.4,120,198.6,82,160.8,62,6.0,3,1,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,1,0
+>>label: 0
+>>prediction: 0.37959378957748413
+1 Invocation ------------------
+>>input: 28,0,187.8,94,248.6,86,208.8,124,10.6,5,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,1,0,1,0,1,0
+>>label: 0
+>>prediction: 0.03738965839147568
+2 Invocation ------------------
+>>input: 148,0,279.3,104,201.6,87,280.8,99,7.9,2,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,1,0,1,0
+>>label: 1
+>>prediction: 0.9195730090141296
+3 Invocation ------------------
+>>input: 132,0,191.9,107,206.9,127,272.0,88,12.6,2,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,1,0
+>>label: 0
+>>prediction: 0.025062650442123413
+4 Invocation ------------------
+>>input: 92,29,155.4,110,188.5,104,254.9,118,8.0,4,3,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,1
+>>label: 0
+>>prediction: 0.028299745172262192
+```
+
 ## How to re-use or upgrade
 
 ### **How to re-trigger the StateMachine in Step Functions**

codes/glue/churn-xgboost/script/download_data.sh

Lines changed: 4 additions & 1 deletion
@@ -5,4 +5,7 @@ DATA_FILE=input.csv
 
 mkdir $DATA_PATH
 
-curl -o $DATA_PATH/$DATA_FILE https://raw.githubusercontent.com/aws/amazon-sagemaker-examples/master/step-functions-data-science-sdk/automate_model_retraining_workflow/data/customer-churn.csv
+curl -o $DATA_PATH/$DATA_FILE https://raw.githubusercontent.com/aws/amazon-sagemaker-examples/master/step-functions-data-science-sdk/automate_model_retraining_workflow/data/customer-churn.csv
+
+# sed -i 's/"//g' $DATA_PATH/$DATA_FILE
+sed -i'' -e 's/"//g' $DATA_PATH/$DATA_FILE
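
The added `sed` line strips every double-quote character from the downloaded file in place. GNU and BSD `sed` disagree on how the in-place suffix is passed to `-i`, so the same step can also be sketched in Python. This is a minimal illustration and not part of the commit; the function name `strip_double_quotes` is hypothetical, and it assumes no quoted field contains a comma (otherwise removing the quotes would change the column count).

```python
from pathlib import Path


def strip_double_quotes(path: str) -> None:
    """Remove every double-quote character from the file, in place.

    Mirrors the effect of `sed -e 's/"//g'`. Works on raw bytes so that
    line endings and the encoding of the file are left untouched.
    """
    file_path = Path(path)
    data = file_path.read_bytes()
    file_path.write_bytes(data.replace(b'"', b""))
```

Calling `strip_double_quotes("codes/glue/churn-xgboost/data/input.csv")` from the repository root would target the file that the README names as the download location.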

codes/glue/churn-xgboost/script/test_invoke.py

Lines changed: 35 additions & 0 deletions
@@ -0,0 +1,35 @@
+import os
+import csv
+import boto3
+
+
+os.environ['AWS_PROFILE'] = 'cdk-demo'  # set before the client is created so boto3 picks it up
+_endpoint_name = 'MLOpsPipelineDemo-churn-xgboost'
+
+_input_file = 'codes/glue/churn-xgboost/data/input.csv'
+_sagemaker = boto3.client('sagemaker-runtime')
+
+def test_invoke(endpoint_name: str, input_file: str, loop_count: int):
+    with open(input_file) as reader:
+        for index, line in enumerate(reader):
+            if index == loop_count:
+                break
+
+            print(f'{index} Invocation ------------------')
+            line_arr = line.rstrip('\n').split(',')
+            payload = ','.join(line_arr[1:])  # feature columns only
+            label = line_arr[0]  # first column holds the label
+            print('>>input: ', payload)
+            print('>>label: ', label)
+
+            response = _sagemaker.invoke_endpoint(
+                EndpointName=endpoint_name,
+                Body=payload,
+                ContentType='text/csv',
+                Accept='text/csv'  # MIME type requested for the response body
+            )
+            print('>>prediction: ', response['Body'].read().decode())
+
+
+if __name__ == '__main__':
+    test_invoke(_endpoint_name, _input_file, 5)
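
The script prints the raw response body next to the label and leaves the comparison to the reader. A minimal sketch of turning that body into a class and checking it against the label follows. The helper names `parse_predictions` and `to_class` are hypothetical, the 0.5 cutoff is an assumed default, and the body is assumed to hold one score per input row separated by commas or newlines.

```python
from typing import List


def parse_predictions(body: str) -> List[float]:
    """Parse a text/csv response body into a list of floats.

    Assumes one score per input row, separated by commas and/or newlines
    (an assumption about the container's output format).
    """
    tokens = body.replace("\n", ",").split(",")
    return [float(tok) for tok in tokens if tok.strip()]


def to_class(score: float, threshold: float = 0.5) -> int:
    """Map a probability to a 0/1 class; the 0.5 cutoff is an assumed default."""
    return 1 if score > threshold else 0


if __name__ == "__main__":
    # Values copied from invocation 2 in the README sample output.
    body = "0.9195730090141296"
    label = "1"
    (score,) = parse_predictions(body)
    print(to_class(score) == int(label))  # prints True
```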

config/app-config-demo.json

Lines changed: 6 additions & 5 deletions
@@ -16,16 +16,17 @@
     "GlueJobFilePath": "codes/glue/churn-xgboost/src/glue_etl.py",
     "GlueJobTimeoutInMin": 30,
 
-    "TrainContainerImage": "825641698319.dkr.ecr.us-east-2.amazonaws.com/xgboost:1",
+    "TrainContainerImage": "825641698319.dkr.ecr.us-east-2.amazonaws.com/xgboost:latest",
     "TrainParameters": {
         "max_depth": "5",
+        "eval_metric": "error",
         "eta": "0.2",
         "gamma": "4",
         "min_child_weight": "6",
-        "subsample": "0.7",
-        "objective": "multi:softprob",
-        "num_class": "2",
-        "num_round": "50"
+        "subsample": "0.8",
+        "objective": "binary:logistic",
+        "silent": "0",
+        "num_round": "100"
     },
     "TrainInputContent": "text/csv",
     "TrainInstanceType": "c5.xlarge",

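The new `eval_metric` value `error` is XGBoost's binary classification error rate: the fraction of rows whose predicted class, taken at a 0.5 threshold, differs from the label. With `binary:logistic` each prediction is a single probability, which is what that threshold is applied to. A minimal sketch of the computation, using the five label and prediction pairs from the README sample output in this commit; the function name `error_rate` is hypothetical and this is an illustration, not XGBoost's own implementation.

```python
from typing import Sequence


def error_rate(labels: Sequence[int], scores: Sequence[float], threshold: float = 0.5) -> float:
    """Fraction of rows where the thresholded score disagrees with the label."""
    if len(labels) != len(scores):
        raise ValueError("labels and scores must have the same length")
    if not labels:
        raise ValueError("at least one row is required")
    wrong = sum(
        1 for y, p in zip(labels, scores)
        if (1 if p > threshold else 0) != y
    )
    return wrong / len(labels)


if __name__ == "__main__":
    # Labels and predictions copied from the README sample output.
    labels = [0, 0, 1, 0, 0]
    scores = [
        0.37959378957748413,
        0.03738965839147568,
        0.9195730090141296,
        0.025062650442123413,
        0.028299745172262192,
    ]
    print(error_rate(labels, scores))  # prints 0.0
```
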