Commit f2114f6

fix(pii-redaction): add a strands specific example, reformat context (#177)
* fix(pii-redaction): add strands specific example, reformat context
* add navigation for pii-redaction

---------

Co-authored-by: Jack Yuan <jackypc@amazon.com>
1 parent b57d9db commit f2114f6

2 files changed

+122
-22
lines changed

docs/user-guide/safety-security/pii-redaction.md

Lines changed: 121 additions & 22 deletions
@@ -18,16 +18,22 @@ Integrating PII redaction is crucial for:
1818
## How to implement PII Redaction
1919

2020
Strands SDK does not natively perform PII redaction within its core telemetry generation but recommends two effective ways to achieve PII masking:
21+
2122
### Option 1: Using Third-Party Specialized Libraries [Recommended]
2223
Leverage specialized external libraries like Langfuse, LLM Guard, Presidio, or Amazon Comprehend for high-quality PII detection and redaction:
24+
2325
#### Step-by-Step Integration Guide
26+
2427
##### Step 1: Install your chosen PII Redaction Library.
2528
Example with [LLM Guard](https://protectai.com/llm-guard):
26-
````
29+
30+
````bash
2731
pip install llm-guard
2832
````
29-
##### Step2: Import necessary modules and initialize the Vault and Anonymize scanner.
30-
````
33+
34+
##### Step 2: Import necessary modules and initialize the Vault and Anonymize scanner.
35+
36+
```python
3137
from llm_guard.vault import Vault
3238
from llm_guard.input_scanners import Anonymize
3339
from llm_guard.input_scanners.anonymize_helpers import BERT_LARGE_NER_CONF
@@ -42,24 +48,31 @@ def create_anonymize_scanner():
4248
language="en"
4349
)
4450
return scanner
45-
````
46-
##### Step3: Define a masking function using the anonymize scanner.
47-
````
51+
```
52+
##### Step 3: Define a masking function using the anonymize scanner.
53+
54+
```python
4855
def masking_function(data, **kwargs):
4956
if isinstance(data, str):
5057
scanner = create_anonymize_scanner()
58+
# Scan and redact the data
5159
sanitized_data, is_valid, risk_score = scanner.scan(data)
5260
return sanitized_data
5361
return data
54-
````
55-
##### Step4: Configure the masking function in Observability platform, eg., Langfuse.
56-
````
57-
from langfuse import Langfuse, observe
62+
```
63+
64+
##### Step 4: Configure the masking function in your observability platform, e.g., Langfuse.
65+
66+
```python
67+
from langfuse import Langfuse
5868

5969
langfuse = Langfuse(mask=masking_function)
60-
````
61-
##### Step5: Create a sample function with PII.
62-
````
70+
```
71+
72+
##### Step 5: Create a sample function with PII.
73+
74+
```python
75+
from langfuse import observe
6376
@observe()
6477
def generate_report():
6578
report = "John Doe met with Jane Smith to discuss the project."
@@ -70,14 +83,94 @@ print(result)
7083
# Output: [REDACTED_PERSON] met with [REDACTED_PERSON] to discuss the project.
7184

7285
langfuse.flush()
73-
````
86+
```
87+
88+
#### Complete example with a Strands Agent
89+
90+
```python
91+
from strands import Agent
92+
from llm_guard.vault import Vault
93+
from llm_guard.input_scanners import Anonymize
94+
from llm_guard.input_scanners.anonymize_helpers import BERT_LARGE_NER_CONF
95+
from langfuse import Langfuse, observe
96+
97+
vault = Vault()
98+
99+
def create_anonymize_scanner():
100+
"""Creates a reusable anonymize scanner."""
101+
return Anonymize(vault, recognizer_conf=BERT_LARGE_NER_CONF, language="en")
102+
103+
def masking_function(data, **kwargs):
104+
"""Langfuse masking function to recursively redact PII."""
105+
if isinstance(data, str):
106+
scanner = create_anonymize_scanner()
107+
sanitized_data, _, _ = scanner.scan(data)
108+
return sanitized_data
109+
elif isinstance(data, dict):
110+
return {k: masking_function(v) for k, v in data.items()}
111+
elif isinstance(data, list):
112+
return [masking_function(item) for item in data]
113+
return data
114+
115+
langfuse = Langfuse(mask=masking_function)
116+
117+
118+
class CustomerSupportAgent:
119+
def __init__(self):
120+
self.agent = Agent(
121+
system_prompt="You are a helpful customer service agent. Respond professionally to customer inquiries."
122+
)
123+
124+
@observe
125+
def process_sanitized_message(self, sanitized_payload):
126+
"""Processes a pre-sanitized payload and expects sanitized input."""
127+
sanitized_content = sanitized_payload.get("prompt", "empty input")
128+
129+
conversation = f"Customer: {sanitized_content}"
130+
131+
response = self.agent(conversation)
132+
return response
133+
134+
135+
def process():
136+
support_agent = CustomerSupportAgent()
137+
scanner = create_anonymize_scanner()
138+
139+
raw_payload = {
140+
"prompt": "Hi, I'm Jonny Test. My phone number is 123-456-7890 and my email is john@example.com. I need help with my order #123456789."
141+
}
142+
143+
sanitized_prompt, _, _ = scanner.scan(raw_payload["prompt"])
144+
sanitized_payload = {"prompt": sanitized_prompt}
74145

146+
response = support_agent.process_sanitized_message(sanitized_payload)
147+
148+
print(f"Response: {response}")
149+
langfuse.flush()
150+
151+
# Example input prompt (after sanitization):
152+
# "Hi, I'm [REDACTED_PERSON_1]. My phone number is [REDACTED_PHONE_NUMBER_1] and my email is [REDACTED_EMAIL_ADDRESS_1]. I need help with my order #123456789."
153+
# Example output:
154+
# "Hello! I'd be happy to help you with your order #123456789.
155+
# To better assist you, could you please let me know what specific issue you're experiencing with this order? For example:
156+
# - Are you looking for a status update?
157+
# - Need to make changes to the order?
158+
# - Having delivery issues?
159+
# - Need to process a return or exchange?
160+
#
161+
# Once I understand what you need help with, I'll be able to provide you with the most relevant assistance."
162+
163+
if __name__ == "__main__":
164+
process()
165+
```
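
Review note: the `masking_function` added above builds a new scanner for every string it visits and recurses through dicts and lists. A minimal sketch of the same recursive pattern with the scanner built once is shown here. `RegexEmailScanner` and `get_scanner` are hypothetical stand-ins introduced only for illustration; the class is not LLM Guard's `Anonymize` scanner and only mimics its three-value return shape so the snippet runs without extra dependencies.

```python
import re
from functools import lru_cache


class RegexEmailScanner:
    """Hypothetical stand-in scanner: redacts e-mail addresses with one regex."""

    _EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")

    def scan(self, text):
        # Mimic the (sanitized_text, is_valid, risk_score) tuple used in the diff.
        sanitized, count = self._EMAIL.subn("[REDACTED_EMAIL_ADDRESS]", text)
        return sanitized, count == 0, 1.0 if count else 0.0


@lru_cache(maxsize=1)
def get_scanner():
    """Build the scanner once and reuse it on every masking call."""
    return RegexEmailScanner()


def masking_function(data, **kwargs):
    """Recursively redact strings nested inside dicts and lists."""
    if isinstance(data, str):
        sanitized, _, _ = get_scanner().scan(data)
        return sanitized
    if isinstance(data, dict):
        return {key: masking_function(value) for key, value in data.items()}
    if isinstance(data, list):
        return [masking_function(item) for item in data]
    return data  # numbers, None, and other types pass through unchanged


payload = {"prompt": "Reach me at john@example.com", "tags": ["a@b.io", 7]}
masked = masking_function(payload)
print(masked)
```

Swapping the stand-in for a real scanner would only change `get_scanner`. Whether a single shared scanner instance is safe to reuse depends on the chosen library, so treat the caching as an assumption to verify.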
75166

76167
### Option 2: Using OpenTelemetry Collector Configuration [Collector-level Masking]
77168
Implement PII masking directly at the collector level, which is ideal for centralized control.
169+
78170
#### Example code:
79171
1. Edit your collector configuration (e.g., otel-collector-config.yaml):
80-
````
172+
173+
```yaml
81174
processors:
82175
attributes/pii:
83176
actions:
@@ -92,22 +185,28 @@ service:
92185
pipelines:
93186
traces:
94187
processors: [attributes/pii]
95-
````
188+
```
189+
96190
2. Deploy or restart your OTEL collector with the updated configuration.
191+
97192
#### Example:
193+
98194
##### Before:
99-
````
195+
196+
```json
100197
{
101-
"user.email": "user@example.com",
102-
"http.url": "https://example.com?token=abc123"
198+
"user.email": "user@example.com",
199+
"http.url": "https://example.com?token=abc123"
103200
}
104-
````
201+
```
202+
105203
##### After:
106-
````
204+
205+
```json
107206
{
108207
"http.url": "https://example.com?token=[REDACTED]"
109208
}
110-
````
209+
```
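
Review note: the Before/After pair can be reproduced in plain Python to check what a masking rule set is expected to do to a span's attributes. This is only an illustrative sketch, not collector code. `DELETE_KEYS`, `TOKEN_PATTERN`, and `redact_attributes` are hypothetical names, and the rules are inferred from the sample attributes rather than taken from the collector configuration.

```python
import re

# Hypothetical rules mirroring the Before/After sample, not the collector's config.
DELETE_KEYS = {"user.email"}
TOKEN_PATTERN = re.compile(r"(token=)[^&#\s]+")


def redact_attributes(attributes):
    """Return a copy of span attributes with PII keys dropped and tokens masked."""
    cleaned = {}
    for key, value in attributes.items():
        if key in DELETE_KEYS:
            continue  # drop the attribute entirely
        if isinstance(value, str):
            # Keep the "token=" prefix and replace only the secret value.
            value = TOKEN_PATTERN.sub(r"\1[REDACTED]", value)
        cleaned[key] = value
    return cleaned


before = {
    "user.email": "user@example.com",
    "http.url": "https://example.com?token=abc123",
}
after = redact_attributes(before)
print(after)
```

A check like this can serve as a small regression test for the intended masking behavior before the equivalent rules are deployed at the collector.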
111210

112211
## Additional Resources
113212
* [PII definition](https://www.dol.gov/general/ppii)

mkdocs.yml

Lines changed: 1 addition & 0 deletions
@@ -110,6 +110,7 @@ nav:
110110
- Responsible AI: user-guide/safety-security/responsible-ai.md
111111
- Guardrails: user-guide/safety-security/guardrails.md
112112
- Prompt Engineering: user-guide/safety-security/prompt-engineering.md
113+
- PII Redaction: user-guide/safety-security/pii-redaction.md
113114
- Observability & Evaluation:
114115
- Observability: user-guide/observability-evaluation/observability.md
115116
- Metrics: user-guide/observability-evaluation/metrics.md
