DAT409 - Hybrid Search with Aurora PostgreSQL for MCP Retrieval

⚠️ Educational Workshop: This repository contains demonstration code for AWS re:Invent 2025. Not intended for production deployment without proper security hardening and testing.

🚀 Overview

Duration: 60 minutes | Level: 400 (Expert)

Build production-grade hybrid search combining semantic vectors, full-text search, and fuzzy matching. Implement Model Context Protocol (MCP) for context-aware retrieval with persona-based security—enabling AI agents to query structured data beyond traditional RAG.

What You'll Build:

Hybrid search with fuzzy, semantic, and RRF methods
MCP-based agent with intelligent database querying
Context-aware filtering with Row-Level Security

📁 Repository Structure

├── notebooks/
│   ├── 01-dat409-hybrid-search-TODO.ipynb      # Hands-on lab with TODO blocks
│   └── 02-dat409-hybrid-search-SOLUTIONS.ipynb # Reference implementation
├── data/
│   └── amazon-products-sample.csv           # 21,704 product dataset
├── demo-app/
│   ├── streamlit_app.py                     # Full-stack reference application
│   ├── requirements.txt
│   └── .streamlit/config.toml
├── scripts/
│   ├── bootstrap-code-editor-unified.sh     # Environment setup
│   └── setup/test_connection.py
├── cfn/                                     # CloudFormation templates
└── requirements.txt                         # Workshop dependencies

🎯 Workshop Structure

Hands-On Lab: Hybrid Search Implementation (40 min)

Complete 3 search methods (6 TODO sections total):

| Method | Technology | Use Case |
|---|---|---|
| Fuzzy | pg_trgm + GIN | Typo tolerance ("wireles hedphones") |
| Semantic | pgvector + HNSW + Cohere | Conceptual queries ("eco-friendly products") |
| Hybrid RRF | Reciprocal Rank Fusion | Multi-signal fusion without ML overhead |
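
To make the typo-tolerance row concrete, here is a rough pure-Python approximation of trigram similarity. It is a sketch for intuition only, not the pg_trgm implementation: the padding rule (two leading spaces, one trailing space per word) follows the extension's documented behaviour as I understand it, and the helper names `trigrams` and `trigram_similarity` are made up for this example.

```python
import re


def trigrams(text):
    """Return the set of 3-character grams for text, padding each word."""
    grams = set()
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        padded = f"  {word} "  # assumed pg_trgm-style padding
        grams.update(padded[i:i + 3] for i in range(len(padded) - 2))
    return grams


def trigram_similarity(a, b):
    """Shared trigrams divided by all distinct trigrams in either string."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


typo_score = trigram_similarity("wireles hedphones", "wireless headphones")
unrelated_score = trigram_similarity("wireles hedphones", "garden hose")
```

The misspelled query still shares most of its trigrams with the correct phrase, so it scores well above pg_trgm's default similarity threshold of 0.3, while the unrelated phrase scores close to zero.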

Hands-On:

cd /notebooks
# Open 01-dat409-hybrid-search-TODO.ipynb

Key Learning:

When to use each search method
HNSW vs IVFFlat index strategies
RRF vs weighted fusion
Cohere Rerank for ML-based optimization
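
Reciprocal Rank Fusion needs only rank positions, which is why it can fuse fuzzy and semantic results without score normalization. The following is a minimal sketch outside the database, not the notebook's SQL implementation. The function name `rrf_fuse`, the sample product IDs, and the constant `k=60` (a commonly used default) are assumptions for illustration.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse several ranked ID lists with Reciprocal Rank Fusion.

    Each list contributes 1 / (k + rank) per item, with rank starting at 1.
    """
    scores = defaultdict(float)
    for ranked_ids in rankings:
        for rank, doc_id in enumerate(ranked_ids, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for deterministic output.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], str(doc_id)))


fuzzy_hits = ["p1", "p2", "p3"]      # hypothetical fuzzy-search ranking
semantic_hits = ["p2", "p4", "p1"]   # hypothetical semantic-search ranking
fused = rrf_fuse([fuzzy_hits, semantic_hits])
```

Items that appear in both lists (`p1`, `p2`) rise above items found by only one method. Weighted fusion, by contrast, would combine the raw scores and therefore needs them on a comparable scale first.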

Interactive Demo: MCP-Based Retrieval (10 min)

Explore MCP-enabled context-aware search:

User Query → Claude Sonnet 4 → MCP Tools → Aurora PostgreSQL
                ↓                  ↓              ↓
         Tool Selection        SQL Query    RLS-Filtered Results

Hands-On:

cd /demo-app
streamlit run streamlit_app.py

Key Learning:

Dynamic retrieval strategy selection
Persona-based RLS for multi-tenant agents
Cohere Rerank vs RRF comparison
Production deployment patterns

🎓 Getting Started

For AWS re:Invent Participants:

Access Code Editor via provided CloudFront URL
Navigate to /notebooks/
Open 01-dat409-hybrid-search-TODO.ipynb
Complete the TODO blocks for all 3 search methods (guided with hints)
Launch demo app: streamlit run demo-app/streamlit_app.py

Pre-Configured Environment:

✅ Aurora PostgreSQL 17.5 with pgvector 0.8.0
✅ 21,704 products with pre-generated Cohere embeddings
✅ Python 3.13 + Jupyter + all dependencies
✅ Amazon Bedrock access (Cohere Embed v3, Rerank v3.5)
✅ MCP server (awslabs.postgres-mcp-server)
✅ Strands Agent Framework + Claude Sonnet 4

💰 Cost Considerations

Bedrock Pricing (us-west-2):

Cohere Embed v3: $0.0001 per 1K tokens
- Workshop dataset: ~$2.17 for 21,704 products (one-time), which at this rate corresponds to roughly 1,000 tokens per product
- Production: Pre-generate embeddings to avoid repeated costs
Cohere Rerank v3.5: $0.002 per search
- ~$2 per 1,000 searches
- Use for user-facing search where accuracy is critical
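
The figures above can be reproduced with simple arithmetic. This sketch uses the prices quoted in this section; the average of 1,000 tokens per product is an assumption inferred from the ~$2.17 total, not a measured value, and the function names are illustrative.

```python
EMBED_PRICE_PER_1K_TOKENS = 0.0001  # Cohere Embed v3, as quoted above
RERANK_PRICE_PER_SEARCH = 0.002     # Cohere Rerank v3.5, as quoted above


def embedding_cost(num_items, avg_tokens_per_item):
    """One-time cost in USD to embed num_items texts."""
    total_tokens = num_items * avg_tokens_per_item
    return total_tokens / 1000 * EMBED_PRICE_PER_1K_TOKENS


def rerank_cost(num_searches):
    """Cost in USD of reranking num_searches queries."""
    return num_searches * RERANK_PRICE_PER_SEARCH


# Assumed average of 1,000 tokens per product (inferred, not measured).
dataset_cost = embedding_cost(21_704, 1_000)
thousand_searches_cost = rerank_cost(1_000)
```

Swapping in your own catalog size and token average gives a quick estimate before committing to a full embedding run.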

Cost Optimization Strategies:

✅ Pre-generate embeddings: One-time cost vs per-query cost
✅ Cache rerank results: Redis with 1-hour TTL (can avoid 80%+ of rerank calls when queries repeat)
✅ Use RRF for internal tools: Zero cost, in-database fusion
✅ Batch embedding generation: Process in batches of 96 texts (Cohere limit)
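
Batching to the 96-text limit mentioned above is a plain slicing loop. This is a minimal sketch: `chunk_texts` is a hypothetical helper name, and the actual Bedrock call that would consume each batch is deliberately left out.

```python
def chunk_texts(texts, batch_size=96):
    """Yield consecutive slices of texts with at most batch_size items each."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(texts), batch_size):
        yield texts[start:start + batch_size]


# Placeholder strings standing in for the 21,704 product descriptions.
product_texts = [f"product {i}" for i in range(21_704)]
batches = list(chunk_texts(product_texts))
```

For the workshop dataset this produces 227 batches, the last one holding the 8 leftover texts.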

When to Use What:

Cohere Rerank: Customer-facing search, high-value queries (~$0.002/search)
RRF: Internal tools, high-volume, cost-sensitive (~$0/search)
Hybrid without rerank: Balance of accuracy and cost

🛠️ AWS Services

| Service | Purpose |
|---|---|
| Amazon Aurora PostgreSQL | Vector storage with pgvector 0.8.0 extension |
| Amazon Bedrock | Cohere Embed v3 (embeddings), Rerank v3.5 (ML reranking) |
| RDS Data API | Serverless, IAM-authenticated database access |
| Claude Sonnet 4 | Natural language → SQL translation via Bedrock |

🤖 Why MCP Matters: Beyond RAG

MCP shifts retrieval from relevance-only matching (RAG) to structured, queryable, context-rich inputs.

Traditional RAG vs MCP

| RAG | MCP |
|---|---|
| ❌ Fixed retrieval patterns | ✅ Dynamic tool selection |
| ❌ No query-time filtering | ✅ Context-aware filtering |
| ❌ Static embeddings only | ✅ Hybrid retrieval strategies |
| ❌ Limited multi-step reasoning | ✅ Direct structured data access |

Architecture

User Query → Claude Sonnet 4 → MCP Tools → Aurora PostgreSQL
                ↓                  ↓              ↓
         Analyzes Intent      SQL Query    RLS-Filtered Results
         Selects Tools        run_query    WHERE persona = ANY(access)

Key Components:

Strands Agent: Orchestration & tool calling
Claude Sonnet 4: Natural language → SQL translation
MCP Client: Standardized database tools (awslabs.postgres-mcp-server)
Aurora Data API: Serverless, IAM-authenticated access
RLS: Persona-based row filtering, applied at the application level via the system prompt
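
The `WHERE persona = ANY(access)` filter from the architecture diagram can be sketched as a parameterized query builder plus an equivalent in-memory filter. The table name `knowledge_base`, the column `persona_access`, the selected columns, and the `%s` placeholder style (as used by psycopg) are all assumptions for illustration; the demo app's actual schema may differ.

```python
def build_persona_query(persona, limit=10):
    """Return (sql, params) restricting rows to those the persona may see."""
    sql = (
        "SELECT id, title FROM knowledge_base "  # hypothetical table
        "WHERE %s = ANY(persona_access) "        # hypothetical text[] column
        "ORDER BY id LIMIT %s"
    )
    # The persona is passed as a bound parameter, never spliced into the SQL.
    return sql, (persona, limit)


def visible_rows(rows, persona):
    """Pure-Python equivalent of the ANY() filter, handy for unit tests."""
    return [row for row in rows if persona in row["persona_access"]]


sample_rows = [
    {"id": 1, "title": "Return policy", "persona_access": ["customer", "support"]},
    {"id": 2, "title": "Internal pricing notes", "persona_access": ["manager"]},
]
sql, params = build_persona_query("customer")
customer_view = visible_rows(sample_rows, "customer")
```

Keeping the persona as a bound parameter matters when the query text originates from an agent, since interpolating it would let generated text alter the filter.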

🎯 Key Takeaways

When to Use Each Search Method

| Method | Best For | Avoid When |
|---|---|---|
| Semantic | Conceptual queries, cross-language, intent-based | Exact SKU lookup, low-latency (<10ms) |
| Keyword | Exact terms, Boolean queries, structured fields | Typos common, multi-language content |
| Fuzzy | Typo tolerance, auto-complete, unreliable input | Precision critical, large result sets |
| Hybrid | Production systems, mixed queries | Single-method suffices |

Production Decisions

HNSW vs IVFFlat:

HNSW: User-facing search, >100K vectors, read-heavy (10-50ms queries)
IVFFlat: Rapid prototyping, frequent updates, write-heavy (50-200ms queries)
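
For reference, the two index types differ in their DDL options. This small helper sketches the pgvector statements as strings. The table `products` and column `embedding` are hypothetical names; `m = 16` and `ef_construction = 64` are pgvector's documented HNSW defaults, while `lists = 100` is only an illustrative IVFFlat value that should be tuned to the row count.

```python
def vector_index_ddl(table, column, method="hnsw"):
    """Return a pgvector CREATE INDEX statement for cosine distance.

    Identifiers are interpolated directly, so pass only trusted names.
    """
    if method == "hnsw":
        options = "m = 16, ef_construction = 64"
    elif method == "ivfflat":
        options = "lists = 100"  # illustrative value, tune per dataset
    else:
        raise ValueError(f"unsupported index method: {method}")
    return (
        f"CREATE INDEX ON {table} USING {method} "
        f"({column} vector_cosine_ops) WITH ({options});"
    )


hnsw_ddl = vector_index_ddl("products", "embedding", "hnsw")
ivfflat_ddl = vector_index_ddl("products", "embedding", "ivfflat")
```

An IVFFlat index should be built after the table holds representative data, because its lists are derived from the rows present at build time. HNSW has no such training step.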

Cohere Rerank vs RRF:

Cohere Rerank: User-facing search, accuracy critical (~50-200ms latency, cost per request)
RRF: Internal tools, cost-sensitive, low-latency (in-database, zero cost)

Key Insight

MCP lets agents dynamically select retrieval strategies (vector, keyword, SQL filters) based on query intent. This makes time-based, persona-based, and operational-context filtering possible, which static embeddings alone cannot provide.

📚 Resources

Core Technologies:

pgvector - Vector similarity search
Model Context Protocol - Standardized AI tool protocol
Aurora PostgreSQL - Managed database
PostgreSQL RLS - Row-level security

AWS Services:

Amazon Bedrock - Cohere Embed v3, Rerank v3.5
RDS Data API - Serverless access
Strands Agent Framework - MCP-compatible agents

🚀 Next Steps

Extend This Workshop:

Add time-based filtering (WHERE created_at > NOW() - INTERVAL '7 days')
Implement query caching (Redis/ElastiCache)
Build custom MCP tools for your domain
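
As a starting point for the query-caching extension, here is a minimal in-process TTL cache. It is a stand-in to show the pattern, not a Redis or ElastiCache client: the names `TTLCache` and `cached_search` are hypothetical, and a shared deployment would swap the dictionary for a real cache service.

```python
import time


class TTLCache:
    """Tiny in-memory cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds=3600, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so tests can control time
        self._store = {}

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._store[key]  # drop the stale entry
            return None
        return value

    def set(self, key, value):
        self._store[key] = (self._clock() + self._ttl, value)


def cached_search(cache, query, search_fn):
    """Return cached results for query, calling search_fn only on a miss."""
    key = query.strip().lower()  # normalize so trivial variants share a key
    hit = cache.get(key)
    if hit is not None:
        return hit
    results = search_fn(query)
    cache.set(key, results)
    return results
```

Wrapping the rerank call in `cached_search` means a repeated query within the TTL window skips the paid request entirely.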

Production Checklist:

HNSW indexes on vector columns
GIN indexes on tsvector/trigram columns
Connection pooling (PgBouncer/RDS Proxy)
RLS policies and IAM authentication
Audit logging enabled
Monitoring and observability (see below)

Monitoring & Observability:

For production deployments, monitor search performance and database health:

Database Insights: Track query latency, top SQL statements, and database load in real-time
CloudWatch Metrics: Monitor custom metrics for search method usage (semantic vs keyword vs fuzzy) and result quality
Application Logging: Log search queries, response times, and result counts for analysis and optimization

💡 Note: Advanced vector optimization techniques (Binary Quantization, Scalar Quantization) are covered in the companion session DAT406 - Build Agentic AI powered search with Amazon Aurora and Amazon RDS

🤝 Contributing

⭐ Star this repository | 🍴 Fork for your use cases | 🐛 Report issues | 💡 Submit PRs

See CONTRIBUTING.md for guidelines.

📄 License

MIT-0 License - See LICENSE

AWS re:Invent 2025 | DAT409 - 400 Level Expert Session

Hybrid Search with Aurora PostgreSQL for MCP Retrieval

aws-samples/sample-dat409-hybrid-search-aurora-mcp
