Commit 5bcd92b

Merge branch 'develop'

Bob Strahan committed
2 parents a5d7390 + aa3ea95, commit 5bcd92b

60 files changed: +4068 -644 lines changed
.gitlab-ci.yml

Lines changed: 20 additions & 0 deletions
@@ -16,6 +16,7 @@ image: public.ecr.aws/docker/library/python:3.13-bookworm

 stages:
   - developer_tests
+  - deployment_validation
   - integration_tests

 developer_tests:
@@ -93,4 +94,23 @@ integration_tests:
     - poetry install
     - make put
     - make wait
+
+deployment_validation:
+  stage: deployment_validation
+  rules:
+    - when: always
+
+  before_script:
+    - apt-get update -y
+    - apt-get install curl unzip python3-pip -y
+    # Install AWS CLI
+    - curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip"
+    - unzip awscliv2.zip
+    - ./aws/install
+    # Install PyYAML for template analysis
+    - pip install PyYAML
+
+  script:
+    # Check if service role has sufficient permissions for main stack deployment
+    - python3 scripts/validate_service_role_permissions.py

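
Review note: the new `deployment_validation` job runs `scripts/validate_service_role_permissions.py`, whose contents are not part of the visible diff. As a rough illustration only, the sketch below shows one way a "template analysis" step could compare the services a CloudFormation template uses against the actions a service role allows. The function names, the flat `allowed_actions` list, and the idea of matching on service prefixes are all assumptions made for this note, not the script's actual logic. The template is assumed to be parsed already (for example with PyYAML and a loader that tolerates tags such as `!Sub`).

```python
from collections import defaultdict


def services_used(template: dict) -> dict[str, list[str]]:
    """Group logical resource IDs by the service part of their resource type.

    `template` is an already-parsed CloudFormation template (a plain dict).
    """
    by_service: dict[str, list[str]] = defaultdict(list)
    for logical_id, resource in template.get("Resources", {}).items():
        # Resource types look like "AWS::IAM::Role"; the second part names the service.
        parts = resource.get("Type", "").split("::")
        if len(parts) >= 3 and parts[0] == "AWS":
            by_service[parts[1].lower()].append(logical_id)
    return dict(by_service)


def missing_services(template: dict, allowed_actions: list[str]) -> list[str]:
    """Return services used by the template that no allowed action covers.

    `allowed_actions` is a hypothetical flat list of IAM actions such as
    "iam:*" or "s3:CreateBucket" collected from the role's policies.
    Limitation: the CloudFormation service name does not always equal the
    IAM action prefix, so a real check would need a mapping table.
    """
    allowed_prefixes = {action.split(":", 1)[0].lower() for action in allowed_actions}
    if "*" in allowed_prefixes:
        return []  # a bare "*" action covers every service
    return sorted(s for s in services_used(template) if s not in allowed_prefixes)
```

A check like this only flags whole services that are absent from the role; it says nothing about individual actions or resource-level conditions.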

CHANGELOG.md

Lines changed: 20 additions & 0 deletions
@@ -5,8 +5,28 @@ SPDX-License-Identifier: MIT-0

 ## [Unreleased]

+## [0.3.17]
+
 ### Added

+- **Edit Sections Feature for Modifying Class/Type and Reprocessing Extraction**
+  - Added Edit Sections interface for Pattern-2 and Pattern-3 workflows with reprocessing optimization
+  - **Key Features**: Section management (create, update, delete), classification updates, page reassignment with overlap detection, real-time validation
+  - **Selective Reprocessing**: Only modified sections are reprocessed while preserving existing data for unmodified sections
+  - **Processing Pipeline**: All functions (OCR/Classification/Extraction/Assessment) automatically skip redundant operations based on data presence
+  - **Pattern Compatibility**: Full functionality for Pattern-2/Pattern-3, informative modal for Pattern-1 explaining BDA not yet supported
+
+- **Analytics Agent Schema Optimization for Improved Performance**
+  - **Embedded Database Overview**: Complete table listing and guidance embedded directly in system prompt (no tool call needed)
+  - **On-Demand Detailed Schemas**: `get_table_info(['specific_tables'])` loads detailed column information only for tables actually needed by the query
+  - **Significant Performance Gains**: Eliminates redundant tool calls on every query while maintaining token efficiency
+  - **Enhanced SQL Guidance**: Comprehensive Athena/Trino function reference with explicit PostgreSQL operator warnings to prevent common query failures like `~` regex operator mistakes
+  - **Faster Time-to-Query**: Agent has immediate access to table overview and can proceed directly to detailed schema loading for relevant tables
+
+### Fixed
+- Fix missing data in Glue tables when using a document class that contains a dash (-).
+
+
 ## [0.3.16]

 ### Added
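
Review note: the changelog entry under "Fixed" does not say how the dash problem was resolved, and the code for it is not in the visible diff. One common way to avoid this class of bug is to normalize the document class into an identifier made only of lowercase letters, digits, and underscores before using it as a table name. The sketch below illustrates that idea; the function name and the `document_sections_` prefix are hypothetical and chosen only for this example.

```python
import re


def to_table_name(document_class: str, prefix: str = "document_sections_") -> str:
    """Build a table-safe name from a document class (hypothetical helper).

    Every character outside [a-z0-9_] becomes an underscore, so a class
    such as "Bank-Statement" no longer carries a dash into the table name.
    """
    cleaned = re.sub(r"[^a-z0-9_]", "_", document_class.lower())
    return prefix + cleaned
```

Whatever normalization is used, it has to be applied identically where data is written and where tables are created, otherwise the two sides drift apart again.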

VERSION

Lines changed: 1 addition & 1 deletion
@@ -1 +1 @@
-0.3.16
+0.3.17

docs/deployment.md

Lines changed: 4 additions & 3 deletions
@@ -36,6 +36,7 @@ You need to have the following packages installed on your computer:
 4. python 3.11 or later
 5. A local Docker daemon
 6. Python packages for publish.py: `pip install boto3 rich typer PyYAML botocore setuptools`
+7. **Node.js 18+** and **npm** (required for UI validation in publish script)

 For guidance on setting up a development environment, see:
 - [Development Environment Setup Guide on Linux](./setup-development-env-linux.md)
@@ -136,12 +137,12 @@ aws cloudformation update-stack \


 **Pattern Parameter Options:**
-* `Pattern1` - Packet or Media processing with Bedrock Data Automation (BDA)
+* `Pattern1 - Packet or Media processing with Bedrock Data Automation (BDA)`
   * Can use an existing BDA project or create a new demo project
-* `Pattern2` - Packet processing with Textract and Bedrock
+* `Pattern2 - Packet processing with Textract and Bedrock`
   * Supports both page-level and holistic classification
   * Recommended for first-time users
-* `Pattern3` - Packet processing with Textract, SageMaker(UDOP), and Bedrock
+* `Pattern3 - Packet processing with Textract, SageMaker(UDOP), and Bedrock`
   * Requires a UDOP model in S3 that will be deployed on SageMaker

 After deployment, check the Outputs tab in the CloudFormation console to find links to dashboards, buckets, workflows, and other solution resources.

docs/pattern-3.md

Lines changed: 1 addition & 1 deletion
@@ -30,7 +30,7 @@ This pattern implements an intelligent document processing workflow that uses UD

 ## Fine tuning a UDOP model for classification

-See [Fine-Tuning Models on SageMaker](./fine-tune-sm-udop-classification/README.md)
+See [Fine-Tuning Models on SageMaker](../patterns/pattern-3/fine-tune-sm-udop-classification/README.md)

 Once you have trained the model, deploy the GenAIIDP stack for Pattern-3 using the path for your new fine tuned model.


docs/web-ui.md

Lines changed: 60 additions & 0 deletions
@@ -26,6 +26,66 @@ The solution includes a responsive web-based user interface built with React tha
 - **Document Process Flow visualization** for detailed workflow execution monitoring and troubleshooting
 - **Document Analytics** for querying and visualizing processed document data

+## Edit Sections
+
+The Edit Sections feature provides an intelligent interface for modifying document section classifications and page assignments, with automatic reprocessing optimization for Pattern-2 and Pattern-3 workflows.
+
+### Key Capabilities
+
+- **Section Management**: Create, update, and delete document sections with validation
+- **Classification Updates**: Change section document types with real-time validation
+- **Page Reassignment**: Move pages between sections with overlap detection
+- **Intelligent Reprocessing**: Only modified sections are reprocessed, preserving existing data
+- **Immediate Feedback**: Status updates appear instantly in the UI
+- **Pattern Compatibility**: Available for Pattern-2 and Pattern-3, with informative guidance for Pattern-1
+
+### How to Use
+
+1. Navigate to a completed document's detail page
+2. In the "Document Sections" panel, click the "Edit Sections" button
+3. **For Pattern-2/Pattern-3**: Enter edit mode with inline editing capabilities
+4. **For Pattern-1**: View informative modal explaining BDA architecture differences
+
+#### Editing Workflow (Pattern-2/Pattern-3)
+
+1. **Edit Section Classifications**: Use dropdowns to change document types
+2. **Modify Page Assignments**: Edit comma-separated page IDs (e.g., "1, 2, 3")
+3. **Add New Sections**: Click "Add Section" for new document boundaries
+4. **Delete Sections**: Use remove buttons to delete unnecessary sections
+5. **Validation**: Real-time validation prevents overlapping pages and invalid configurations
+6. **Submit Changes**: Click "Save & Process Changes" to trigger selective reprocessing
+
+### Processing Optimization
+
+The Edit Sections feature uses **2-phase schema knowledge optimization**:
+
+#### Phase 1: Frontend
+- **Selective Payload**: Only sends sections that actually changed
+- **Validation Engine**: Prevents invalid configurations before submission
+
+#### Phase 2: Backend
+- **Pipeline**: Processing functions automatically skip redundant operations
+  - **OCR**: Skips if pages already have OCR data
+  - **Classification**: Skips if pages already classified
+  - **Extraction**: Skips if sections have extraction data
+  - **Assessment**: Skips if extraction results contain assessment data
+- **Selective Reprocessing**: Only modified sections lose their data and get reprocessed
+
+### Pattern Compatibility
+
+#### Pattern-2 and Pattern-3 Support
+- **Full Functionality**: Complete edit capabilities with intelligent reprocessing
+- **Performance Optimization**: Automatic selective processing for efficiency
+- **Data Preservation**: Unmodified sections retain all processing results
+
+#### Pattern-1 Information
+Pattern-1 uses **Bedrock Data Automation (BDA)** with automatic section management. When Edit Sections is clicked, users see an informative modal explaining:
+
+- **Architecture Differences**: BDA handles section boundaries automatically
+- **Alternative Workflows**: Available options like "View/Edit Data", Configuration updates, and document reprocessing
+- **Future Considerations**: Guidance on using Pattern-2/Pattern-3 for fine-grained section control
+
+
 ## Document Analytics

 The Document Analytics feature allows users to query their processed documents using natural language and receive results in various formats including charts, tables, and text responses.
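
Review note: the new docs describe two frontend behaviors, overlap detection on comma-separated page IDs and a selective payload containing only changed sections, but the UI code itself is not in the visible diff. The sketch below illustrates both ideas in Python for brevity (the real UI is built with React). The data shapes, with sections keyed by id and pages held as a text field, are assumptions made for this example.

```python
def parse_page_ids(text: str) -> list[int]:
    """Parse a comma-separated page list such as "1, 2, 3"."""
    pages = []
    for token in text.split(","):
        token = token.strip()
        if not token:
            continue  # tolerate stray or trailing commas
        if not token.isdecimal():
            raise ValueError(f"invalid page id: {token!r}")
        pages.append(int(token))
    return pages


def find_overlaps(sections: dict[str, str]) -> dict[int, list[str]]:
    """Map each page claimed by more than one section to those section ids."""
    owners: dict[int, list[str]] = {}
    for section_id, page_text in sections.items():
        # set() so a page repeated inside one section is not counted twice
        for page in set(parse_page_ids(page_text)):
            owners.setdefault(page, []).append(section_id)
    return {page: ids for page, ids in owners.items() if len(ids) > 1}


def changed_sections(original: dict[str, dict], edited: dict[str, dict]) -> list[str]:
    """Return ids of sections that are new or differ from the original."""
    return [sid for sid, section in edited.items() if original.get(sid) != section]
```

For example, `find_overlaps({"s1": "1, 2, 3", "s2": "3, 4"})` reports page 3 as claimed by both sections, which is the condition the validation step is described as blocking before submission.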

iam-roles/cloudformation-management/IDP-Cloudformation-Service-Role.yaml

Lines changed: 2 additions & 2 deletions
@@ -13,7 +13,7 @@ Resources:
   CloudFormationServiceRole:
     Type: AWS::IAM::Role
     Properties:
-      RoleName: IDPAcceleratorCloudFormationServiceRole
+      RoleName: !Sub '${AWS::StackName}-CFServiceRole'
       AssumeRolePolicyDocument:
         Version: '2012-10-17'
         Statement:
@@ -109,7 +109,7 @@ Resources:
   PassRolePolicy:
     Type: AWS::IAM::ManagedPolicy
     Properties:
-      ManagedPolicyName: IDP-PassRolePolicy
+      ManagedPolicyName: !Sub '${AWS::StackName}-PassRolePolicy'
       Description: Policy to allow passing the IDP CloudFormation service role
       PolicyDocument:
         Version: '2012-10-17'
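
Review note: deriving both names from `${AWS::StackName}` makes them depend on the stack name's length. IAM limits role names to 64 characters and managed policy names to 128, so with the `-CFServiceRole` suffix (14 characters) a stack name longer than 50 characters would produce an invalid role name. The sketch below is a small pre-deployment check along those lines; the helper names are hypothetical and nothing like it appears in the diff.

```python
ROLE_NAME_MAX = 64     # IAM limit on role name length
POLICY_NAME_MAX = 128  # IAM limit on managed policy name length


def derived_names(stack_name: str) -> dict[str, str]:
    """Mirror the two !Sub expressions from the template."""
    return {
        "role": f"{stack_name}-CFServiceRole",
        "policy": f"{stack_name}-PassRolePolicy",
    }


def too_long(stack_name: str) -> list[str]:
    """Return which derived names ("role", "policy") exceed their IAM limit."""
    limits = {"role": ROLE_NAME_MAX, "policy": POLICY_NAME_MAX}
    names = derived_names(stack_name)
    return [kind for kind, name in names.items() if len(name) > limits[kind]]
```

With these limits, `too_long("idp-dev")` returns an empty list, while a 51-character stack name is flagged for the role only.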
