pgflow-dev
diff --git a/‎.changeset/hungry-cloths-hunt.md‎
Lines changed: 12 additions & 0 deletions b/‎.changeset/hungry-cloths-hunt.md‎
Lines changed: 12 additions & 0 deletions
diff --git a/‎.claude/commands/fix-sql-tests.md‎
Lines changed: 17 additions & 0 deletions b/‎.claude/commands/fix-sql-tests.md‎
Lines changed: 17 additions & 0 deletions
diff --git a/‎.claude/commands/test-first-sql.md‎
Lines changed: 40 additions & 0 deletions b/‎.claude/commands/test-first-sql.md‎
Lines changed: 40 additions & 0 deletions
diff --git a/‎PLAN.md‎
Lines changed: 36 additions & 36 deletions b/‎PLAN.md‎
Lines changed: 36 additions & 36 deletions
diff --git a/‎PLAN_orphaned_messages.md‎
Lines changed: 0 additions & 184 deletions b/‎PLAN_orphaned_messages.md‎
Lines changed: 0 additions & 184 deletions
@@ -0,0 +1,12 @@
+---
+'@pgflow/core': patch
+---
+
+Improve failure handling and prevent orphaned messages in queue
+
+- Archive all queued messages when a run fails to prevent resource waste
+- Handle type constraint violations gracefully without exceptions
+- Store output on failed tasks (including type violations) for debugging
+- Add performance index for efficient message archiving
+- Prevent retries on already-failed runs
+- Update table constraint to allow output storage on failed tasks
@@ -0,0 +1,17 @@
+Your job is to fix SQL tests, either by fixing the tests if those are invalid,
+or updating the SQL functions in pkgs/core/schemas/ and trying again.
+
+If updating functions, load them with psql.
+
+!`pnpm nx supabase:status core --output env | grep DB_URL`
+PWD: !`pwd`
+
+To rerun the test(s), run this command from `pkgs/core` directory:
+
+`scripts/run-test-with-colors supabase/tests/<testfile>`
+
+Do not create any migratons or try to run tests with nx.
+
+<test_failures>
+!`pnpm nx test:pgtap core`
+</test_failures>
@@ -0,0 +1,40 @@
+Your job is to implement the feature below in a test-first manner.
+First, you must idenfity what things you want to test for.
+Then you must write one test at a time, from the simplest, more generic,
+to more precise (if applicable, sometimes you only need to write one test per
+thing, without multiple per thing).
+
+To run the test(s), run this command from `pkgs/core` directory:
+
+`scripts/run-test-with-colors supabase/tests/<testfile>`
+
+The newly written test must fail for the correct reasons.
+
+In order to make the test pass, you need to update function
+code in pkgs/core/schemas/.
+
+After updating you should use `psql` to execute function file
+and update function in database.
+
+!`pnpm nx supabase:status core --output env | grep DB_URL`
+PWD: !`pwd`
+
+Repeat until all the added tests are passing.
+
+When they do, run all the tests like this:
+
+`scripts/run-test-with-colors supabase/tests/`
+
+Do not create any migratons or try to run tests with nx.
+
+Never use any INSERTs or UPDATEs to prepare or mutate state for the test.
+Instead, use regular pgflow.\* SQL functions or functions that are
+available in pkgs/core/supabase/tests/seed.sql:
+
+!`grep 'function.*pgflow_tests' pkgs/core/supabase/seed.sql -A7`
+
+Check how they are used in other tests.
+
+<feature_to_implement>
+$ARGUMENTS
+</feature_to_implement>
@@ -10,7 +10,7 @@
 - ✅ **DONE**: Array element extraction - tasks receive individual array elements
 - ✅ **DONE**: Output aggregation - inline implementation aggregates map task outputs for dependents
 - ✅ **DONE**: DSL support for `.map()` for defining map steps with compile-time duplicate detection
-- ⏳ **TODO**: Fix orphaned messages on run failure
+- ✅ **DONE**: Fix orphaned messages on run failure
 - ⏳ **TODO**: Performance optimization with step_states.output column
 
 ### Chores
@@ -106,18 +106,28 @@
   - Updated DSL README with .map() documentation
   - Created detailed changeset
 
+- [x] **PR #219: Fix Orphaned Messages on Run Failure** - `09-18-fix-orphaned-messages-on-fail` ✅ COMPLETED
+
+  - Archives all queued messages when run fails (prevents orphaned messages)
+  - Handles type constraint violations gracefully without exceptions
+  - Added guards to prevent any mutations on failed runs:
+    - complete_task returns unchanged
+    - start_ready_steps exits early
+    - cascade_complete_taskless_steps returns 0
+  - Added performance index for efficient message archiving
+  - Tests unstashed and passing (archive_sibling_map_tasks, archive_messages_on_type_constraint_failure)
+  - Updated core README with failure handling mentions
+  - **Critical fix: prevents queue performance degradation in production**
+
 #### ❌ Remaining Work (Priority Order)
 
-- [ ] **Priority 1: Fix Orphaned Messages on Run Failure** 🚨 CRITICAL
+- [ ] **Integration Tests**
 
-  - Archive all pending messages when run fails
-  - Handle map sibling tasks specially
-  - Fix type constraint violations to fail immediately without retries
-  - See detailed plan: [PLAN_orphaned_messages.md](./PLAN_orphaned_messages.md)
-  - **Critical for production: prevents queue performance degradation**
-  - Tests already written (stashed) that document the problem
+  - End-to-end workflows with real array data
+  - Basic happy path coverage
+  - This should be minimal and added to the Edge Worker integration test suite for now
 
-- [ ] **Priority 2: Performance Optimization - step_states.output Column**
+- [ ] **Performance Optimization - step_states.output Column**
 
   - Migrate from inline aggregation to storing outputs in step_states
   - See detailed plan: [PLAN_step_output.md](./PLAN_step_output.md)
@@ -132,43 +142,33 @@
     - Update all aggregation tests (~17 files)
   - **Note**: This is an optimization that should be done after core functionality is stable
 
-- [ ] **Priority 3: Integration Tests**
-
-  - End-to-end workflows with real array data
-  - Basic happy path coverage
-  - This should be minimal and added to the Edge Worker integration test suite for now
-
-- [ ] **Priority 4: Update core README**
-
-  - `pkgs/core/README.md`
-
-    - Add new section describing the step types
-    - Describe single step briefly, focus on describing map step type and how it differs
-    - Make sure to mention that maps are constrained to have exactly one dependency
-    - Show multiple cases of inputs -> task creation
-    - Explain edge cases (empty array propagation, invalid array input)
-    - Explain root map vs dependent map and how it gets handled and what restrictions those apply on the Flow input
-    - Explain cascade completion of taskless steps and its limitations
+- [ ] **Update `pkgs/core/README.md`**
 
-- [ ] **Priority 5: Add docs page**
+  - Add new section describing the step types
+  - Describe single step briefly, focus on describing map step type and how it differs
+  - Make sure to mention that maps are constrained to have exactly one dependency
+  - Show multiple cases of inputs -> task creation
+  - Explain edge cases (empty array propagation, invalid array input)
+  - Explain root map vs dependent map and how it gets handled and what restrictions those apply on the Flow input
+  - Explain cascade completion of taskless steps and its limitations
 
-  - **Add basic docs page**
+- [ ] **Add docs page**
 
-    - put it into `pkgs/website/src/content/docs/concepts/array-and-map-steps.mdx`
-    - describe the DSL and how the map works and why we need it
-    - show example usage of root map
-    - show example usage of dependent map
-    - focus mostly on how to use it, instead of how it works under the hood
-    - link to the README's for more details
+  - put it into `pkgs/website/src/content/docs/concepts/array-and-map-steps.mdx`
+  - describe the DSL and how the map works and why we need it
+  - show example usage of root map
+  - show example usage of dependent map
+  - focus mostly on how to use it, instead of how it works under the hood
+  - link to the README's for more details
 
-- [ ] **Priority 6: Migration Consolidation** (Do this last before merge!)
+- [ ] **Migration Consolidation**
 
   - Remove all temporary/incremental migrations from feature branches
   - Generate a single consolidated migration for the entire map infrastructure
   - Ensure clean migration path from current production schema
   - If NULL improvement is done, include it in the consolidated migration
 
-- [ ] **Priority 7: Graphite Stack Merge**
+- [ ] **Graphite Stack Merge**
 
   - Configure Graphite merge queue for the complete PR stack
   - Ensure all PRs in sequence can be merged together