Simplify and refocus map step documentation (#242)
Simplifies and refocuses map step documentation based on the plan in `PLAN_simplify-map-step-docs.md`:
- **Renamed page**: `array-and-map-steps.mdx` → `map-steps.mdx` with redirect
- **Reduced duplication**: Consolidated 3 separate handler constraint explanations into one
- **Improved clarity**: Added terminology note, code markers, and task execution details
- **Removed noise**: Deleted verbose examples and common patterns that belonged elsewhere
- **Enhanced concepts**: Clarified step/task distinction and layer separation of concerns
## Key Changes
### Map Steps Concepts (`concepts/map-steps.mdx`)
- Reduced from 411 to 249 lines (roughly a 40% reduction)
- Added terminology Aside explaining `.array()` vs `.map()`
- Consolidated handler input constraints into single section with clear code examples
- Added explicit statement: "Each map task is a separate function execution in the worker"
- Removed "Common Patterns" section (belongs in how-to guides)
- Simplified NULL handling and limitations sections
- Ensured all array names are plural (e.g., `users` not `userData`)
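
The statement that each map task is a separate function execution can be illustrated with a small model. This is a minimal sketch, not pgflow's API: `runMapStep` and `TaskResult` are hypothetical names, and recording a failure per task is an assumption made for illustration.

```typescript
// Hypothetical model, not pgflow's API: one task per array element.
type TaskResult<T> =
  | { taskIndex: number; status: "completed"; output: T }
  | { taskIndex: number; status: "failed"; error: string };

// The handler sees exactly one element; it cannot reach the rest of the array.
function runMapStep<In, Out>(
  items: In[],
  handler: (item: In) => Out,
): TaskResult<Out>[] {
  return items.map((item, taskIndex): TaskResult<Out> => {
    try {
      return { taskIndex, status: "completed", output: handler(item) };
    } catch (err) {
      // Assumption: a failure is recorded for this task only.
      const error = err instanceof Error ? err.message : String(err);
      return { taskIndex, status: "failed", error };
    }
  });
}

const users = [{ name: "ada" }, { name: "" }, { name: "grace" }];

const results = runMapStep(users, (user) => {
  if (user.name === "") throw new Error("empty name");
  return user.name.toUpperCase();
});
```

In this model three elements produce three task results, and the failing second element does not stop the other two from completing.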
### Batch Processing How-To (`how-to/batch-process-with-map.mdx`)
- Reduced from 365 to 295 lines
- Added "Understanding Map Step Flavors" opening section
- Promoted context enrichment pattern to main section with problem/solution code markers
- Added debugging examples with task index logging
- Removed duplicate "Passing Additional Context" from gotchas
- Removed verbose notification batching example
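
The context enrichment pattern and the task index logging mentioned above might look roughly like the sketch below. It assumes a map handler receives only a single array element, so shared context has to be copied into each element beforehand. `BatchContext`, `EnrichedItem`, `enrichWithContext`, and `sendWelcome` are hypothetical names, and carrying the index inside the element is an assumption of this sketch.

```typescript
// Hypothetical shapes for illustration; none of these names come from pgflow.
interface BatchContext {
  tenantId: string;
  locale: string;
}

interface EnrichedItem<T> {
  item: T;
  context: BatchContext;
  index: number;
}

// Problem: a per-element handler cannot reach context shared by the batch.
// Solution: copy the context (and the position) into every element first.
function enrichWithContext<T>(
  items: T[],
  context: BatchContext,
): EnrichedItem<T>[] {
  return items.map((item, index) => ({ item, context, index }));
}

// A per-element handler that logs which task it is running as.
function sendWelcome(task: EnrichedItem<string>): string {
  const line = `[task ${task.index}] tenant=${task.context.tenantId} user=${task.item}`;
  console.log(line);
  return line;
}

const enrichedUsers = enrichWithContext(["ada", "grace"], {
  tenantId: "t1",
  locale: "en",
});
const logLines = enrichedUsers.map(sendWelcome);
```

Putting the index in the log line makes it possible to match a failing task to its input element.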
### Concepts Architecture Docs
**`concepts/how-pgflow-works.mdx`:**
- Added task explanation: tasks become user-visible for map steps because they affect performance and retry behavior
- Added layer independence paragraph explaining separation of concerns
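
To make the layer separation concrete, here is a sketch of a stateless worker cycle built around the function names that appear in the `how-pgflow-works.mdx` diff (`read_with_poll`, `start_tasks`, `complete_task`, `fail_task`). Only those four names come from the docs; the `SqlCoreClient` interface, its parameters and return shapes, and the in-memory fake are assumptions for illustration, and the calls are synchronous purely to keep the sketch short.

```typescript
// Hypothetical client: only the four function names come from the docs;
// parameters and return shapes are assumptions for this sketch.
interface QueueMessage {
  msgId: number;
}

interface StartedTask {
  msgId: number;
  stepSlug: string;
  input: unknown;
}

interface SqlCoreClient {
  read_with_poll(): QueueMessage[];
  start_tasks(msgIds: number[]): StartedTask[];
  complete_task(task: StartedTask, output: unknown): void;
  fail_task(task: StartedTask, error: string): void;
}

type Handler = (input: unknown) => unknown;

// One polling cycle: read messages, start tasks, then report each outcome.
// The worker keeps no workflow state of its own.
function pollOnce(core: SqlCoreClient, handlers: Record<string, Handler>): number {
  const messages = core.read_with_poll();
  if (messages.length === 0) return 0;

  const tasks = core.start_tasks(messages.map((m) => m.msgId));
  for (const task of tasks) {
    try {
      const handler = handlers[task.stepSlug];
      if (!handler) throw new Error(`no handler for ${task.stepSlug}`);
      core.complete_task(task, handler(task.input));
    } catch (err) {
      core.fail_task(task, err instanceof Error ? err.message : String(err));
    }
  }
  return tasks.length;
}

// In-memory fake that records the call order.
const calls: string[] = [];
const fakeCore: SqlCoreClient = {
  read_with_poll: () => {
    calls.push("read_with_poll");
    return [{ msgId: 1 }, { msgId: 2 }];
  },
  start_tasks: (msgIds) => {
    calls.push("start_tasks");
    return msgIds.map((msgId) => ({
      msgId,
      stepSlug: msgId === 1 ? "double" : "missing",
      input: 21,
    }));
  },
  complete_task: (task, output) => {
    calls.push(`complete_task:${task.msgId}:${String(output)}`);
  },
  fail_task: (task) => {
    calls.push(`fail_task:${task.msgId}`);
  },
};

const handled = pollOnce(fakeCore, { double: (input) => (input as number) * 2 });
```

The worker in this sketch only maps a step slug to a handler and reports the result; which step runs next is left entirely to the other side of the interface.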
**`concepts/index.mdx`:**
- Enhanced core concepts explanation (flows/runs/steps/tasks/dependencies)
- Clarified steps vs tasks distinction
- Removed architecture details (covered thoroughly in "How pgflow Works")
- Kept focus on data model concepts only
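
The dependency rule described for `concepts/index.mdx` (a step waits for its dependencies to complete and gets access to their outputs) can be sketched as follows. `StepDef`, `readySteps`, and `buildInput` are hypothetical names, and keying the input by dependency slug is an assumption of this sketch rather than a statement about pgflow's schema.

```typescript
// Hypothetical data model: field names are assumptions, not pgflow's schema.
interface StepDef {
  slug: string;
  dependsOn: string[];
}

// A step is ready once every dependency has an output and it has none itself.
function readySteps(steps: StepDef[], outputs: Map<string, unknown>): string[] {
  return steps
    .filter((step) => !outputs.has(step.slug))
    .filter((step) => step.dependsOn.every((dep) => outputs.has(dep)))
    .map((step) => step.slug);
}

// The input of a ready step exposes each dependency's output under its slug.
function buildInput(
  step: StepDef,
  outputs: Map<string, unknown>,
): Record<string, unknown> {
  const input: Record<string, unknown> = {};
  for (const dep of step.dependsOn) input[dep] = outputs.get(dep);
  return input;
}

const flowSteps: StepDef[] = [
  { slug: "fetch_users", dependsOn: [] },
  { slug: "fetch_plans", dependsOn: [] },
  { slug: "build_report", dependsOn: ["fetch_users", "fetch_plans"] },
];

const stepOutputs = new Map<string, unknown>();
const readyAtStart = readySteps(flowSteps, stepOutputs);

stepOutputs.set("fetch_users", ["ada", "grace"]);
stepOutputs.set("fetch_plans", ["free", "pro"]);
const readyAfterFetch = readySteps(flowSteps, stepOutputs);
const reportInput = buildInput(flowSteps[2], stepOutputs);
```

Here the two fetch steps are ready immediately, while `build_report` becomes ready only after both of their outputs exist.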
### Other Changes
- Added redirect in `astro.config.mjs` for renamed page
- Fixed all array names to use plural forms throughout examples
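
For reference, a redirect for the renamed page could take roughly this shape. Astro's config accepts a `redirects` object that maps old routes to new ones; the old path below is inferred from the file rename and is an assumption, not a line copied from `astro.config.mjs`.

```typescript
// Route map in the shape Astro's `redirects` option expects.
// Assumption: the old route is inferred from the renamed file,
// not copied from this PR's astro.config.mjs.
const redirects: Record<string, string> = {
  "/concepts/array-and-map-steps/": "/concepts/map-steps/",
};

// In the config file this would be passed along the lines of:
//   export default defineConfig({ redirects, /* existing options */ });
```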
## Impact
Documentation is now:
- **More focused**: Each page has a clear purpose
- **Less redundant**: Handler constraints explained once, referenced elsewhere
- **More accurate**: Consistent terminology and plural array names
- **Easier to navigate**: Index focuses on concepts, details in appropriate guides
- **Better for learning**: Progressive disclosure with clear conceptual boundaries
pkgs/website/src/content/docs/concepts/how-pgflow-works.mdx: 7 additions & 3 deletions
@@ -37,6 +37,8 @@ Layer cheat-sheet:
 | **SQL Core** | Owns all **state** & orchestration logic | A handful of Postgres tables + functions |
 | **Worker** | Executes user code, then **reports back** | Edge Function by default (but can be anything) |
 
+Each layer is independent and focused: The DSL is a convenience layer that mixes flow definitions with handler types. SQL Core manages the state machine for each step, orchestrating based on dependencies without knowing how handlers execute. Workers are stateless - they execute handlers without needing to understand workflows or step relationships.
+
 <Aside type="tip" title="Swap the Worker if you like">
 Edge Worker is just a convenience wrapper.
 Any process that can coordinate with the SQL Core (call `read_with_poll`, `start_tasks`, then `complete_task` / `fail_task`) will work.
@@ -78,14 +80,16 @@ created → started → completed
 
 ### How workers execute steps
 
+Steps execute via **tasks** - units of function execution. For regular steps, this is a simple 1:1 relationship you can ignore. However, map steps create one task per array element, making tasks a user-visible concept for understanding performance and retry behavior. See [Map Steps](/concepts/map-steps/) for details.
+
 Behind the scenes, when a step becomes ready, workers handle the execution through a simple 3-call sequence:
 
-1. `read_with_poll` → locks queue messages
-2. `start_tasks` → marks task started, builds input
+1. `read_with_poll` → locks queue messages
+2. `start_tasks` → marks task started, builds input
-If you're just calling `start_flow()` and checking results, you don't need to know about this coordination. It's handled automatically by pgflow's SQL Core.
+For regular steps, if you're just calling `start_flow()` and checking results, you don't need to know about this coordination. It's handled automatically by pgflow's SQL Core.
 This section contains explanatory documentation that helps you understand the "why" behind pgflow's design decisions, architecture, and approach to workflow orchestration.
+pgflow is built on a few key concepts: **flows** are workflow definitions, **runs** are executions of those flows. Each flow has **steps** that can depend on other steps - a step waits for its dependencies to complete and gets access to their outputs.
+
+Steps execute through **tasks** (the actual units of work) - regular steps have 1 task, but map steps create one task per array element for parallel processing.
+
+Dive deeper:
 
 <CardGrid>
 <LinkCard
@@ -21,8 +25,8 @@ This section contains explanatory documentation that helps you understand the "w
 href="/concepts/flow-dsl/"
 />
 <LinkCard
-title="Array and Map Steps"
+title="Map Steps"
 description="Understanding parallel array processing with map steps in pgflow workflows"