Simplify and refocus map step documentation (#242)

jumski · jumski · commit 25d2ca74fdb6 · 2025-10-08T16:24:26.000Z
Simplifies and refocuses map step documentation based on the plan in `PLAN_simplify-map-step-docs.md`:

- **Renamed page**: `array-and-map-steps.mdx` → `map-steps.mdx` with redirect
- **Reduced duplication**: Consolidated 3 separate handler constraint explanations into one
- **Improved clarity**: Added terminology note, code markers, and task execution details
- **Removed noise**: Deleted verbose examples and common patterns that belonged elsewhere
- **Enhanced concepts**: Clarified step/task distinction and layer separation of concerns

## Key Changes

### Map Steps Concepts (`concepts/map-steps.mdx`)
- Reduced from 411 to 249 lines (about a 39% reduction)
- Added terminology Aside explaining `.array()` vs `.map()`
- Consolidated handler input constraints into single section with clear code examples
- Added explicit statement: "Each map task is a separate function execution in the worker"
- Removed "Common Patterns" section (belongs in how-to guides)
- Simplified NULL handling and limitations sections
- Ensured all array names are plural (e.g., `users` not `userData`)
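
As a rough illustration of the step/task distinction described above, the sketch below models a task as a single handler invocation. It is a plain TypeScript model, not the pgflow API: the `Task` type and both helper functions are hypothetical names invented for this example.

```typescript
// Hypothetical model, not the pgflow API: a "task" is one handler invocation.
type Task<T> = { stepSlug: string; taskIndex: number; input: T };

// A regular step always yields exactly one task carrying the whole input.
function tasksForRegularStep<T>(stepSlug: string, input: T): Task<T>[] {
  return [{ stepSlug, taskIndex: 0, input }];
}

// A map step yields one task per array element; each one is a separate
// function execution with only its own element as input.
function tasksForMapStep<T>(stepSlug: string, items: T[]): Task<T>[] {
  return items.map((item, taskIndex) => ({ stepSlug, taskIndex, input: item }));
}

const users = ["alice", "bob", "carol"];
const regularTasks = tasksForRegularStep("load_users", users);
const mapTasks = tasksForMapStep("greet_user", users);

console.log(regularTasks.length); // 1
console.log(mapTasks.length); // 3
```

The plural `users` array follows the naming convention this change enforces across the examples.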

### Batch Processing How-To (`how-to/batch-process-with-map.mdx`)
- Reduced from 365 to 295 lines
- Added "Understanding Map Step Flavors" opening section
- Promoted context enrichment pattern to main section with problem/solution code markers
- Added debugging examples with task index logging
- Removed duplicate "Passing Additional Context" from gotchas
- Removed verbose notification batching example
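
The context enrichment pattern mentioned above can be sketched in plain TypeScript. This assumes, without confirmation from this description, that a map handler receives only its own array element, so shared values have to be attached to each element by an earlier step. Every name here (`RunInput`, `EnrichedUser`, `enrichUsers`, `notifyUser`) is hypothetical.

```typescript
// Hypothetical shapes for illustration; not taken from the pgflow docs.
type RunInput = { apiKey: string; userIds: string[] };
type EnrichedUser = { userId: string; apiKey: string };

// Earlier step: attach the shared context to every element of the array.
function enrichUsers(input: RunInput): EnrichedUser[] {
  return input.userIds.map((userId) => ({ userId, apiKey: input.apiKey }));
}

// Map handler: sees a single element, which now carries everything it needs.
function notifyUser(user: EnrichedUser): string {
  return `notify ${user.userId} with key ${user.apiKey}`;
}

// Simulate the map step by calling the handler once per element.
const enrichedUsers = enrichUsers({ apiKey: "k-123", userIds: ["u1", "u2"] });
const results = enrichedUsers.map((user) => notifyUser(user));
console.log(results);
```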

### Concepts Architecture Docs
**`concepts/how-pgflow-works.mdx`:**
- Added task explanation: tasks map 1:1 to regular steps, but become a user-visible concept for map steps, where they matter for understanding performance and retry behavior
- Added layer independence paragraph explaining separation of concerns
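
To make the layer separation concrete, here is a minimal sketch with an in-memory stand-in for SQL Core. The method names echo the functions the docs mention (`read_with_poll`, `start_tasks`, `complete_task`, `fail_task`), but the `FakeCore` class and all signatures are assumptions made for this example, not the real SQL API.

```typescript
// In-memory stand-in for SQL Core. Names echo pgflow's functions, but these
// signatures are assumptions made for this sketch, not the real SQL API.
type Message = { id: number; payload: number };
type Outcome = { id: number; status: "completed" | "failed"; output?: number };

class FakeCore {
  private queue: Message[];
  readonly outcomes: Outcome[] = [];

  constructor(payloads: number[]) {
    this.queue = payloads.map((payload, id) => ({ id, payload }));
  }

  // Stand-in for read_with_poll: "locks" messages by removing them from the queue.
  readWithPoll(batchSize: number): Message[] {
    return this.queue.splice(0, batchSize);
  }

  // Stand-in for start_tasks: the real core would mark tasks started and build input.
  startTasks(messages: Message[]): Message[] {
    return messages;
  }

  completeTask(id: number, output: number): void {
    this.outcomes.push({ id, status: "completed", output });
  }

  failTask(id: number): void {
    this.outcomes.push({ id, status: "failed" });
  }
}

// The worker holds no workflow state: it runs the handler and reports back.
function runWorkerOnce(core: FakeCore, handler: (n: number) => number): void {
  const tasks = core.startTasks(core.readWithPoll(10));
  for (const task of tasks) {
    let output: number;
    try {
      output = handler(task.payload);
    } catch {
      core.failTask(task.id);
      continue;
    }
    core.completeTask(task.id, output);
  }
}

const core = new FakeCore([1, 2, 3]);
runWorkerOnce(core, (n) => {
  if (n === 2) throw new Error("boom");
  return n * 10;
});
// Statuses in order: completed, failed, completed
console.log(core.outcomes.map((o) => o.status));
```

In this sketch the worker never inspects steps or dependencies; all bookkeeping stays in the core object, which is the separation the new paragraph describes.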

**`concepts/index.mdx`:**
- Enhanced core concepts explanation (flows/runs/steps/tasks/dependencies)
- Clarified steps vs tasks distinction
- Removed architecture details (covered thoroughly in "How pgflow Works")
- Kept focus on data model concepts only

### Other Changes
- Added redirect in `astro.config.mjs` for renamed page
- Fixed all array names to use plural forms throughout examples

## Impact

Documentation is now:
- **More focused**: Each page has a clear purpose
- **Less redundant**: Handler constraints explained once, referenced elsewhere
- **More accurate**: Consistent terminology and plural array names
- **Easier to navigate**: Index focuses on concepts; details live in the appropriate guides
- **Better for learning**: Progressive disclosure with clear conceptual boundaries
diff --git a/pkgs/website/astro.config.mjs b/pkgs/website/astro.config.mjs
@@ -53,6 +53,9 @@ export default defineConfig({
     // Route rename
     '/hire/': '/author/',
 
+    // Page rename redirects
+    '/concepts/array-and-map-steps/': '/concepts/map-steps/',
+
     // Existing redirects
     '/edge-worker/how-to/run-on-hosted-supabase/':
       '/how-to/deploy-to-supabasecom/',
diff --git a/pkgs/website/src/content/docs/concepts/flow-dsl.mdx b/pkgs/website/src/content/docs/concepts/flow-dsl.mdx
@@ -184,7 +184,7 @@ Use `.map()` when you need to:
 - Make batch API calls or database queries
 - Apply the same operation to a list of items
 
-For detailed information about how map steps work internally, see [Array and Map Steps](/concepts/array-and-map-steps/).
+For detailed information about how map steps work internally, see [Map Steps](/concepts/map-steps/).
 :::
 
 ## Task Implementation
diff --git a/pkgs/website/src/content/docs/concepts/how-pgflow-works.mdx b/pkgs/website/src/content/docs/concepts/how-pgflow-works.mdx
@@ -37,6 +37,8 @@ Layer cheat-sheet:
 | **SQL Core** | Owns all **state** & orchestration logic | A handful of Postgres tables + functions |
 | **Worker** | Executes user code, then **reports back** | Edge Function by default (but can be anything) |
 
+Each layer is independent and focused: The DSL is a convenience layer that mixes flow definitions with handler types. SQL Core manages the state machine for each step, orchestrating based on dependencies without knowing how handlers execute. Workers are stateless - they execute handlers without needing to understand workflows or step relationships.
+
 <Aside type="tip" title="Swap the Worker if you like">
 Edge Worker is just a convenience wrapper.
 Any process that can coordinate with the SQL Core (call `read_with_poll`, `start_tasks`, then `complete_task` / `fail_task`) will work.
@@ -78,14 +80,16 @@ created → started → completed
 
 ### How workers execute steps
 
+Steps execute via **tasks** - units of function execution. For regular steps, this is a simple 1:1 relationship you can ignore. However, map steps create one task per array element, making tasks a user-visible concept for understanding performance and retry behavior. See [Map Steps](/concepts/map-steps/) for details.
+
 Behind the scenes, when a step becomes ready, workers handle the execution through a simple 3-call sequence:
 
-1. `read_with_poll` → locks queue messages  
-2. `start_tasks` → marks task started, builds input  
+1. `read_with_poll` → locks queue messages
+2. `start_tasks` → marks task started, builds input
 3. `complete_task/fail_task` → finishes task, moves step forward
 
 <Aside type="note" title="Implementation detail">
-If you're just calling `start_flow()` and checking results, you don't need to know about this coordination. It's handled automatically by pgflow's SQL Core.
+For regular steps, if you're just calling `start_flow()` and checking results, you don't need to know about this coordination. It's handled automatically by pgflow's SQL Core.
 </Aside>
 
 :::note[Looks just like "queue-of-queues"]
diff --git a/pkgs/website/src/content/docs/concepts/index.mdx b/pkgs/website/src/content/docs/concepts/index.mdx
@@ -7,7 +7,11 @@ sidebar:
 
 import { LinkCard, CardGrid } from '@astrojs/starlight/components';
 
-This section contains explanatory documentation that helps you understand the "why" behind pgflow's design decisions, architecture, and approach to workflow orchestration.
+pgflow is built on a few key concepts: **flows** are workflow definitions, **runs** are executions of those flows. Each flow has **steps** that can depend on other steps - a step waits for its dependencies to complete and gets access to their outputs.
+
+Steps execute through **tasks** (the actual units of work) - regular steps have 1 task, but map steps create one task per array element for parallel processing.
+
+Dive deeper:
 
 <CardGrid>
   <LinkCard
@@ -21,8 +25,8 @@ This section contains explanatory documentation that helps you understand the "w
     href="/concepts/flow-dsl/"
   />
   <LinkCard
-    title="Array and Map Steps"
+    title="Map Steps"
     description="Understanding parallel array processing with map steps in pgflow workflows"
-    href="/concepts/array-and-map-steps/"
+    href="/concepts/map-steps/"
   />
 </CardGrid>
diff --git a/pkgs/website/src/content/docs/concepts/map-steps.mdx b/pkgs/website/src/content/docs/concepts/map-steps.mdx
diff --git a/pkgs/website/src/content/docs/how-to/batch-process-with-map.mdx b/pkgs/website/src/content/docs/how-to/batch-process-with-map.mdx