Commit 09fce52

chore: fix readme and add copilot instructions
Signed-off-by: Gordon Smith <GordonJSmith@gmail.com>
1 parent 5e5fc8d commit 09fce52

3 files changed: +136 -24 lines changed

packages/comms/tests/workunit.spec.ts

Lines changed: 1 addition & 1 deletion
@@ -122,7 +122,7 @@ allPeople;
 return response;
 });
 });
-});
+}, 30000);

 describe("Syntax Error", () => {
 it("eclSubmit", () => {
Lines changed: 112 additions & 0 deletions
@@ -0,0 +1,112 @@
# @hpcc-js/dataflow Copilot Instructions

## Architecture Overview

This is a **functional data flow library** using JavaScript generators and iterators for lazy evaluation. Think of it as a streaming data pipeline where data flows through activities and is observed by sensors.

**Core Concepts:**
- **`Source<T>`**: Either `T[]` or `IterableIterator<T>` - the input data
- **Activities**: Transform data as it flows (`map`, `filter`, `sort`) - return iterators
- **Observers/Sensors**: Monitor data without modifying it (`count`, `max`, `mean`) - accumulate state
- **Pipe**: Chains activities together into reusable pipelines with full type safety
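
The concepts above can be sketched in a few lines. This is a minimal illustration, not the library's implementation: `mapSketch`, `filterSketch`, and `pipe2` are hypothetical names, and the type aliases are stand-ins for whatever `src/activities/activity.ts` actually defines.

```typescript
// Hypothetical stand-ins for the library's types; the real definitions may differ.
type Source<T> = T[] | IterableIterator<T>;
type IterableActivity<T, U> = (source: Source<T>) => IterableIterator<U>;

// A curried activity: configure it once, apply it to any source later.
function mapSketch<T, U>(callbackFn: (item: T, index: number) => U): IterableActivity<T, U> {
    return function* (source: Source<T>): IterableIterator<U> {
        let i = -1;
        for (const item of source) {
            yield callbackFn(item, ++i);
        }
    };
}

function filterSketch<T>(predicate: (item: T, index: number) => boolean): IterableActivity<T, T> {
    return function* (source: Source<T>): IterableIterator<T> {
        let i = -1;
        for (const item of source) {
            if (predicate(item, ++i)) yield item;
        }
    };
}

// A two-step pipe: the output type of `first` must match the input type of `second`.
function pipe2<A, B, C>(first: IterableActivity<A, B>, second: IterableActivity<B, C>): IterableActivity<A, C> {
    return (source: Source<A>) => second(first(source));
}

const evenSquares = pipe2(
    filterSketch<number>(n => n % 2 === 0),
    mapSketch<number, number>(n => n * n)
);

// Nothing runs until the iterator is consumed by the spread.
const pipelineResult = [...evenSquares([1, 2, 3, 4])]; // [4, 16]
```

The real `pipe()` presumably accepts any number of activities; the fixed two-step form here only shows how input and output types have to line up.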

**Key Files:**
- `src/activities/activity.ts` - Core type definitions for the entire system
- `src/utils/pipe.ts` - Complex TypeScript type magic for type-safe activity chaining
- `src/observers/observer.ts` - Observer pattern with `observe()` and `peek()` methods

## Critical Patterns

### Dual Signature Pattern (Performance Optimization)

Activities use TypeScript overloads to support both immediate execution and curried usage:

```typescript
// Immediate execution
export function map<T, U>(source: Source<T>, callbackFn: MapCallback<T, U>): IterableIterator<U>;
// Curried (returns reusable activity)
export function map<T, U>(callbackFn: MapCallback<T, U>): IterableActivity<T, U>;

// Implementation: dispatch on whether the first argument is a source or a callback
export function map<T, U>(s_or_cb: Source<T> | MapCallback<T, U>, callbackFn?: MapCallback<T, U>) {
    return isSource(s_or_cb) ? mapGen(callbackFn!)(s_or_cb) : mapGen(s_or_cb);
}
```

**Performance optimization (in progress):** Activities are being migrated from `isSource()` runtime checks to `arguments.length` checks. See `sort.ts` for the optimized pattern, which replaces expensive runtime type inspection with fast argument counting.
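
A rough sketch of what dispatching on `arguments.length` could look like for a `map`-style activity follows. It is an assumption about the general shape, not a copy of `sort.ts`: `mapByArgCount` and `mapGenSketch` are hypothetical names, and the type aliases are local stand-ins for the library's types.

```typescript
// Hypothetical stand-ins for the library's types.
type Source<T> = T[] | IterableIterator<T>;
type MapCallback<T, U> = (item: T, index: number) => U;
type IterableActivity<T, U> = (source: Source<T>) => IterableIterator<U>;

// Helper that builds the generator; the callback is captured by the closure.
function mapGenSketch<T, U>(callbackFn: MapCallback<T, U>): IterableActivity<T, U> {
    return function* (source: Source<T>): IterableIterator<U> {
        let i = -1;
        for (const item of source) {
            yield callbackFn(item, ++i);
        }
    };
}

function mapByArgCount<T, U>(source: Source<T>, callbackFn: MapCallback<T, U>): IterableIterator<U>;
function mapByArgCount<T, U>(callbackFn: MapCallback<T, U>): IterableActivity<T, U>;
function mapByArgCount<T, U>(
    s_or_cb: Source<T> | MapCallback<T, U>,
    callbackFn?: MapCallback<T, U>
): IterableIterator<U> | IterableActivity<T, U> {
    // Two arguments: immediate execution. One argument: curried usage.
    // No inspection of the first argument's runtime type is needed.
    return arguments.length === 2
        ? mapGenSketch(callbackFn!)(s_or_cb as Source<T>)
        : mapGenSketch(s_or_cb as MapCallback<T, U>);
}

const immediate = [...mapByArgCount([1, 2, 3], n => n * 10)]; // [10, 20, 30]
const timesTen = mapByArgCount((n: number) => n * 10);
const curried = [...timesTen([4, 5])]; // [40, 50]
```

The trade-off is that the argument count, not the argument's type, decides the branch, so callers must pass exactly the arguments the chosen overload expects.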

### Generator Functions for Lazy Evaluation

All activities use generator functions (`function*`) to enable lazy evaluation:

```typescript
// Sketch of a generator helper such as mapGen: callbackFn is captured by the closure
function mapGen<T, U>(callbackFn: MapCallback<T, U>): IterableActivity<T, U> {
    return function* (source: Source<T>) {
        let i = -1; // pre-incremented below, so the first index passed is 0
        for (const item of source) {
            yield callbackFn(item, ++i);
        }
    };
}
```

This ensures data only flows when consumed (e.g., via `[...iterator]` or `for...of`).
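
To make the laziness concrete, the sketch below records every callback invocation in a `trace` array. `lazyMap` and `trace` are illustrative names that do not come from the library.

```typescript
const trace: string[] = [];

function* lazyMap<T, U>(source: Iterable<T>, callbackFn: (item: T, index: number) => U): IterableIterator<U> {
    let i = -1;
    for (const item of source) {
        trace.push(`mapping ${item}`);
        yield callbackFn(item, ++i);
    }
}

// Creating the iterator runs none of the generator body.
const iterator = lazyMap([1, 2, 3], n => n * 2);
const callsBeforeConsumption = trace.length; // 0

// Pulling one value runs the body up to the first yield only.
const firstValue = iterator.next().value; // 2
const callsAfterFirst = trace.length; // 1

// Spreading drains whatever is left.
const remaining = [...iterator]; // [4, 6]
const callsAfterDrain = trace.length; // 3
```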

### Observers Accumulate State

Observers have two methods:
- `observe(value, index)` - Called for each item as it flows through
- `peek()` - Returns the accumulated result without consuming the iterator

Observers can be inserted into pipes using `sensor()` or converted to activities using `scalar()` or `activity()`.
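
The following sketch shows one way an observer and a pass-through sensor could fit together. `ObserverSketch`, `maxSketch`, and `sensorSketch` are hypothetical and only mirror the `observe()`/`peek()` contract described above; the library's `sensor()` may be implemented differently.

```typescript
// Hypothetical stand-ins for the library's types.
type Source<T> = T[] | IterableIterator<T>;

interface ObserverSketch<T, R> {
    observe(value: T, index: number): void;
    peek(): R;
}

// Tracks the largest value seen so far; undefined until the first item arrives.
function maxSketch(): ObserverSketch<number, number | undefined> {
    let best: number | undefined;
    return {
        observe(value: number) {
            if (best === undefined || value > best) best = value;
        },
        peek() {
            return best;
        }
    };
}

// Wraps an observer as a pass-through activity: every item is observed, then yielded unchanged.
function sensorSketch<T, R>(observer: ObserverSketch<T, R>) {
    return function* (source: Source<T>): IterableIterator<T> {
        let i = -1;
        for (const item of source) {
            observer.observe(item, ++i);
            yield item;
        }
    };
}

const maxObserver = maxSketch();
const watched = sensorSketch(maxObserver)([3, 9, 4]);

const beforeFlow = maxObserver.peek(); // undefined: no data has flowed yet
const passedThrough = [...watched]; // [3, 9, 4], unchanged
const afterFlow = maxObserver.peek(); // 9
```

Because the sensor is itself a generator, `peek()` only reflects items that a downstream consumer has already pulled through it.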

### Array Mutation Prevention

**Always use `.slice()` before `.sort()` to avoid mutating input arrays:**

```typescript
// Copy first: Array.prototype.sort() sorts in place and would reorder the caller's array
const arr = Array.isArray(source) ? source.slice() : [...source];
yield* arr.sort(compareFn);
```

This pattern appears in `sort.ts`, `median.ts`, and `quartile.ts`.
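
A short, self-contained comparison of the copying pattern against an in-place sort is sketched below. `sortedCopy` is an illustrative name, not one of the library's exports.

```typescript
type Source<T> = T[] | IterableIterator<T>;

function* sortedCopy<T>(source: Source<T>, compareFn?: (a: T, b: T) => number): IterableIterator<T> {
    // Arrays are copied with slice(); iterators are materialised into a fresh array.
    const arr = Array.isArray(source) ? source.slice() : [...source];
    yield* arr.sort(compareFn);
}

const byValue = (a: number, b: number) => a - b;

// With the copy, the caller's array keeps its original order.
const inputArray = [3, 1, 2];
const sortedValues = [...sortedCopy(inputArray, byValue)]; // [1, 2, 3]

// Without the copy, sort() reorders the array in place and returns the same reference.
const careless = [3, 1, 2];
const sameArray = careless.sort(byValue);
```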

## Build & Test Workflow

**Build Commands:**
- `npm run build` - Parallel TypeScript compilation + Vite bundling (`run-p gen-types bundle`)
- `npm run gen-types` - Generate `.d.ts` files in `types/` directory
- `npm run bundle` - Vite builds UMD + ES modules to `dist/`

**Testing:**
- `npm test` - Runs type checking + vitest (both node & browser environments)
- `npm run test-vitest` - Vitest only (dual environment: node + chromium)
- `npm run bench` - Performance benchmarks (see `tests/pipe.bench.ts`)

**Test Structure:**
- Each activity/observer has a matching `.spec.ts` file in `tests/`
- `tests/pipe.spec.ts` and `tests/pipe.bench.ts` test pipeline composition
- Tests verify both immediate execution and curried usage patterns

## TypeScript Configuration

- Uses `"allowImportingTsExtensions": true` - **always use `.ts` extensions in imports**
- `"module": "NodeNext"` - ES modules with Node.js compatibility
- Type definitions generated to `types/` directory (not inline with source)

## Common Gotchas

1. **Index tracking:** Most activities use the `let i = -1; for (const item of source) { ++i; }` pattern, which keeps the index correct through transformations

2. **Optional parameters with undefined:** When using the `arguments.length` optimization, handle explicit `undefined` (e.g., `sort(source, undefined)` for default sort)

3. **Type inference in pipe():** The `pipe()` function uses sophisticated TypeScript to infer return types - if types break, check that activity input/output types align correctly

4. **Histogram edge cases:** `histogram` has special handling for empty sources - it yields empty buckets with NaN bounds for the `buckets` option and returns nothing for the `min/range` option

5. **Generator initialization:** Generators don't execute until iterated - sensors remain `undefined` until data flows through
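
Gotcha 2 is easiest to see with a `sort`-style activity whose only parameter is optional. The sketch below is an assumption about how such a dispatch might behave, not the code in `sort.ts`; `sortSketch` and `sortGenSketch` are hypothetical names.

```typescript
// Hypothetical stand-ins for the library's types.
type Source<T> = T[] | IterableIterator<T>;
type CompareFn<T> = (a: T, b: T) => number;

function sortGenSketch<T>(compareFn?: CompareFn<T>) {
    return function* (source: Source<T>): IterableIterator<T> {
        const arr = Array.isArray(source) ? source.slice() : [...source];
        yield* arr.sort(compareFn);
    };
}

function sortSketch<T>(source: Source<T>, compareFn: CompareFn<T> | undefined): IterableIterator<T>;
function sortSketch<T>(compareFn?: CompareFn<T>): (source: Source<T>) => IterableIterator<T>;
function sortSketch<T>(s_or_cb?: Source<T> | CompareFn<T>, compareFn?: CompareFn<T>) {
    // Only the argument count is checked, so an explicit `undefined` second
    // argument still counts and selects immediate execution.
    return arguments.length === 2
        ? sortGenSketch(compareFn)(s_or_cb as Source<T>)
        : sortGenSketch(s_or_cb as CompareFn<T> | undefined);
}

const words = ["b", "c", "a"];

// Immediate execution with the default ordering needs the explicit undefined.
const sortedNow = [...sortSketch(words, undefined)];

// Curried usage with the default ordering passes no arguments at all.
const defaultSort = sortSketch<string>();
const sortedLater = [...defaultSort(words)];
```

In this sketch a single argument is always treated as the comparison function, so a source alone cannot be passed for immediate execution; that is the case the explicit `undefined` is there to cover.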

## Code Style

- Use generator functions for all iterable activities
- Prefer `for...of` over manual iterator manipulation
- Use `yield*` to delegate to another generator
- Type parameters: `<T = any>` allows inference while providing a fallback
- Function naming: `activityGen` helper functions create the generator; the exported function handles overload dispatch
