
web-csv-toolbox@0.14.0

Released by github-actions on 27 Nov 14:48 · commit 01dbd68

Minor Changes

  • #608 24f04d7 Thanks @kamiazya! - feat!: rename binary stream APIs for consistency and add BufferSource support

    Summary

    This release standardizes the naming of binary stream parsing APIs to match the existing parseBinary* family, and extends support to accept any BufferSource type (ArrayBuffer, Uint8Array, and other TypedArray views).

    Breaking Changes

    API Renaming for Consistency

    All parseUint8Array* functions have been renamed to parseBinary* to maintain consistency with existing binary parsing APIs:

    Function Names:

    • parseUint8ArrayStream() → parseBinaryStream()
    • parseUint8ArrayStreamToStream() → parseBinaryStreamToStream()

    Type Names:

    • ParseUint8ArrayStreamOptions → ParseBinaryStreamOptions

    Internal Functions (for reference):

    • parseUint8ArrayStreamInMain() → parseBinaryStreamInMain()
    • parseUint8ArrayStreamInWorker() → parseBinaryStreamInWorker()
    • parseUint8ArrayStreamInWorkerWASM() → parseBinaryStreamInWorkerWASM()

    Rationale:
    The previous naming was inconsistent with the rest of the binary API family (parseBinary, parseBinaryToArraySync, parseBinaryToIterableIterator, parseBinaryToStream). The new naming provides:

    • Perfect consistency across all binary parsing APIs
    • Clear indication that these functions accept any binary data format
    • Better predictability for API discovery

    BufferSource Support

    FlexibleBinaryCSVParser and BinaryCSVParserStream now accept BufferSource (= ArrayBuffer | ArrayBufferView) instead of just Uint8Array:

    Before:

    const parser = new FlexibleBinaryCSVParser({ header: ['name', 'age'] });
    const data = new Uint8Array([...]); // Only Uint8Array
    const records = parser.parse(data);

    After:

    const parser = new FlexibleBinaryCSVParser({ header: ['name', 'age'] });
    
    // Uint8Array still works
    const uint8Data = new Uint8Array([...]);
    const records1 = parser.parse(uint8Data);
    
    // ArrayBuffer now works directly
    const buffer = await fetch('data.csv').then(r => r.arrayBuffer());
    const records2 = parser.parse(buffer);
    
    // Other TypedArray views also work
    const int8Data = new Int8Array([...]);
    const records3 = parser.parse(int8Data);

    Benefits:

    • Direct use of fetch().then(r => r.arrayBuffer()) without conversion
    • Flexibility to work with any TypedArray view
    • Alignment with Web API standards (BufferSource is widely used)

    Migration Guide

    Automatic Migration

    Use find-and-replace in your codebase:

    # Function calls
    parseUint8ArrayStream → parseBinaryStream
    parseUint8ArrayStreamToStream → parseBinaryStreamToStream
    
    # Type references
    ParseUint8ArrayStreamOptions → ParseBinaryStreamOptions

    TypeScript Users

    If you were explicitly typing with Uint8Array, you can now use the more general BufferSource:

    // Before
    function processCSV(data: Uint8Array) {
      return parseBinaryStream(data);
    }
    
    // After (more flexible)
    function processCSV(data: BufferSource) {
      return parseBinaryStream(data);
    }

    Updated API Consistency

    All binary parsing APIs now follow a consistent naming pattern:

    // Single-value binary data
    parseBinary(); // Binary → AsyncIterableIterator<Record>
    parseBinaryToArraySync(); // Binary → Array<Record> (sync)
    parseBinaryToIterableIterator(); // Binary → IterableIterator<Record>
    parseBinaryToStream(); // Binary → ReadableStream<Record>
    
    // Streaming binary data
    parseBinaryStream(); // ReadableStream<Uint8Array> → AsyncIterableIterator<Record>
    parseBinaryStreamToStream(); // ReadableStream<Uint8Array> → ReadableStream<Record>

    Note: While the stream input type remains ReadableStream<Uint8Array> (Web Streams API standard), the internal parsers now accept BufferSource for individual chunks.

    Documentation Updates

    README.md

    • Updated Low-level APIs section to reflect parseBinaryStream* naming
    • Added flush procedure documentation for streaming mode
    • Added BufferSource examples

    API Reference (docs/reference/package-exports.md)

    • Added comprehensive Low-level API Reference section
    • Documented all Parser Models (Tier 1) and Lexer + Assembler (Tier 2)
    • Included usage examples and code snippets

    Architecture Guide (docs/explanation/parsing-architecture.md)

    • Updated Binary CSV Parser section to document BufferSource support
    • Added detailed streaming mode examples with flush procedures
    • Clarified multi-byte character handling across chunk boundaries

    Flush Procedure Clarification

    Documentation now explicitly covers the requirement to call parse() without arguments when using streaming mode:

    const parser = createBinaryCSVParser({ header: ["name", "age"] });
    const encoder = new TextEncoder();
    
    // Process chunks
    const records1 = parser.parse(encoder.encode("Alice,30\nBob,"), {
      stream: true,
    });
    const records2 = parser.parse(encoder.encode("25\n"), { stream: true });
    
    // IMPORTANT: Flush remaining data (required!)
    const records3 = parser.parse();

    This prevents data loss from incomplete records or multi-byte character buffers.

    Type Safety

    All changes maintain full TypeScript strict mode compliance with proper type inference and generic constraints.

  • #608 24f04d7 Thanks @kamiazya! - Add arrayBufferThreshold option to Engine configuration for automatic Blob reading strategy selection

    New Feature

    Added engine.arrayBufferThreshold option that automatically selects the optimal Blob reading strategy based on file size:

    • Files smaller than threshold: Use blob.arrayBuffer() + parseBinary() (6-8x faster, confirmed by benchmarks)
    • Files equal to or larger than threshold: Use blob.stream() + parseBinaryStream() (memory-efficient)

    Default: 1MB (1,048,576 bytes), determined by comprehensive benchmarks

    Applies to: parseBlob() and parseFile() only

    Benchmark Results

    | File Size | Binary (ops/sec) | Stream (ops/sec) | Performance Gain |
    | --------- | ---------------- | ---------------- | ---------------- |
    | 1KB       | 21,691           | 2,685            | 8.08x faster     |
    | 10KB      | 2,187            | 311              | 7.03x faster     |
    | 100KB     | 219              | 32               | 6.84x faster     |
    | 1MB       | 20               | 3                | 6.67x faster     |

    Usage

    import { parseBlob, EnginePresets } from "web-csv-toolbox";
    
    // Use default (1MB threshold)
    for await (const record of parseBlob(file)) {
      console.log(record);
    }
    
    // Always use streaming (memory-efficient)
    for await (const record of parseBlob(largeFile, {
      engine: { arrayBufferThreshold: 0 },
    })) {
      console.log(record);
    }
    
    // Custom threshold (512KB)
    for await (const record of parseBlob(file, {
      engine: { arrayBufferThreshold: 512 * 1024 },
    })) {
      console.log(record);
    }
    
    // With preset
    for await (const record of parseBlob(file, {
      engine: EnginePresets.fastest({
        arrayBufferThreshold: 2 * 1024 * 1024, // 2MB
      }),
    })) {
      console.log(record);
    }

    Special Values

    • 0 - Always use streaming (maximum memory efficiency)
    • Infinity - Always use arrayBuffer (maximum performance for small files)

    Security Note

    When using arrayBufferThreshold > 0, files must stay below maxBufferSize (default 10MB) to prevent excessive memory allocation. Files exceeding this limit will throw a RangeError.
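
    As a minimal sketch of the special values and the size limit described above (the exact point at which the RangeError surfaces is an assumption):

    import { parseBlob } from "web-csv-toolbox";

    declare const file: File; // e.g. from an <input type="file"> element

    try {
      for await (const record of parseBlob(file, {
        engine: { arrayBufferThreshold: Infinity }, // always read via arrayBuffer()
      })) {
        console.log(record);
      }
    } catch (error) {
      if (error instanceof RangeError) {
        console.error("File exceeds maxBufferSize:", error.message);
      }
    }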

    Design Philosophy

    This option belongs to engine configuration because it affects performance and behavior only, not the parsing result specification. This follows the design principle:

    • Top-level options: Affect specification (result changes)
    • Engine options: Affect performance/behavior (same result, different execution)
  • #608 24f04d7 Thanks @kamiazya! - Add support for Blob, File, and Request objects

    This release adds native support for parsing CSV data from Web Standard Blob, File, and Request objects, making the library more versatile across different environments.

    New Functions:

    • parseBlob(blob, options) - Parse CSV from Blob or File objects

      • Automatic charset detection from blob.type property
      • Supports compression via decompression option
      • Returns AsyncIterableIterator<CSVRecord>
      • Includes .toArray() and .toStream() namespace methods
    • parseFile(file, options) - Enhanced File parsing with automatic error source tracking

      • Built on top of parseBlob with additional functionality
      • Automatically sets file.name as error source for better error reporting
      • Provides clearer intent when working specifically with File objects
      • Useful for file inputs and drag-and-drop scenarios
      • Includes .toArray() and .toStream() namespace methods
    • parseRequest(request, options) - Server-side Request parsing

      • Automatic Content-Type validation and charset extraction
      • Automatic Content-Encoding detection and decompression
      • Designed for Cloudflare Workers, Service Workers, and edge platforms
      • Includes .toArray() and .toStream() namespace methods
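
    A minimal sketch of the namespace methods, assuming they mirror the existing parse.toArray() / parse.toStream() pattern:

    import { parseBlob } from "web-csv-toolbox";

    declare const file: File; // e.g. from a file input or drag-and-drop

    const records = await parseBlob.toArray(file); // all records at once
    const stream = parseBlob.toStream(file); // ReadableStream of records (exact return shape assumed)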

    High-level API Integration:

    The parse() function now automatically detects and handles these new input types:

    import { parse } from "web-csv-toolbox";
    
    // Blob/File (browser file uploads)
    // File objects automatically include filename in error messages
    const file = input.files[0];
    for await (const record of parse(file)) {
      console.log(record);
    }
    
    // Request (server-side)
    export default {
      async fetch(request: Request) {
        for await (const record of parse(request)) {
          console.log(record);
        }
      },
    };

    Type System Updates:

    • Updated CSVBinary type to include Blob and Request
    • Added proper type overloads to parse() function
    • Full TypeScript support with generic header types
    • New source field in CommonOptions, CSVRecordAssemblerOptions, and ParseError
      • Allows custom error source identification (e.g., filename, description)
      • Automatically populated for File objects
      • Improves error messages with contextual information
    • Improved internal type naming for better clarity
      • Join → JoinCSVFields - More descriptive CSV field joining utility type
      • Split → SplitCSVFields - More descriptive CSV field splitting utility type
      • These are internal utility types used for CSV type-level string manipulation
    • Enhanced terminology in type definitions
      • TokenLocation.rowNumber - Logical CSV row number (includes header)
      • Clear distinction between physical line numbers (line) and logical row numbers (rowNumber)

    Compression Support:

    All binary input types support compressed data:

    • Blob/File: Manual specification via decompression option

      parseBlob(file, { decompression: "gzip" });
    • Request: Automatic detection from Content-Encoding header

      // No configuration needed - automatic
      parseRequest(request);
    • Supported formats: gzip, deflate, deflate-raw (environment-dependent)

    Helper Functions:

    • getOptionsFromBlob() - Extracts charset from Blob MIME type
    • getOptionsFromFile() - Extracts options from File (charset + automatic source naming)
    • getOptionsFromRequest() - Processes Request headers (Content-Type, Content-Encoding)
    • parseBlobToStream() - Stream conversion helper
    • parseFileToArray() - Parse File to array of records
    • parseFileToStream() - Parse File to ReadableStream
    • parseRequestToStream() - Stream conversion helper
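
    A minimal sketch of one of the helpers, assuming parseFileToArray resolves to an array of records:

    import { parseFileToArray } from "web-csv-toolbox";

    declare const file: File;

    const records = await parseFileToArray(file);
    console.log(records.length);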

    Documentation:

    Comprehensive documentation following the Diátaxis framework:

    • API Reference:

      • parseBlob.md - Complete API reference with examples
      • parseFile.md - Alias documentation
      • parseRequest.md - Server-side API reference with examples
      • Updated parse.md to include new input types
    • How-to Guides:

      • NEW: platform-usage/ - Environment-specific usage patterns organized by platform
        • Each topic has its own dedicated guide for easy navigation
        • Browser: File input, drag-and-drop, FormData, Fetch API
        • Node.js: Buffer, fs.ReadStream, HTTP requests, stdin/stdout
        • Deno: Deno.readFile, Deno.open, fetch API
      • Organized in {environment}/{topic}.md structure for maintainability
    • Examples:

      • File input elements with HTML samples
      • Drag-and-drop file uploads
      • Compressed file handling (.csv.gz)
      • Validation and error handling patterns
      • NEW: Node.js Buffer usage (supported via BufferSource compatibility)
      • NEW: FormData integration patterns
      • NEW: Node.js stream conversion (fs.ReadStream → Web Streams)
    • Updated:

      • README.md - Added usage examples and API listings
      • choosing-the-right-api.md - Updated decision tree

    Enhanced Error Reporting:

    The source field provides better error context when parsing multiple files:

    import { parseFile } from "web-csv-toolbox";
    
    // Automatic source tracking
    try {
      for await (const record of parseFile(file)) {
        // ...
      }
    } catch (error) {
      console.error(error.message);
      // "Field count (100001) exceeded maximum allowed count of 100000 at row 5 in "data.csv""
      console.error(error.source); // "data.csv"
    }
    
    // Manual source specification
    parseString(csv, { source: "API-Export-2024" });
    // Error: "... at row 5 in "API-Export-2024""

    Security Note: The source field should not contain sensitive information (API keys, tokens, URLs with credentials) as it may be exposed in error messages and logs.

    Use Cases:

    ✅ Browser File Uploads:

    • File input elements (<input type="file">)
    • Drag-and-drop interfaces
    • Compressed file support (.csv.gz)

    ✅ Server-Side Processing:

    • Node.js servers
    • Deno applications
    • Service Workers

    ✅ Automatic Header Processing:

    • Content-Type validation
    • Charset detection
    • Content-Encoding decompression

    Platform Support:

    All new APIs work across:

    • Modern browsers (Chrome, Firefox, Edge, Safari)
    • Node.js 18+ (via undici Request/Blob)
    • Deno
    • Service Workers

    Breaking Changes:

    None - this is a purely additive feature. All existing APIs remain unchanged.

    Migration:

    No migration needed. New functions are available immediately:

    // Before (still works)
    import { parse } from "web-csv-toolbox";
    const response = await fetch("data.csv");
    for await (const record of parse(response)) {
    }
    
    // After (new capabilities)
    import { parseBlob, parseFile, parseRequest } from "web-csv-toolbox";
    
    // Blob support
    for await (const record of parseBlob(blob)) {
    }
    
    // File support with automatic error source
    const file = input.files[0];
    for await (const record of parseFile(file)) {
    }
    // Errors will include: 'in "data.csv"'
    
    // Server-side Request support
    for await (const record of parseRequest(request)) {
    }
    
    // Custom error source for any parser
    import { parseString } from "web-csv-toolbox";
    for await (const record of parseString(csv, { source: "user-import.csv" })) {
    }
  • #608 24f04d7 Thanks @kamiazya! - Implement discriminated union pattern for EngineConfig to improve type safety

    Breaking Changes

    1. EngineConfig Type Structure

    EngineConfig is now a discriminated union based on the worker property:

    Before:

    interface EngineConfig {
      worker?: boolean;
      workerURL?: string | URL;
      workerPool?: WorkerPool;
      workerStrategy?: WorkerCommunicationStrategy;
      strict?: boolean;
      onFallback?: (info: EngineFallbackInfo) => void;
      wasm?: boolean;
      // ... other properties
    }

    After:

    // Base configuration shared by all modes
    interface BaseEngineConfig {
      wasm?: boolean;
      arrayBufferThreshold?: number;
      backpressureCheckInterval?: BackpressureCheckInterval;
      queuingStrategy?: QueuingStrategyConfig;
    }
    
    // Main thread configuration (worker is false or undefined)
    interface MainThreadEngineConfig extends BaseEngineConfig {
      worker?: false;
    }
    
    // Worker configuration (worker must be true)
    interface WorkerEngineConfig extends BaseEngineConfig {
      worker: true;
      workerURL?: string | URL;
      workerPool?: WorkerPool;
      workerStrategy?: WorkerCommunicationStrategy;
      strict?: boolean;
      onFallback?: (info: EngineFallbackInfo) => void;
    }
    
    // Union type
    type EngineConfig = MainThreadEngineConfig | WorkerEngineConfig;

    2. Type Safety Improvements

    Worker-specific properties are now only available when worker: true:

    // ✅ Valid - worker: true allows worker-specific properties
    const config1: EngineConfig = {
      worker: true,
      workerURL: "./worker.js", // ✅ Type-safe
      workerStrategy: "stream-transfer",
      strict: true,
    };

    // ✅ Valid - worker: false doesn't require worker properties
    const config2: EngineConfig = {
      worker: false,
      wasm: true,
    };
    
    // ❌ Type Error - worker: false cannot have workerURL
    const config3: EngineConfig = {
      worker: false,
      workerURL: "./worker.js", // ❌ Type error!
    };

    3. EnginePresets Options Split

    EnginePresetOptions is now split into two interfaces for better type safety:

    Before:

    interface EnginePresetOptions {
      workerPool?: WorkerPool;
      workerURL?: string | URL;
      onFallback?: (info: EngineFallbackInfo) => void;
      arrayBufferThreshold?: number;
      // ...
    }
    
    EnginePresets.mainThread(options?: EnginePresetOptions)
    EnginePresets.fastest(options?: EnginePresetOptions)

    After:

    // For main thread presets (mainThread, wasm)
    interface MainThreadPresetOptions extends BasePresetOptions {
      // No worker-related options
    }
    
    // For worker-based presets (worker, fastest, balanced, etc.)
    interface WorkerPresetOptions extends BasePresetOptions {
      workerPool?: WorkerPool;
      workerURL?: string | URL;
      onFallback?: (info: EngineFallbackInfo) => void;
    }
    
    EnginePresets.mainThread(options?: MainThreadPresetOptions)
    EnginePresets.fastest(options?: WorkerPresetOptions)

    Migration:

    // Before: No type error, but logically incorrect
    EnginePresets.mainThread({ workerURL: "./worker.js" }); // Accepted but ignored
    
    // After: Type error prevents mistakes
    EnginePresets.mainThread({ workerURL: "./worker.js" }); // ❌ Type error!

    4. Transformer Constructor Changes

    Queuing strategy parameters changed from optional (?) to default parameters:

    Before:

    constructor(
      options?: CSVLexerTransformerOptions,
      writableStrategy?: QueuingStrategy<string>,
      readableStrategy?: QueuingStrategy<Token>
    )

    After:

    constructor(
      options: CSVLexerTransformerOptions = {},
      writableStrategy: QueuingStrategy<string> = DEFAULT_WRITABLE_STRATEGY,
      readableStrategy: QueuingStrategy<Token> = DEFAULT_READABLE_STRATEGY
    )

    Impact: This is technically a breaking change in the type signature, but functionally backward compatible since all parameters still have defaults. Existing code will continue to work without modifications.

    New Features

    1. Default Strategy Constants

    Default queuing strategies are now module-level constants (a character-counting strategy for the lexer's writable side, CountQueuingStrategy elsewhere):

    // CSVLexerTransformer
    const DEFAULT_WRITABLE_STRATEGY: QueuingStrategy<string> = {
      highWaterMark: 65536,
      size: (chunk) => chunk.length,
    };
    const DEFAULT_READABLE_STRATEGY = new CountQueuingStrategy({
      highWaterMark: 1024,
    });
    
    // CSVRecordAssemblerTransformer
    const DEFAULT_WRITABLE_STRATEGY = new CountQueuingStrategy({
      highWaterMark: 1024,
    });
    const DEFAULT_READABLE_STRATEGY = new CountQueuingStrategy({
      highWaterMark: 256,
    });

    2. Type Tests

    Added comprehensive type tests in src/common/types.test-d.ts to validate the discriminated union behavior:

    // Validates type narrowing
    const config: EngineConfig = { worker: true };
    expectTypeOf(config).toExtend<WorkerEngineConfig>();
    
    // Validates property exclusion
    expectTypeOf<MainThreadEngineConfig>().not.toHaveProperty("workerURL");

    Migration Guide

    For TypeScript Users

    If you're passing EngineConfig objects explicitly typed, you may need to update:

    // Before: Could accidentally mix incompatible properties
    const config: EngineConfig = {
      worker: false,
      workerURL: "./worker.js", // Silently ignored
    };
    
    // After: TypeScript catches the mistake
    const config: EngineConfig = {
      worker: false,
      // workerURL: './worker.js'  // ❌ Type error - removed
    };

    For EnginePresets Users

    Update preset option types if explicitly typed:

    // Before
    const options: EnginePresetOptions = {
      workerPool: myPool,
    };
    EnginePresets.mainThread(options); // No error, but workerPool ignored
    
    // After
    const options: WorkerPresetOptions = {
      // or MainThreadPresetOptions
      workerPool: myPool,
    };
    EnginePresets.fastest(options); // ✅ Correct usage
    // EnginePresets.mainThread(options);  // ❌ Type error - use MainThreadPresetOptions

    For Transformer Users

    No code changes required. Existing usage continues to work:

    // Still works exactly as before
    new CSVLexerTransformer();
    new CSVLexerTransformer({ delimiter: "," });
    new CSVLexerTransformer({}, customWritable, customReadable);

    Benefits

    1. IDE Autocomplete: Better suggestions based on worker setting
    2. Type Safety: Prevents invalid property combinations
    3. Self-Documenting: Type system enforces valid configurations
    4. Catch Errors Early: TypeScript catches configuration mistakes at compile time
    5. Standards Compliance: Uses CountQueuingStrategy from Web Streams API
  • #608 24f04d7 Thanks @kamiazya! - refactor!: rename engine presets to clarify optimization targets

    This release improves the naming of engine presets to clearly indicate what each preset optimizes for. The new names focus on performance characteristics (stability, UI responsiveness, parse speed, memory efficiency) rather than implementation details.

    Breaking Changes

    Engine Preset Renaming

    Engine presets have been renamed to better communicate their optimization targets:

    - import { EnginePresets } from 'web-csv-toolbox';
    + import { EnginePresets } from 'web-csv-toolbox';
    
    - engine: EnginePresets.mainThread()
    + engine: EnginePresets.stable()
    
    - engine: EnginePresets.worker()
    + engine: EnginePresets.responsive()
    
    - engine: EnginePresets.workerStreamTransfer()
    + engine: EnginePresets.memoryEfficient()
    
    - engine: EnginePresets.wasm()
    + engine: EnginePresets.fast()
    
    - engine: EnginePresets.workerWasm()
    + engine: EnginePresets.responsiveFast()

    Optimization targets:

    | Preset            | Optimizes For                                  |
    | ----------------- | ---------------------------------------------- |
    | stable()          | Stability (uses only standard JavaScript APIs) |
    | responsive()      | UI responsiveness (non-blocking)               |
    | memoryEfficient() | Memory efficiency (zero-copy streams)          |
    | fast()            | Parse speed (fastest execution time)           |
    | responsiveFast()  | UI responsiveness + parse speed                |
    | balanced()        | Balanced (general-purpose)                     |

    Removed Presets

    Two presets have been removed:

    - engine: EnginePresets.fastest()
    + engine: EnginePresets.responsiveFast()
    
    - engine: EnginePresets.strict()
      // No replacement - limited use case

    Why removed:

    • fastest(): Misleading name - prioritized UI responsiveness over raw execution speed due to worker communication overhead
    • strict(): Limited use case - primarily for testing/debugging

    Improvements

    Clearer Performance Documentation

    Each preset now explicitly documents its performance characteristics:

    • Parse speed: How fast CSV parsing executes
    • UI responsiveness: Whether parsing blocks the main thread
    • Memory efficiency: Memory usage patterns
    • Stability: API stability level (Most Stable, Stable, Experimental)

    Trade-offs Transparency

    Documentation now clearly explains the trade-offs for each preset:

    // stable() - Most stable, blocks main thread
    // ✅ Most stable: Uses only standard JavaScript APIs
    // ✅ No worker communication overhead
    // ❌ Blocks main thread during parsing

    // responsive() - Non-blocking, stable
    // ✅ Non-blocking UI: Parsing runs in worker thread
    // ⚠️ Worker communication overhead

    // fast() - Fastest parse speed, blocks main thread
    // ✅ Fast parse speed: Compiled WASM code
    // ✅ No worker communication overhead
    // ❌ Blocks main thread
    // ❌ UTF-8 encoding only

    // responsiveFast() - Non-blocking + fast, stable
    // ✅ Non-blocking UI + fast parsing
    // ⚠️ Worker communication overhead
    // ❌ UTF-8 encoding only

    Migration Guide

    Quick Migration

    Replace old preset names with new names:

    1. mainThread() → stable() - If you need maximum stability
    2. worker() → responsive() - If you need non-blocking UI
    3. workerStreamTransfer() → memoryEfficient() - If you need memory efficiency
    4. wasm() → fast() - If you need fastest parse speed (and blocking is acceptable)
    5. workerWasm() → responsiveFast() - If you need non-blocking UI + fast parsing
    6. fastest() → responsiveFast() - Despite the name, this is the correct replacement
    7. strict() → Remove - Or use custom config with strict: true

    Choosing the Right Preset

    By priority:

    • Stability first: stable() - Most stable, uses only standard JavaScript APIs
    • UI responsiveness first: responsive() or balanced() - Non-blocking execution
    • Parse speed first: fast() - Fastest execution time (blocks main thread)
    • General-purpose: balanced() - Balanced performance characteristics

    By use case:

    • Server-side parsing: stable() or fast() - Blocking acceptable
    • Browser with interactive UI: responsive() or balanced() - Non-blocking required
    • UTF-8 files only: fast() or responsiveFast() - WASM acceleration
    • Streaming large files: memoryEfficient() or balanced() - Constant memory usage

    Example Migration

    Before:

    import { parseString, EnginePresets } from "web-csv-toolbox";
    
    // Old: Unclear what "fastest" optimizes for
    for await (const record of parseString(csv, {
      engine: EnginePresets.fastest(),
    })) {
      console.log(record);
    }

    After:

    import { parseString, EnginePresets } from "web-csv-toolbox";
    
    // New: Clear that this optimizes for UI responsiveness + parse speed
    for await (const record of parseString(csv, {
      engine: EnginePresets.responsiveFast(),
    })) {
      console.log(record);
    }

    Documentation Updates

    All documentation has been updated to reflect the new preset names and include detailed performance characteristics, trade-offs, and use case guidance.

    See the Engine Presets Reference for complete documentation.

  • #608 24f04d7 Thanks @kamiazya! - Add experimental performance tuning options to Engine configuration: backpressureCheckInterval and queuingStrategy

    New Experimental Features

    Added advanced performance tuning options for fine-grained control over streaming behavior:

    engine.backpressureCheckInterval

    Controls how frequently the internal parsers check for backpressure during streaming operations (count-based).

    Default:

    {
      lexer: 100,      // Check every 100 tokens processed
      assembler: 10    // Check every 10 records processed
    }

    Trade-offs:

    • Lower values: More frequent backpressure checks, more responsive to downstream consumers
    • Higher values: Less frequent backpressure checks, reduced checking overhead

    Potential Use Cases:

    • Memory-constrained environments: Consider lower values for more responsive backpressure
    • Scenarios where checking overhead is a concern: Consider higher values
    • Slow consumers: Consider lower values to propagate backpressure more quickly

    engine.queuingStrategy

    Controls the internal queuing behavior of the CSV parser's streaming pipeline.

    Default: Designed to balance memory usage and buffering behavior

    Structure:

    {
      lexerWritable?: QueuingStrategy<string>;
      lexerReadable?: QueuingStrategy<Token>;
      assemblerWritable?: QueuingStrategy<Token>;
      assemblerReadable?: QueuingStrategy<CSVRecord<any>>;
    }

    Pipeline Stages:
    The CSV parser uses a two-stage pipeline:

    1. Lexer: String → Token
    2. Assembler: Token → CSVRecord

    Each stage has both writable (input) and readable (output) sides:

    1. lexerWritable - Lexer input (string chunks)
    2. lexerReadable - Lexer output (tokens)
    3. assemblerWritable - Assembler input (tokens from lexer)
    4. assemblerReadable - Assembler output (CSV records)
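
    To illustrate the two-stage pipeline, here is a minimal sketch that wires the public transformer classes mentioned elsewhere in these notes; the writable sink is a placeholder:

    import {
      CSVLexerTransformer,
      CSVRecordAssemblerTransformer,
    } from "web-csv-toolbox";

    const response = await fetch("data.csv");
    await response.body!
      .pipeThrough(new TextDecoderStream()) // bytes → string chunks
      .pipeThrough(new CSVLexerTransformer()) // string → tokens
      .pipeThrough(new CSVRecordAssemblerTransformer()) // tokens → records
      .pipeTo(
        new WritableStream({
          write(record) {
            console.log(record);
          },
        }),
      );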

    Theoretical Trade-offs:

    • Small highWaterMark (1-10): Less memory for buffering, backpressure applied more quickly
    • Medium highWaterMark (default): Balanced memory and buffering
    • Large highWaterMark (100+): More memory for buffering, backpressure applied less frequently

    Note: Actual performance characteristics depend on your specific use case and runtime environment. Profile your application to determine optimal values.

    Potential Use Cases:

    • IoT/Embedded: Consider smaller highWaterMark for minimal memory footprint
    • Server-side batch processing: Consider larger highWaterMark for more buffering
    • Real-time streaming: Consider smaller highWaterMark for faster backpressure propagation

    Usage Examples

    Configuration Example: Tuning for Potential High-Throughput Scenarios

    import { parseString, EnginePresets } from "web-csv-toolbox";
    
    const config = EnginePresets.fastest({
      backpressureCheckInterval: {
        lexer: 200, // Check every 200 tokens (less frequent)
        assembler: 20, // Check every 20 records (less frequent)
      },
      queuingStrategy: {
        lexerReadable: new CountQueuingStrategy({ highWaterMark: 100 }),
        assemblerReadable: new CountQueuingStrategy({ highWaterMark: 50 }),
      },
    });
    
    for await (const record of parseString(csv, { engine: config })) {
      console.log(record);
    }

    Memory-Constrained Environment

    import { parseString, EnginePresets } from "web-csv-toolbox";
    
    const config = EnginePresets.balanced({
      backpressureCheckInterval: {
        lexer: 10, // Check every 10 tokens (frequent checks)
        assembler: 5, // Check every 5 records (frequent checks)
      },
      queuingStrategy: {
        // Minimal buffers throughout entire pipeline
        lexerWritable: new CountQueuingStrategy({ highWaterMark: 1 }),
        lexerReadable: new CountQueuingStrategy({ highWaterMark: 1 }),
        assemblerWritable: new CountQueuingStrategy({ highWaterMark: 1 }),
        assemblerReadable: new CountQueuingStrategy({ highWaterMark: 1 }),
      },
    });
    
    for await (const record of parseString(csv, { engine: config })) {
      console.log(record);
    }

    ⚠️ Experimental Status

    These APIs are marked as experimental and may change in future versions based on ongoing performance research. The default values are designed to work well for most use cases, but optimal values may vary depending on your specific environment and workload.

    Recommendation: Only adjust these settings if you're experiencing specific performance issues with large streaming operations or have specific memory/throughput requirements.

    Design Philosophy

    These options belong to engine configuration because they affect performance and behavior only, not the parsing result specification. This follows the design principle:

    • Top-level options: Affect specification (result changes)
    • Engine options: Affect performance/behavior (same result, different execution)
  • #608 24f04d7 Thanks @kamiazya! - feat: introduce "slim" entry point for optimized bundle size

    This release introduces a new slim entry point that significantly reduces bundle size by excluding the inlined WebAssembly binary.

    New Entry Points

    The package now offers two distinct entry points:

    1. Main (web-csv-toolbox): The default entry point.

      • Features: Zero-configuration, works out of the box.
      • Trade-off: Includes the WASM binary inlined as base64 (~110KB), resulting in a larger bundle size.
      • Best for: Prototyping, quick starts, or when bundle size is not a critical constraint.
    2. Slim (web-csv-toolbox/slim): The new optimized entry point.

      • Features: Smaller bundle size, streaming WASM loading.
      • Trade-off: Requires manual initialization of the WASM binary.
      • Best for: Production applications where bundle size and load performance are critical.

    How to use the "Slim" version

    When using the slim version, you must manually load the WASM binary before using any WASM-dependent features (like parseStringToArraySyncWASM or high-performance parsing presets).

    import { loadWASM, parseStringToArraySyncWASM } from "web-csv-toolbox/slim";
    // You need to provide the URL to the WASM file
    import wasmUrl from "web-csv-toolbox/csv.wasm?url";
    
    async function init() {
      // 1. Manually initialize WASM
      await loadWASM(wasmUrl);
    
      // 2. Now you can use WASM-powered functions
      const data = parseStringToArraySyncWASM("a,b,c\n1,2,3");
      console.log(data);
    }
    
    init();

    Worker Exports

    Corresponding worker exports are also available:

    • web-csv-toolbox/worker (Main)
    • web-csv-toolbox/worker/slim (Slim)
  • #608 24f04d7 Thanks @kamiazya! - feat!: add Parser models and streams with improved architecture

    Summary

    This release introduces a new Parser layer that composes Lexer and Assembler components, providing a cleaner architecture and improved streaming support. The implementation follows the design patterns established by the recently developed CSVObjectRecordAssembler and CSVArrayRecordAssembler.

    New Features

    Parser Models

    FlexibleStringCSVParser

    • Composes FlexibleStringCSVLexer and CSV Record Assembler
    • Stateful parser for string CSV data
    • Supports both object and array output formats
    • Streaming mode support via parse(chunk, { stream: true })
    • Full options support (delimiter, quotation, columnCountStrategy, etc.)

    FlexibleBinaryCSVParser

    • Composes TextDecoder with FlexibleStringCSVParser
    • Accepts any BufferSource (Uint8Array, ArrayBuffer, or other TypedArray views)
    • Uses TextDecoder with stream: true option for proper streaming
    • Supports multiple character encodings (utf-8, shift_jis, etc.)
    • BOM handling via ignoreBOM option
    • Fatal error mode via fatal option

    Factory Functions

    • createStringCSVParser() - Creates FlexibleStringCSVParser instances
    • createBinaryCSVParser() - Creates FlexibleBinaryCSVParser instances
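
    A minimal sketch of the binary parser with a non-UTF-8 encoding; the charset option name is an assumption based on the package's existing binary options:

    import { createBinaryCSVParser } from "web-csv-toolbox";

    const parser = createBinaryCSVParser({
      header: ["name", "age"],
      charset: "shift_jis", // assumed option name for the TextDecoder encoding
      ignoreBOM: true,
      fatal: true, // throw on malformed byte sequences
    });

    declare const shiftJisBytes: Uint8Array;
    const records = parser.parse(shiftJisBytes);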

    Stream Classes

    StringCSVParserStream

    • TransformStream<string, CSVRecord> for streaming string parsing
    • Wraps Parser instances (rather than constructing them internally)
    • Configurable backpressure handling
    • Custom queuing strategies support
    • Follows existing CSVLexerTransformer pattern

    BinaryCSVParserStream

    • TransformStream<BufferSource, CSVRecord> for streaming binary parsing
    • Accepts any BufferSource (Uint8Array, ArrayBuffer, or other TypedArray views)
    • Handles UTF-8 multi-byte characters across chunk boundaries
    • Integration-ready for fetch API and file streaming
    • Backpressure management with configurable check intervals

    Breaking Changes

    Object Format Behavior (Reverted)

    While initially explored, the final implementation maintains the existing behavior:

    • Empty fields (,value,): Filled with ""
    • Missing fields (short rows): Remain as undefined

    This preserves backward compatibility and allows users to distinguish between explicitly empty fields and missing fields.
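
    A short sketch of the distinction, assuming missing fields surface as undefined values in the record as described above:

    import { parseString } from "web-csv-toolbox";

    const csv = `name,age,city
    Alice,,Tokyo
    Bob,25`;

    for await (const record of parseString(csv)) {
      console.log(record);
    }
    // { name: 'Alice', age: '', city: 'Tokyo' }   // empty field → ""
    // { name: 'Bob', age: '25', city: undefined } // missing field → undefined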

    Array Format Behavior (No Change)

    • Empty fields: Filled with ""
    • Missing fields with columnCountStrategy: 'pad': Filled with undefined

    Public API Exports (common.ts)

    Added exports for:

    • FlexibleStringCSVParser
    • FlexibleBinaryCSVParser
    • createStringCSVParser
    • createBinaryCSVParser
    • StringCSVParserStream
    • BinaryCSVParserStream

    Architecture Improvements

    Composition Over Implementation

    • Parsers compose Lexer + Assembler instead of reimplementing
    • Reduces code duplication across the codebase
    • Easier to maintain and extend

    Streaming Support

    • TextDecoder with stream: true for proper multi-byte character handling
    • Backpressure handling in Stream classes
    • Configurable check intervals for performance tuning

    Type Safety

    • Maintains full TypeScript strict mode compliance
    • Generic type parameters for header types
    • Proper CSVRecord type inference based on outputFormat

    Migration Guide

    For Users of Existing APIs

    No changes required. All existing functions (parseString, parseBinary, etc.) continue to work as before.

    For Direct Lexer/Assembler Users

    Consider migrating to Parser classes for simplified usage:

    // Before (manual composition)
    const lexer = new FlexibleStringCSVLexer(options);
    const assembler = createCSVRecordAssembler(options);
    const tokens = lexer.lex(csv);
    const records = Array.from(assembler.assemble(tokens));
    
    // After (using Parser)
    const parser = new FlexibleStringCSVParser(options);
    const records = parser.parse(csv);

    For Stream Users

    New stream classes provide cleaner API:

    // String streaming
    const parser = new FlexibleStringCSVParser({ header: ["name", "age"] });
    const stream = new StringCSVParserStream(parser);
    
    await fetch("data.csv")
      .then((res) => res.body)
      .pipeThrough(new TextDecoderStream())
      .pipeThrough(stream)
      .pipeTo(yourProcessor);
    
    // Binary streaming
    const parser = new FlexibleBinaryCSVParser({ header: ["name", "age"] });
    const stream = new BinaryCSVParserStream(parser);
    
    await fetch("data.csv")
      .then((res) => res.body)
      .pipeThrough(stream)
      .pipeTo(yourProcessor);

    Performance Considerations

    • Backpressure check interval defaults to 100 records
    • Writable side: 64KB highWaterMark (byte/character counting)
    • Readable side: 256 records highWaterMark
    • Configurable via queuing strategies
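
    As a sketch of custom queuing strategies, assuming the stream wrappers accept trailing writable/readable strategy parameters in the same way as CSVLexerTransformer:

    import {
      FlexibleStringCSVParser,
      StringCSVParserStream,
    } from "web-csv-toolbox";

    const parser = new FlexibleStringCSVParser({ header: ["name", "age"] });
    const stream = new StringCSVParserStream(
      parser,
      { highWaterMark: 65536, size: (chunk) => chunk.length }, // writable: count characters
      new CountQueuingStrategy({ highWaterMark: 256 }), // readable: count records
    );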

    Documentation

    All new classes include comprehensive JSDoc documentation with:

    • Usage examples
    • Parameter descriptions
    • Return type documentation
    • Remarks on streaming behavior
    • Performance characteristics
  • #608 24f04d7 Thanks @kamiazya! - feat!: add array output format support for CSV parsing

    CSV parsing results can now be returned as arrays in addition to objects, with TypeScript Named Tuple support for type-safe column access.

    New Features

    Array Output Format

    Parse CSV data into arrays instead of objects using the outputFormat option:

    import { parseString } from "web-csv-toolbox";
    
    const csv = `name,age,city
    Alice,30,Tokyo
    Bob,25,Osaka`;
    
    // Array output (new)
    for await (const record of parseString(csv, { outputFormat: "array" })) {
      console.log(record); // ['Alice', '30', 'Tokyo']
      console.log(record[0]); // 'Alice' - type-safe access with Named Tuples
    }
    
    // Object output (default, unchanged)
    for await (const record of parseString(csv)) {
      console.log(record); // { name: 'Alice', age: '30', city: 'Tokyo' }
    }

    Named Tuple Type Support

    When headers are provided, array output uses TypeScript Named Tuples for type-safe access:

    const csv = `name,age
    Alice,30`;
    
    for await (const record of parseString(csv, { outputFormat: "array" })) {
      // record type: { readonly [K in keyof ['name', 'age']]: string }
      // Equivalent to: { readonly 0: string, readonly 1: string, readonly length: 2 }
      console.log(record[0]); // Type-safe: 'Alice'
      console.log(record.length); // 2
    }

    Include Header Option

    Include the header row in the output (array format only):

    for await (const record of parseString(csv, {
      outputFormat: "array",
      includeHeader: true,
    })) {
      console.log(record);
    }
    // ['name', 'age', 'city']  ← Header row
    // ['Alice', '30', 'Tokyo']
    // ['Bob', '25', 'Osaka']

    Column Count Strategy

    Control how mismatched column counts are handled (array format with header):

    // Second row is missing 'city'; third row has an extra column
    const csv = `name,age,city
    Alice,30
    Bob,25,Osaka,JP`;
    
    // Strategy: 'pad' - Pad short rows with undefined, truncate long rows
    for await (const record of parseString(csv, {
      outputFormat: "array",
      columnCountStrategy: "pad",
    })) {
      console.log(record);
    }
    // ['Alice', '30', undefined]
    // ['Bob', '25', 'Osaka']
    
    // Strategy: 'strict' - Throw error on mismatch
    // Strategy: 'truncate' - Truncate long rows, keep short rows as-is
    // Strategy: 'keep' - Keep all columns as-is (default)

    Available strategies:

    • 'keep' (default): Return rows as-is, regardless of header length
    • 'pad': Pad short rows with undefined, truncate long rows to header length
    • 'strict': Throw ParseError if row length doesn't match header length
    • 'truncate': Truncate long rows to header length, keep short rows as-is
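
    A minimal sketch of the 'strict' strategy, assuming ParseError is exported from the package entry point:

    import { parseString, ParseError } from "web-csv-toolbox";

    const csv = `name,age,city
    Alice,30`; // second row is missing 'city'

    try {
      for await (const record of parseString(csv, {
        outputFormat: "array",
        columnCountStrategy: "strict",
      })) {
        console.log(record);
      }
    } catch (error) {
      if (error instanceof ParseError) {
        console.error(error.message); // row length does not match header length
      }
    }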

    Breaking Changes

    CSVRecordAssembler Interface Separation

    For better Rust/WASM implementation, the CSVRecordAssembler interface has been separated:

    • CSVObjectRecordAssembler<Header> - For object format output
    • CSVArrayRecordAssembler<Header> - For array format output

    The unified CSVRecordAssembler<Header, Format> type remains as a deprecated type alias for backward compatibility.

    New specialized classes:

    import {
      FlexibleCSVObjectRecordAssembler,
      FlexibleCSVArrayRecordAssembler,
      createCSVRecordAssembler,
    } from "web-csv-toolbox";
    
    // Option 1: Factory function (recommended)
    const assembler = createCSVRecordAssembler({
      outputFormat: "array",
      includeHeader: true,
    });
    
    // Option 2: Specialized class for object output
    const objectAssembler = new FlexibleCSVObjectRecordAssembler({
      header: ["name", "age"],
    });
    
    // Option 3: Specialized class for array output
    const arrayAssembler = new FlexibleCSVArrayRecordAssembler({
      header: ["name", "age"],
      columnCountStrategy: "strict",
    });

    Type structure:

    // Before
    type CSVRecordAssembler<Header, Format> = {
      assemble(tokens): IterableIterator<CSVRecord<Header, Format>>;
    };
    
    // After
    interface CSVObjectRecordAssembler<Header> {
      assemble(tokens): IterableIterator<CSVObjectRecord<Header>>;
    }
    
    interface CSVArrayRecordAssembler<Header> {
      assemble(tokens): IterableIterator<CSVArrayRecord<Header>>;
    }
    
    // Deprecated type alias (backward compatibility)
    type CSVRecordAssembler<Header, Format> = Format extends "array"
      ? CSVArrayRecordAssembler<Header>
      : CSVObjectRecordAssembler<Header>;

    Migration Guide

    For Most Users

    No changes required. All existing code continues to work:

    // Existing code works without changes
    for await (const record of parseString(csv)) {
      console.log(record); // Still returns objects by default
    }

    Using New Array Output Format

    Simply add the outputFormat option:

    // New: Array output
    for await (const record of parseString(csv, { outputFormat: "array" })) {
      console.log(record); // Returns arrays
    }

    For Advanced Users Using Low-Level APIs

    The existing FlexibleCSVRecordAssembler class continues to work. Optionally migrate to specialized classes:

    // Option 1: Continue using FlexibleCSVRecordAssembler (no changes needed)
    const assembler = new FlexibleCSVRecordAssembler({ outputFormat: "array" });
    
    // Option 2: Use factory function (recommended)
    const assembler = createCSVRecordAssembler({ outputFormat: "array" });
    
    // Option 3: Use specialized classes directly
    const assembler = new FlexibleCSVArrayRecordAssembler({
      header: ["name", "age"],
      columnCountStrategy: "pad",
    });

    Use Cases

    Machine Learning / Data Science

    // Easily convert CSV to training data arrays
    const features = [];
    for await (const record of parseString(csv, { outputFormat: "array" })) {
      features.push(record.map(Number));
    }

    Headerless CSV Files

    const csv = `Alice,30,Tokyo
    Bob,25,Osaka`;
    
    for await (const record of parseString(csv, {
      outputFormat: "array",
      header: [], // Headerless
    })) {
      console.log(record); // ['Alice', '30', 'Tokyo']
    }

    Type-Safe Column Access

    const csv = `name,age,city
    Alice,30,Tokyo`;
    
    for await (const record of parseString(csv, { outputFormat: "array" })) {
      // TypeScript knows the tuple structure
      const name: string = record[0]; // Type-safe
      const age: string = record[1]; // Type-safe
      const city: string = record[2]; // Type-safe
    }

    Benefits

    • Memory efficiency: Arrays use less memory than objects for large datasets
    • Type safety: Named Tuples provide compile-time type checking
    • Flexibility: Choose output format based on your use case
    • Compatibility: Easier integration with ML libraries and data processing pipelines
    • Better Rust/WASM support: Separated interfaces simplify native implementation
  • #608 24f04d7 Thanks @kamiazya! - refactor!: rename core classes and simplify type system

    This release contains breaking changes for users of low-level APIs. Most users are not affected.

    Breaking Changes

    1. Class Naming

    Low-level CSV processing classes have been renamed:

    - import { CSVLexer } from 'web-csv-toolbox';
    + import { FlexibleStringCSVLexer } from 'web-csv-toolbox';
    
    - const lexer = new CSVLexer(options);
    + const lexer = new FlexibleStringCSVLexer(options);

    For CSV record assembly, use the factory function or specialized classes:

    - import { CSVRecordAssembler } from 'web-csv-toolbox';
    + import { createCSVRecordAssembler, FlexibleCSVObjectRecordAssembler, FlexibleCSVArrayRecordAssembler } from 'web-csv-toolbox';
    
    - const assembler = new CSVRecordAssembler(options);
    + // Option 1: Use factory function (recommended)
    + const assembler = createCSVRecordAssembler({ outputFormat: 'object', ...options });
    +
    + // Option 2: Use specialized class directly
    + const assembler = new FlexibleCSVObjectRecordAssembler(options);

    2. Type Renaming

    The CSV type has been renamed to CSVData:

    - import type { CSV } from 'web-csv-toolbox';
    + import type { CSVData } from 'web-csv-toolbox';
    
    - function processCSV(data: CSV) {
    + function processCSV(data: CSVData) {
        // ...
      }

    Bug Fixes

    • Fixed stream reader locks not being released when AbortSignal was triggered
    • Fixed Node.js WASM module loading
    • Improved error handling

    Migration Guide

    For most users: No changes required if you only use high-level functions like parse(), parseString(), parseBlob(), etc.

    For advanced users using low-level APIs:

    1. Rename CSV type to CSVData
    2. Rename CSVLexer to FlexibleStringCSVLexer
    3. Replace CSVRecordAssembler with createCSVRecordAssembler() factory function or specialized classes (FlexibleCSVObjectRecordAssembler / FlexibleCSVArrayRecordAssembler)

Patch Changes

  • #608 24f04d7 Thanks @kamiazya! - Consolidate and enhance benchmark suite

    This changeset focuses on benchmark organization and expansion:

    Benchmark Consolidation:

    • Integrated 3 separate benchmark files (concurrent-performance.ts, queuing-strategy.bench.ts, worker-performance.ts) into main.ts
    • Unified benchmark suite now contains 57 comprehensive tests
    • Added conditional Worker support for Node.js vs browser environments

    API Migration:

    • Migrated from deprecated { execution: ['worker'] } API to new EnginePresets API
    • Added tests for all engine presets: mainThread, wasm, worker, workerStreamTransfer, workerWasm, balanced, fastest, strict

    Bottleneck Detection:

    • Added 23 new benchmarks for systematic bottleneck detection:
      • Row count scaling (50-5000 rows)
      • Field length scaling (10 chars - 10KB)
      • Quote ratio impact (0%-100%)
      • Column count scaling (10-10,000 columns)
      • Line ending comparison (LF vs CRLF)
      • Engine comparison at different scales

    Documentation Scenario Coverage:

    • Added benchmarks for all scenarios mentioned in documentation
    • Included WASM performance tests
    • Added custom delimiter tests
    • Added parseStringStream tests
    • Added data transformation overhead tests

    Key Findings:

    • Column count is the most critical bottleneck (99.7% slower at 10k columns)
    • Field length has non-linear behavior at 1KB threshold
    • WASM advantage increases with data size (+18% → +32%)
    • Quote processing overhead is minimal (1.1-10% depending on scale)
  • #608 24f04d7 Thanks @kamiazya! - fix: add charset validation to prevent malicious Content-Type header manipulation

    This patch addresses a security vulnerability where malicious or invalid charset values in Content-Type headers could cause parsing failures or unexpected behavior.

    Changes:

    • Fixed parseMime to handle Content-Type parameters without values (prevents undefined.trim() errors)
    • Added charset validation similar to existing compression validation pattern
    • Created SUPPORTED_CHARSETS constants for commonly used character encodings
    • Added allowNonStandardCharsets option to BinaryOptions for opt-in support of non-standard charsets
    • Added error handling in convertBinaryToString to catch TextDecoder instantiation failures
    • Charset values are now validated against a whitelist and normalized to lowercase
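
    A minimal sketch of the opt-in, assuming the charset is supplied via the existing binary charset option; "koi8-r" stands in for any value outside the default whitelist:

    import { parseBinary } from "web-csv-toolbox";

    declare const bytes: Uint8Array;

    for await (const record of parseBinary(bytes, {
      charset: "koi8-r",
      allowNonStandardCharsets: true, // accept charsets outside SUPPORTED_CHARSETS
    })) {
      console.log(record);
    }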

    Security Impact:

    • Invalid or malicious charset values are now rejected with clear error messages
    • Prevents DoS attacks via malformed Content-Type headers
    • Reduces risk of charset-based injection attacks

    Breaking Changes: None - existing valid charset values continue to work as before.

  • #608 24f04d7 Thanks @kamiazya! - Add bundler integration guide for Workers and WebAssembly

    This release adds comprehensive documentation for using web-csv-toolbox with modern JavaScript bundlers (Vite, Webpack, Rollup) when using Worker-based or WebAssembly execution.

    Package Structure Improvements:

    • Moved worker files to root level for cleaner package exports
      • src/execution/worker/helpers/worker.{node,web}.ts → src/worker.{node,web}.ts
    • Added ./worker export with environment-specific resolution (node/browser/default)
    • Added ./web_csv_toolbox_wasm_bg.wasm export for explicit WASM file access
    • Updated internal relative paths in createWorker.{node,web}.ts to reflect new structure

    New Documentation:

    • How-to Guide: Use with Bundlers - Step-by-step configuration for Vite, Webpack, and Rollup

      • Worker configuration with ?url imports
      • WASM configuration with explicit URL handling
      • WorkerPool reuse patterns
      • Common issues and troubleshooting
    • Explanation: Package Exports - Deep dive into environment detection mechanism

      • Conditional exports for node/browser environments
      • Worker implementation differences
      • Bundler compatibility
    • Reference: Package Exports - API reference for all package exports

      • Export paths and their resolutions
      • Conditional export conditions

    Updated Documentation:

    Added bundler usage notes to all Worker and WASM-related documentation:

    • README.md
    • docs/explanation/execution-strategies.md
    • docs/explanation/worker-pool-architecture.md
    • docs/how-to-guides/choosing-the-right-api.md
    • docs/how-to-guides/wasm-performance-optimization.md

    Key Differences: Workers vs WASM with Bundlers

    Workers 🟢:

    • Bundled automatically as data URLs using ?url suffix
    • Works out of the box with Vite
    • Example: import workerUrl from 'web-csv-toolbox/worker?url'

    WASM 🟡:

    • Requires explicit URL configuration via ?url import
    • Must call loadWASM(wasmUrl) before parsing
    • Example: import wasmUrl from 'web-csv-toolbox/web_csv_toolbox_wasm_bg.wasm?url'
    • Alternative: Copy WASM file to public directory

    Migration Guide:

    For users already using Workers with bundlers, no changes are required. The package now explicitly documents the workerURL option that was previously implicit.

    For new users, follow the bundler integration guide:

    import { parseString, EnginePresets } from "web-csv-toolbox";
    import workerUrl from "web-csv-toolbox/worker?url"; // Vite
    
    for await (const record of parseString(csv, {
      engine: EnginePresets.worker({ workerURL: workerUrl }),
    })) {
      console.log(record);
    }

    Breaking Changes:

    None - this is purely additive documentation and package export improvements. Existing code continues to work without modifications.

  • #608 24f04d7 Thanks @kamiazya! - Refactor CI workflows to separate TypeScript and Rust environments

    This change improves CI efficiency by:

    • Splitting setup actions into setup-typescript, setup-rust, and setup-full
    • Separating WASM build and TypeScript build jobs with clear dependencies
    • Removing unnecessary tool installations from jobs that don't need them
    • Clarifying dependencies between TypeScript tests and WASM artifacts
  • #608 24f04d7 Thanks @kamiazya! - chore: eliminate circular dependencies and improve code quality

    This patch improves the internal code structure by eliminating all circular dependencies and adding tooling to prevent future issues.

    Changes:

    • Introduced madge for circular dependency detection and visualization
    • Eliminated circular dependencies:
      • common/types.ts ⇄ utils/types.ts: Merged type definitions into common/types.ts
      • parseFile.ts ⇄ parseFileToArray.ts: Refactored to use direct dependencies
    • Fixed import paths in test files to consistently use .ts extension
    • Added npm scripts for dependency analysis:
      • check:circular: Detect circular dependencies
      • graph:main: Visualize main entry point dependencies
      • graph:worker: Visualize worker entry point dependencies
      • graph:json, graph:summary, graph:orphans, graph:leaves: Various analysis tools
    • Added circular dependency check to CI pipeline (.github/workflows/.build.yaml)
    • Updated .gitignore to exclude generated dependency graph files

    Impact:

    • No runtime behavior changes
    • Better maintainability and code structure
    • Faster build times due to cleaner dependency graph
    • Automated prevention of circular dependency introduction

    Breaking Changes: None - this is purely an internal refactoring with no API changes.

  • #608 24f04d7 Thanks @kamiazya! - docs: comprehensive documentation update and new examples

    This release brings significant improvements to the documentation and examples, making it easier to get started and use advanced features.

    New Examples

    Added comprehensive example projects for various environments and bundlers:

    • Deno: examples/deno-main, examples/deno-slim
    • Node.js: examples/node-main, examples/node-slim, examples/node-worker-main
    • Vite: examples/vite-bundle-main, examples/vite-bundle-slim, examples/vite-bundle-worker-main, examples/vite-bundle-worker-slim
    • Webpack: examples/webpack-bundle-worker-main, examples/webpack-bundle-worker-slim

    These examples demonstrate:

    • How to use the new slim entry point
    • Worker integration with different bundlers
    • Configuration for Vite and Webpack
    • TypeScript setup

    Documentation Improvements

    • Engine Presets: Detailed guide on choosing the right engine preset for your use case
    • Main vs Slim: Explanation of the trade-offs between the main (auto-init) and slim (manual-init) entry points (a hedged usage sketch follows this list)
    • WASM Architecture: Updated architecture documentation reflecting the new module structure
    • Performance Guide: Improved guide on optimizing performance with WASM and Workers
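
    As a rough sketch of the main-vs-slim trade-off: the main entry point initializes WASM automatically on import, while the slim entry point leaves initialization to the caller. The import path ("web-csv-toolbox/slim") and the explicit loadWASM() call below are assumptions for illustration only; consult the examples listed above (for instance examples/node-slim or examples/vite-bundle-slim) for the exact setup in your environment.

    // Slim-style usage (illustrative; verify the entry path against the examples above)
    import { loadWASM, parseString } from "web-csv-toolbox/slim";

    await loadWASM(); // manual initialization instead of auto-init on import

    const csv = "name,age\nAlice,30";
    for await (const record of parseString(csv)) {
      console.log(record);
    }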
  • #608 24f04d7 Thanks @kamiazya! - Expand browser testing coverage and improve documentation

    Testing Infrastructure Improvements:

    • macOS Browser Testing: Added Chrome and Firefox testing on macOS in CI/CD
      • Vitest 4 stable browser mode enabled headless testing on macOS
      • Previously blocked due to Safari headless limitations
    • Parallel Browser Execution: Multiple browsers now run in parallel within each OS job
      • Linux: Chrome + Firefox in parallel
      • macOS: Chrome + Firefox in parallel
      • Windows: Chrome + Firefox + Edge in parallel
    • Dynamic Browser Configuration: Browser instances are determined automatically by platform (see the sketch after this list)
      • Uses process.platform to select the appropriate browsers
      • Eliminates the need for environment variables
    • Explicit Browser Project Targeting: Updated test:browser script to explicitly run only browser tests
      • Added --project browser flag to prevent running Node.js tests during browser test execution
      • Ensures CI jobs run only their intended test suites
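
    A minimal sketch of the platform-based selection idea, assuming a helper that maps process.platform to browser names; this is not the project's actual Vitest configuration, only an illustration of the matrix described above:

    import process from "node:process";

    // Illustrative helper: pick browsers per OS, mirroring the CI matrix above.
    function browsersForPlatform(): string[] {
      switch (process.platform) {
        case "win32":
          return ["chrome", "firefox", "edge"]; // Windows: Chrome + Firefox + Edge
        case "darwin":
          return ["chrome", "firefox"]; // macOS: Chrome + Firefox
        default:
          return ["chrome", "firefox"]; // Linux: Chrome + Firefox
      }
    }

    console.log(browsersForPlatform());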

    Documentation Improvements:

    • Quick Overview Section: Added comprehensive support matrix and metrics
      • Visual support matrix showing all environment/platform combinations
      • Tier summary with coverage statistics
      • Testing coverage breakdown by category
      • Clear legend explaining all support status icons
    • Clearer Support Tiers: Improved distinction between support levels
      • βœ… Full Support (Tier 1): Tested and officially supported
      • 🟑 Active Support (Tier 2): Limited testing, active maintenance
      • πŸ”΅ Community Support (Tier 3): Not tested, best-effort support
    • Cross-Platform Runtime Support: Clarified Node.js and Deno support across all platforms
      • Node.js LTS: Tier 1 support on Linux, macOS, and Windows
      • Deno LTS: Tier 2 support on Linux, macOS, and Windows
      • Testing is performed on Linux only, relying on the cross-platform design of these runtimes
      • Clarifies that the remaining platforms are still expected to work even though they are not tested directly
    • Simplified Tables: Converted redundant tables to concise bullet lists
      • Removed repetitive "Full Support" entries
      • Easier to scan and understand

    Browser Testing Coverage:

    • Chrome: Tested on Linux, macOS, and Windows (Tier 1)
    • Firefox: Tested on Linux, macOS, and Windows (Tier 1)
    • Edge: Tested on Windows only (Tier 1)
    • Safari: Community support (headless mode not supported by Vitest)

    Breaking Changes:

    None - this release only improves testing infrastructure and documentation.

  • #608 24f04d7 Thanks @kamiazya! - Add regression tests and documentation for prototype pollution safety

    This changeset adds comprehensive tests and documentation to ensure that CSVRecordAssembler does not cause prototype pollution when processing CSV headers with dangerous property names.

    Security Verification:

    • Verified that Object.fromEntries() is safe from prototype pollution attacks
    • Confirmed that dangerous property names (__proto__, constructor, prototype) are handled safely
    • Added 8 comprehensive regression tests in FlexibleCSVRecordAssembler.prototype-safety.test.ts

    Test Coverage:

    • Tests with __proto__ as CSV header
    • Tests with constructor as CSV header
    • Tests with prototype as CSV header
    • Tests with multiple dangerous property names
    • Tests with multiple records
    • Tests with quoted fields
    • Baseline tests documenting Object.fromEntries() behavior

    Documentation:

    • Added detailed safety comments to all Object.fromEntries() usage in CSVRecordAssembler
    • Documented why the implementation is safe from prototype pollution
    • Added references to regression tests for verification

    Conclusion:
    The AI security report suggesting a prototype pollution vulnerability was a false positive. Object.fromEntries() creates own properties (not prototype properties), making it inherently safe from prototype pollution attacks. This changeset provides regression tests to prevent future concerns and documents the safety guarantees.
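
    A short, self-contained demonstration of the property the tests assert (this is generic JavaScript behavior, not the library's internal code): Object.fromEntries() defines each key as an own data property, so dangerous header names neither change an object's prototype nor touch Object.prototype.

    // CSV headers with dangerous names become plain own properties of each record.
    const record = Object.fromEntries([
      ["__proto__", "evil"],
      ["constructor", "evil"],
      ["name", "Alice"],
    ]);

    console.log(Object.getOwnPropertyNames(record));
    // ["__proto__", "constructor", "name"] - own data properties on the record itself

    console.log(Object.getPrototypeOf(record) === Object.prototype);
    // true - the "__proto__" entry did not change the record's prototype

    console.log(({}).constructor === Object);
    // true - fresh objects are unaffected; nothing leaked onto Object.prototype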

  • #608 24f04d7 Thanks @kamiazya! - Improve Rust/WASM development environment and add comprehensive tests

    Internal Improvements

    • Migrated from Homebrew Rust to rustup for better toolchain management
    • Updated Rust dependencies to latest versions (csv 1.4, wasm-bindgen 0.2.105, serde 1.0.228)
    • Added 10 comprehensive unit tests for CSV parsing functionality
    • Added Criterion-based benchmarks for performance tracking
    • Improved error handling in WASM bindings
    • Configured rust-analyzer and development tools (rustfmt, clippy)
    • Added pkg/ directory to .gitignore (build artifacts should not be tracked)
    • Added Rust tests to CI pipeline (GitHub Actions Dynamic Tests workflow)
    • Integrated Rust coverage with Codecov (reported separately from TypeScript coverage under a rust flag)
    • Integrated Rust benchmarks with CodSpeed for performance regression detection

    These changes improve code quality and maintainability without affecting the public API or functionality.

  • #608 24f04d7 Thanks @kamiazya! - chore: upgrade Biome to 2.3.4 and update configuration

    Upgraded development dependency @biomejs/biome from 1.9.4 to 2.3.4 and updated configuration for compatibility with Biome v2. This change has no impact on the runtime behavior or public API.

  • #608 24f04d7 Thanks @kamiazya! - chore: upgrade TypeScript to 5.9.3 and typedoc to 0.28.14 with enhanced documentation

    Developer Experience Improvements:

    • Upgraded TypeScript from 5.8.3 to 5.9.3
    • Upgraded typedoc from 0.28.5 to 0.28.14
    • Enabled strict type checking options (noUncheckedIndexedAccess, exactOptionalPropertyTypes)
    • Enhanced TypeDoc configuration with version display, improved sorting, and navigation
    • Integrated all documentation markdown files with TypeDoc using native projectDocuments support
    • Added YAML frontmatter to all documentation files for better organization

    Type Safety Enhancements:

    • Added explicit | undefined to all optional properties for stricter type checking (see the sketch after this list)
    • Added proper undefined checks for array/object indexed access
    • Improved TextDecoderOptions usage to avoid explicit undefined values
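
    A short illustration (not code from the library) of what the two new compiler options require, which is the kind of change applied across the codebase; the DecodeOptions interface here is hypothetical:

    // With noUncheckedIndexedAccess, indexed access includes undefined in its type,
    // so a guard is needed before use:
    const headers: string[] = ["name", "age"];
    const first = headers[0]; // type: string | undefined under this option
    if (first !== undefined) {
      console.log(first.toUpperCase());
    }

    // With exactOptionalPropertyTypes, an optional property accepts an explicit undefined
    // only when undefined is spelled out in its type:
    interface DecodeOptions {
      fatal?: boolean | undefined; // the "| undefined" permits passing undefined explicitly
    }
    const opts: DecodeOptions = { fatal: undefined };
    console.log(opts);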

    Documentation Improvements:

    • Enhanced TypeDoc navigation with categories, groups, and folders
    • Added sidebar and navigation links to GitHub and npm
    • Organized documentation into Tutorials, How-to Guides, Explanation, and Reference sections
    • Improved documentation discoverability with YAML frontmatter grouping

    Breaking Changes: None - all changes are backward compatible

  • #608 24f04d7 Thanks @kamiazya! - feat(wasm): add input size validation and source option for error reporting

    This patch enhances the WASM CSV parser with security improvements and better error reporting capabilities.

    Security Enhancements:

    • Input Size Validation: Added validation to prevent memory exhaustion attacks
      • Validates the CSV input size against the maxBufferSize parameter before processing
      • Returns a clear error message when the size limit is exceeded
      • Default limit: 10 MB (configurable via the TypeScript maxBufferSize option)
      • Addresses a potential DoS vulnerability from maliciously large CSV inputs

    Error Reporting Improvements:

    • Source Option: Added optional source parameter for better error context
      • Allows specifying a source identifier (e.g., filename) in error messages
      • Error format: Error message in "filename"
      • Significantly improves debugging when processing multiple CSV files
      • Aligns with TypeScript implementation's CommonOptions.source

    Performance Optimizations:

    • Optimized format_error() to take ownership of its String argument
      • Avoids an unnecessary allocation when source is None
      • Improves error-path performance by eliminating a to_string() call
      • Zero-cost abstraction in the common case (no source identifier)

    Code Quality Improvements:

    • Used bool::then_some() for more idiomatic Option handling
    • Fixed Clippy needless_borrow warnings in tests
    • Applied cargo fmt formatting for consistency

    Implementation Details:

    Rust (web-csv-toolbox-wasm/src/lib.rs):

    • Added format_error() helper function for consistent error formatting
    • Updated parse_csv_to_json() to accept max_buffer_size and source parameters
    • Implemented input size validation at parse entry point
    • Applied source context to all error types (headers, records, JSON serialization)

    TypeScript (src/parseStringToArraySyncWASM.ts):

    • Updated to pass maxBufferSize from the parse options to the WASM function
    • Updated to pass source from the parse options to the WASM function (a usage sketch follows)
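
    A hedged usage sketch of how the two new options are passed from TypeScript; the option names come from this changeset, but treat the surrounding wiring (importing parseStringToArraySyncWASM and awaiting loadWASM() first) as an illustration rather than a verbatim recipe:

    import { loadWASM, parseStringToArraySyncWASM } from "web-csv-toolbox";

    await loadWASM(); // the WASM module must be initialized before the sync parser runs

    const csvText = "name,age\nAlice,30";
    const records = parseStringToArraySyncWASM(csvText, {
      maxBufferSize: 10 * 1024 * 1024, // reject inputs larger than 10 MB (the default limit)
      source: "users.csv", // reported in errors as: ... in "users.csv"
    });
    console.log(records);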

    Breaking Changes: None - this is a backward-compatible enhancement with sensible defaults.

    Migration: No action required. Existing code continues to work without modification.