Skip to content

Conversation

@NathanFlurry
Copy link
Member

No description provided.

@vercel
Copy link

vercel bot commented Nov 13, 2025

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Preview Comments Updated (UTC)
rivetkit-serverless Error Error Nov 13, 2025 10:50pm
3 Skipped Deployments
Project Deployment Preview Comments Updated (UTC)
rivet-cloud Ignored Ignored Nov 13, 2025 10:50pm
rivet-inspector Ignored Ignored Preview Nov 13, 2025 10:50pm
rivet-site Ignored Ignored Preview Nov 13, 2025 10:50pm

Copy link
Member Author

NathanFlurry commented Nov 13, 2025


How to use the Graphite Merge Queue

Add the label merge-queue to this PR to add it to the merge queue.

You must have a Graphite account in order to use the merge queue. Sign up using this link.

An organization admin has enabled the Graphite Merge Queue in this repository.

Please do not merge from GitHub as this will restart CI on PRs being processed by the merge queue.

This stack of pull requests is managed by Graphite. Learn more about stacking.

@claude
Copy link

claude bot commented Nov 13, 2025

PR Review: Hibernating WebSocket Protocol Implementation

Overview

This PR implements a new hibernating WebSocket protocol in RivetKit. The changes refactor WebSocket lifecycle management and message acknowledgment handling. Overall the code quality is good with clear separation of concerns, but there are several issues to address before merging.

Critical Issues

1. Incomplete WebSocket Adapter Constructor Call (tunnel.ts:139-193)

The #createWebSocket method has a critical bug. The constructor signature expects 9 parameters but only receives 6:

// tunnel.ts:148-186
const adapter = new WebSocketTunnelAdapter(
    requestIdStr,
    hibernatable,
    request,
    (data: ArrayBuffer | string, isBinary: boolean) => { /* ... */ },
    (code?: number, reason?: string, retry: boolean = false) => { /* ... */ },
);

But the constructor signature (websocket-tunnel-adapter.ts:49-73) requires:

constructor(
    tunnel: Tunnel,          // MISSING
    actorId: string,         // MISSING  
    requestId: string,       // PROVIDED
    hibernatable: boolean,   // PROVIDED
    request: Request,        // PROVIDED
    messageIndex: number,    // MISSING
    persistHibernatableWebSocketMessageIndex: (messageIndex: number) => void, // MISSING
    sendCallback: (data: ArrayBuffer | string, isBinary: boolean) => void,    // PROVIDED
    closeCallback: (code?: number, reason?: string, retry?: boolean) => void, // PROVIDED
)

This will cause a runtime error. The parameters are misaligned - tunnel, actorId, messageIndex, and the persist callback are missing.

2. Loop Variable Declaration Error (tunnel.ts:116)

for (const i = 0; i < requestIds.length; i++) {

Should be:

for (let i = 0; i < requestIds.length; i++) {

const cannot be used with reassignable loop counters.

3. Inconsistent Property Access (websocket-tunnel-adapter.ts:97)

The adapter references ws.canHibernate in tunnel.ts:97 (old code path) but the new implementation changed this to ws.hibernatable. Ensure all references are updated consistently.

4. Missing Request Parameter (tunnel.ts:120-126)

When calling #createWebSocket during restoration:

this.#createWebSocket(
    actorId,
    requestId,
    idToStr(requestId),
    true,
    messageIndex,
);  // Missing 'request' parameter

The method signature requires a Request parameter but it's not provided here. You'll need to reconstruct the request from stored metadata or adjust the restoration flow.

High Priority Issues

5. Breaking API Change Without Migration Path (mod.ts:60-66)

The websocket handler changed from async returning Promise<void> to synchronous returning void:

// Before
websocket?: (runner, actorId, ws, requestId, request) => Promise<void>;

// After  
websocket: (runner, actorId, ws, requestId, request) => void;

This is a breaking change that will affect all existing consumers. Consider:

  • Adding deprecation warnings
  • Providing a migration guide
  • Supporting both signatures temporarily with adapter logic

6. Unused Symbols Imported (tunnel.ts:12-16)

import {
    SET_CAN_HIBERNATE_SYMBOL,
    SET_REQUEST_SYMBOL,
    WebSocketTunnelAdapter,
} from "./websocket-tunnel-adapter";

SET_CAN_HIBERNATE_SYMBOL and SET_REQUEST_SYMBOL are imported but never used in the file. Either use them or remove the imports.

7. Missing Validation (tunnel.ts:104-134)

The restoreHibernatingRequests method should validate:

  • That actorId exists and is valid
  • That requestIds array is not empty
  • That message indices are within valid range (0-65535 per tunnel.ts:771)

8. Incomplete TODO Comments (tunnel.ts:192, websocket-tunnel-adapter.ts:517)

Several critical TODOs are left in the code:

  • // TODO: assign request to ws so we can pass this through (tunnel.ts:192)
  • // TODO: empty comment (websocket-tunnel-adapter.ts:517 in dispatchEvent)

These should either be implemented or documented with follow-up issue numbers.

Medium Priority Issues

9. Error Handling in Message Sequence Logic (websocket-tunnel-adapter.ts:103-157)

The message sequencing logic is well-designed, but consider edge cases:

  • What happens if messageIndex wraps around after 65535?
  • Should there be a maximum gap threshold before closing?
  • The gap calculation on line 150 could overflow for extreme cases

10. Race Condition Potential (tunnel.ts:662-674)

There's a check for duplicate WebSockets, but in a concurrent scenario:

const existingAdapter = this.#actorWebSockets.get(requestIdStr);
if (existingAdapter) {
    existingAdapter.__closeWithoutCallback(1000, "ws.duplicate_open");
}

If two open messages arrive simultaneously, both could pass the check before either is added to the map. Consider using a lock or atomic operation.

11. Memory Leak Risk (tunnel.ts:97)

The check if (!ws.canHibernate) means hibernatable WebSockets are NOT closed on shutdown:

if (!ws.canHibernate) {
    ws.__closeWithoutCallback(1000, "ws.tunnel_shutdown");
}

But they're immediately cleared from the map on line 101. This means the WebSocket objects remain in memory without being properly cleaned up. Should hibernatable WebSockets be persisted to storage here?

Minor Issues / Code Quality

12. Logging Inconsistencies

  • Some log messages use lowercase (tunnel.ts:129: "restored hibernatable requests")
  • Others use mixed case (per CLAUDE.md convention)
  • Consider standardizing: log messages should be lowercase unless mentioning code symbols

13. Magic Numbers (websocket-tunnel-adapter.ts:154)

this.close(1008, closeReason);

Consider defining WebSocket close codes as named constants:

const WS_CLOSE_POLICY_VIOLATION = 1008;

14. Type Safety (tunnel.ts:145-146)

Parameters hibernatable and messageIndex are passed positionally. With 6+ parameters, consider using an options object:

#createWebSocket(options: {
    actorId: string;
    requestId: RequestId;
    requestIdStr: string;
    hibernatable: boolean;
    messageIndex: number;
    request: Request;
}): WebSocketTunnelAdapter

15. Commented Out Code (actor-driver.ts:81-87)

Large blocks of commented code should be removed:

// // WebSocket message acknowledgment debouncing for hibernatable websockets
// #hibernatableWebSocketAckQueue: Map<...> = new Map();
// #wsAckFlushInterval?: NodeJS.Timeout;

If this code might be needed, reference the commit SHA in a comment instead.

16. OpenAPI Changes Not Reviewed (rivetkit-openapi/openapi.json)

The PR removes 277 lines from the OpenAPI spec. Ensure:

  • These are intentional removals, not accidental
  • Any deprecated endpoints are documented
  • Client SDKs are updated accordingly

Performance Considerations

17. UUID String Conversion (tunnel.ts:807-809)

idToStr() is called frequently and performs string conversion. Consider:

  • Caching string representations where possible
  • Using the ArrayBuffer directly as map keys (with custom comparison)
  • Profiling to see if this is a bottleneck

18. Missing Index Validation Before Persistence (websocket-tunnel-adapter.ts:160-163)

Before persisting the message index, validate it's in the acceptable range:

if (messageIndex < 0 || messageIndex > 65535) {
    throw new Error(`Invalid message index: ${messageIndex}`);
}
this.messageIndex = messageIndex;

Testing Recommendations

  1. Add test for WebSocket restoration flow - Verify that restoreHibernatingRequests correctly recreates WebSocket state
  2. Test message sequence validation - Cover duplicate, skipped, and out-of-order messages
  3. Test concurrent open messages - Verify race condition handling
  4. Test error paths - Ensure errors in config.websocket handler properly close connections
  5. Load testing - Verify performance with many hibernating WebSockets

Security Considerations

19. Message Index Validation

The message index sequence validation is good, but ensure:

  • Indices cannot be manipulated to bypass sequence checks
  • The range validation at tunnel.ts:771-772 is consistently applied
  • Consider rate limiting if sequence violations occur repeatedly

Documentation

20. Missing JSDoc

Key methods lack documentation:

  • restoreHibernatingRequests - Should explain the restoration contract
  • #createWebSocket - Should document when each parameter is used
  • persistHibernatableWebSocketMessageIndex callback - Needs contract documentation

Positive Aspects

✅ Good separation of concerns between tunnel and adapter
✅ Clear lifecycle management with explicit state transitions
✅ Comprehensive logging for debugging
✅ Proper error handling in most paths
✅ Message sequencing logic is thorough and well-commented
✅ Cleanup logic on shutdown is mostly correct

Recommendations

Before merging:

  1. MUST FIX: Constructor parameter mismatch (Issue [SVC-2555] Set up issue templates #1)
  2. MUST FIX: Loop variable declaration (Issue [SVC-2479] Send cluster events to PostHog #2)
  3. MUST FIX: Missing request parameter in restoration (Issue [SVC-2483] Remove hardcoded uses of rivet.gg #4)
  4. 🔴 SHOULD FIX: Breaking API changes need migration path (Issue [SVC-2358] Enable Redis memory overcommit #5)
  5. 🔴 SHOULD FIX: Complete or document TODOs (Issue Remove fallible uses of as in api services #8)

After merging:

  • Create follow-up issues for hibernatable WebSocket persistence strategy
  • Add comprehensive integration tests
  • Update documentation with new protocol details

Summary

The refactoring improves the WebSocket lifecycle management significantly, but critical bugs prevent this from being merged as-is. The constructor parameter mismatch (Issue #1) will cause immediate runtime failures. After addressing the critical issues, this will be a solid improvement to the codebase.

@NathanFlurry NathanFlurry force-pushed the 11-11-fix_guard_dont_send_open_msg_for_hibernating_ws_hibernating_ws_keepalive branch from f45c96f to 769c3f8 Compare November 13, 2025 21:39
@NathanFlurry NathanFlurry force-pushed the 11-13-chore_rivetkit_implement_new_hibernating_ws_protocol branch from dbe19ca to 426a752 Compare November 13, 2025 21:39
@claude
Copy link

claude bot commented Nov 13, 2025

Code Review: Hibernating WebSocket Protocol Implementation

This PR implements a new hibernating WebSocket protocol that significantly refactors how WebSocket connections are managed, particularly around hibernation and message acknowledgment. Here's my comprehensive review:

🎯 Overall Assessment

Strengths:

  • Clear separation of concerns with improved architecture
  • Better handling of hibernatable WebSocket lifecycle
  • More robust message index tracking and validation
  • Removal of event buffering complexity

Areas of Concern:

  • Several incomplete TODOs and commented code
  • Missing required callback parameters
  • Potential compilation/runtime errors in TypeScript
  • Some logic gaps in the new implementation

🐛 Critical Issues

1. Compilation Error in tunnel.ts:116-126

for (const i = 0; i < requestIds.length; i++) {

This should be let i not const i. This will cause a compilation error.

Location: engine/sdks/typescript/runner/src/tunnel.ts:116

2. Missing Parameters in #createWebSocket

The #createWebSocket method is called on line 120-126 with 5 arguments, but the function signature requires 6 parameters (including request: Request).

// Line 120-126 - Missing 'request' parameter
this.#createWebSocket(
    actorId,
    requestId,
    idToStr(requestId),
    true,
    messageIndex,
);

Location: engine/sdks/typescript/runner/src/tunnel.ts:120-126
Fix: Either provide a placeholder Request object or make the request parameter optional in the function signature.

3. API Signature Changes Without Migration

The RunnerConfig.websocket callback changed from async returning Promise<void> to synchronous returning void. This is a breaking change that may affect existing implementations.

Old: websocket?: (runner, actorId, ws, requestId, request) => Promise<void>
New: websocket: (runner, actorId, ws, requestId, request) => void

Location: engine/sdks/typescript/runner/src/mod.ts:60-66
Impact: Any existing code awaiting this callback will break.

4. Incomplete WebSocketTunnelAdapter Constructor

The constructor signature in websocket-tunnel-adapter.ts appears to be missing required parameters based on usage. The constructor expects parameters like tunnel, actorId, requestId, etc., but #createWebSocket in tunnel.ts doesn't provide all of them.

Location: engine/sdks/typescript/runner/src/websocket-tunnel-adapter.ts:49-79

5. Deleted Assert Function Not Replaced

In runner-protocol/src/index.ts, the local assert function was removed in favor of importing from node:assert, but this may not work in all environments (browsers, Deno without Node compatibility).

Location: engine/sdks/typescript/runner-protocol/src/index.ts:1


⚠️ Major Concerns

1. Missing Implementation: restoreHibernativeRequests

The new required callback restoreHibernativeRequests in RunnerConfig is not implemented in the diff. This will cause runtime errors.

restoreHibernativeRequests(
    actorId: string,
    requestId: protocol.RequestId[],
): number[];

Location: engine/sdks/typescript/runner/src/mod.ts:79-82
Impact: This is a required callback but no implementation is shown in the actor-driver.ts changes.

2. Inconsistent API: getActorHibernationConfig → canWebSocketHibernate

The old getActorHibernationConfig returned a HibernationConfig object with both enabled and lastMsgIndex. The new canWebSocketHibernate only returns a boolean. However, the actor-driver.ts file still references the old API (lines 138-296 in the diff).

This indicates incomplete refactoring.

Location: Multiple files

3. Commented Out Code

The actor-driver.ts has large blocks of commented code that should be removed:

  • Lines 81-86: Commented hibernatable WebSocket ack queue
  • Lines 316-319: References to deleted #wsAckFlushInterval
  • Lines 338-349: Implementation of #flushHibernatableWebSocketAcks() that references deleted properties
  • Lines 645-672: Commented WebSocket message index update logic

Recommendation: Either complete the implementation or remove the dead code.


📝 Code Quality Issues

1. Type Safety: Unused Symbols

import {
    SET_CAN_HIBERNATE_SYMBOL,
    SET_REQUEST_SYMBOL,
    WebSocketTunnelAdapter,
} from "./websocket-tunnel-adapter";

These symbols are imported but never used in the file.

Location: engine/sdks/typescript/runner/src/tunnel.ts:13-15

2. Incomplete Error Handling

In tunnel.ts:538, there's a typo in the error log:

this.log?.debug({ msg: "catught runner shutdown error" });

Should be "caught" not "catught".

Location: engine/sdks/typescript/runner/src/tunnel.ts:538

3. Missing Validation in Message Index Tracking

The _handleMessage method validates message indices but always calls #persistHibernatableWebSocketMessageIndex even for non-hibernatable WebSockets when the condition might be wrong:

if (this.hibernatable) {
    this.#persistHibernatableWebSocketMessageIndex(messageIndex);
}

This is correct, but the callback is never defined for non-hibernatable WebSockets, which could lead to runtime errors.

Location: engine/sdks/typescript/runner/src/websocket-tunnel-adapter.ts:161-163

4. Inconsistent Property Access

The #log getter in WebSocketTunnelAdapter accesses this.#tunnel.log but #tunnel is only stored after construction:

constructor(...) {
    this.#tunnel = tunnel;  // Set here
    ...
}

get #log(): Logger | undefined {
    return this.#tunnel.log;  // Used here - could be undefined during construction
}

Location: engine/sdks/typescript/runner/src/websocket-tunnel-adapter.ts:74, 45-46


🔒 Security & Performance

1. Message Index Validation

Good: The code properly validates message indices for hibernatable WebSockets and closes connections on out-of-sequence messages.

if (messageIndex !== previousIndex + 1) {
    this.close(1008, closeReason);
    return true;
}

Location: engine/sdks/typescript/runner/src/websocket-tunnel-adapter.ts:139-156

2. Memory Leak Prevention

The code properly cleans up event listeners and pending messages on shutdown. However, the garbage collection interval (60 seconds) might be too long for high-throughput scenarios.

Recommendation: Consider making GC_INTERVAL configurable.

Location: engine/sdks/typescript/runner/src/tunnel.ts:18


✅ Positive Changes

  1. Improved Architecture: The separation of WebSocket creation logic into #createWebSocket is cleaner.

  2. Better Logging: Structured logging with proper context (actorId, requestId, messageIndex) throughout.

  3. Duplicate Prevention: Handling of duplicate WebSocket open events is robust.

  4. Synchronous WebSocket Handler: Making the websocket callback synchronous (non-async) is actually better for event listener setup, preventing race conditions.

  5. Message Index Tracking: The new hibernatable WebSocket message tracking is more explicit and easier to reason about.


📋 Recommendations

High Priority

  1. Fix the compilation error (const → let in for loop)
  2. Implement restoreHibernativeRequests callback or document where it's implemented
  3. Fix #createWebSocket parameter mismatch
  4. Remove all commented/dead code or complete the implementation
  5. Reconcile the API changes between getActorHibernationConfig and canWebSocketHibernate

Medium Priority

  1. Fix typo: "catught" → "caught"
  2. Remove unused imports (SET_CAN_HIBERNATE_SYMBOL, SET_REQUEST_SYMBOL)
  3. Add migration guide for breaking changes to websocket callback
  4. Consider adding tests for message index validation logic

Low Priority

  1. Make GC_INTERVAL configurable
  2. Add JSDoc comments for the new hibernation protocol
  3. Consider extracting magic numbers to constants (e.g., WebSocket close codes)

🧪 Test Coverage

Missing Test Coverage:

  • Message index validation for hibernatable WebSockets
  • Restoration of hibernated WebSockets with message indices
  • Duplicate message handling
  • Out-of-sequence message handling
  • WebSocket close during hibernation

Recommendation: Add integration tests for the hibernation flow.


📊 Summary

Category Rating Notes
Architecture ⭐⭐⭐⭐ Good separation of concerns
Code Quality ⭐⭐⭐ Some dead code and TODOs
Type Safety ⭐⭐ Missing parameters, incomplete refactoring
Security ⭐⭐⭐⭐ Good validation of message indices
Performance ⭐⭐⭐⭐ Efficient message handling
Test Coverage ⭐⭐ Missing tests for critical paths

Overall: Needs Work ⚠️

This PR has good architectural improvements but requires fixing critical issues before merging:

  1. Compilation errors
  2. Missing required callback implementations
  3. Incomplete API refactoring
  4. Dead code cleanup

Once these issues are addressed, this will be a solid improvement to the WebSocket hibernation system.

@claude
Copy link

claude bot commented Nov 13, 2025

PR Review: chore(rivetkit): implement new hibernating ws protocol

Overall, this is a solid refactor that improves the hibernating WebSocket protocol implementation. The changes simplify the architecture and improve reliability. Below are my detailed findings:

✅ Strengths

  1. Improved Lifecycle Management: The refactor centralizes WebSocket creation in #createWebSocket(), making it easier to manage both new and restored hibernating WebSockets consistently.

  2. Better Separation of Concerns: Moving WebSocket lifecycle logic out of event buffering improves code clarity. The removal of buffered events simplifies the adapter significantly.

  3. Enhanced Message Ordering: The message index validation in websocket-tunnel-adapter.ts:123-157 properly handles duplicate and out-of-sequence messages with clear error handling and logging.

  4. Cleaner API:

    • websocket handler is now synchronous (returns void instead of Promise<void>), which better reflects its actual behavior
    • Renamed sendWebsocketMessageAcksendHibernatableWebSocketMessageAck for clarity
    • Changed getActorHibernationConfigcanWebSocketHibernate with simpler return type
  5. Consistent Structured Logging: Good use of structured logging throughout, following the project's conventions.

🐛 Potential Bugs

  1. Missing request parameter in #createWebSocket call (tunnel.ts:120-126)

    this.#createWebSocket(
        actorId,
        requestId,
        idToStr(requestId),
        true,
        messageIndex,
        // Missing request parameter!
    );

    The function signature at line 139 requires a request: Request parameter, but it's not being passed when restoring hibernating requests. This will cause a runtime error.

    Fix: You need to either:

    • Pass the request object (likely need to restore it from hibernation metadata)
    • Make the parameter optional with a default value
    • Restructure to have separate methods for creating new vs restored WebSockets
  2. Synchronous handler called but errors not caught (tunnel.ts:703-709)
    The config.websocket handler is now synchronous but called without try-catch inside the outer try-catch. If it throws synchronously, it's caught correctly, but this could be made more explicit.

  3. Index bounds check may be too restrictive (tunnel.ts:771-772)

    if (index < 0 || index > 65535)
        throw new Error("invalid websocket ack index");

    This assumes message indices fit in a u16 (0-65535), but long-lived WebSocket connections might exceed this. Consider if BigInt or larger range is needed, or document the limitation.

⚠️ Code Quality Issues

  1. Typo in log message (tunnel.ts:538)

    this.log?.debug({ msg: "catught runner shutdown error" });

    Should be "caught" not "catught".

  2. Incomplete TODO (tunnel.ts:192)

    // TODO: assign request to ws so we can pass this through

    This TODO at the end of #createWebSocket should be either completed or removed if no longer relevant.

  3. Unused imports (websocket-tunnel-adapter.ts:12-15)

    import {
        SET_CAN_HIBERNATE_SYMBOL,
        SET_REQUEST_SYMBOL,
        WebSocketTunnelAdapter,
    } from "./websocket-tunnel-adapter";

    The symbols SET_CAN_HIBERNATE_SYMBOL and SET_REQUEST_SYMBOL are imported in tunnel.ts but don't appear to be used anywhere. These should be removed if not needed.

  4. Incomplete dispatchEvent implementation (websocket-tunnel-adapter.ts:516-519)

    dispatchEvent(event: any): boolean {
        // TODO:
        return true;
    }

    Either implement this properly or add a comment explaining why it's acceptable to leave unimplemented.

  5. Constructor signature is complex (websocket-tunnel-adapter.ts:49-73)
    The WebSocket adapter constructor now takes 9 parameters, several marked @experimental. Consider:

    • Using an options object for better maintainability
    • Documenting what "experimental" means for these public properties
    • Clarifying the TODOs about making some properties internal-only

🔒 Security Considerations

  1. No validation on messageIndex persistence callback
    The persistHibernatableWebSocketMessageIndex callback is called with user-controlled index values. Ensure the implementation properly validates and persists this securely to prevent manipulation.

  2. Path validation in buildRequestForWebSocket (tunnel.ts:828-830)
    Good that you validate the path starts with /, but consider additional validation:

    • Path traversal attempts (../)
    • Maximum path length
    • Invalid characters

🚀 Performance Considerations

  1. Removed event buffering - This is actually a performance improvement. The old implementation buffered events which could lead to memory issues with long-lived connections or slow consumers. The new approach is more direct and efficient.

  2. Message index bounds - The u16 limit on message indices (if that's the intent) means WebSockets will need to close/reopen after 65,536 messages. For high-throughput applications, this could cause connection churn. Document this limitation if it's intentional.

📝 Documentation

  1. API changes need documentation: The breaking changes to RunnerConfig (especially websocket, restoreHibernativeRequests, canWebSocketHibernate) should be documented in migration notes or changelog.

  2. HibernationConfig interface change: The structure changed from { enabled: boolean; lastMsgIndex: number | undefined } to just the config being optional. This is cleaner but needs documentation for users.

🧪 Test Coverage

The PR doesn't include test changes. Consider adding tests for:

  • Message index validation (duplicate, skipped, out-of-order)
  • WebSocket restoration from hibernation
  • Error handling paths in #createWebSocket
  • The new synchronous websocket handler behavior

Recommendations

High Priority:

  1. 🔴 Fix the missing request parameter in restoreHibernatingRequests - this will cause runtime errors
  2. 🟡 Fix typo: "catught" → "caught"
  3. 🟡 Remove unused imports or use them

Medium Priority:
4. Document the message index bounds and any limitations
5. Consider using an options object for WebSocketTunnelAdapter constructor
6. Add migration notes for API changes
7. Implement or document the incomplete dispatchEvent

Low Priority:
8. Consider path validation improvements
9. Add test coverage for new hibernation logic
10. Review and complete/remove TODOs

Great work on simplifying the architecture! Just needs the critical bug fix before merging.

@NathanFlurry NathanFlurry force-pushed the 11-11-fix_guard_dont_send_open_msg_for_hibernating_ws_hibernating_ws_keepalive branch from 769c3f8 to fc8f1fc Compare November 13, 2025 22:27
@NathanFlurry NathanFlurry force-pushed the 11-13-chore_rivetkit_implement_new_hibernating_ws_protocol branch from 426a752 to 04be372 Compare November 13, 2025 22:27
@NathanFlurry NathanFlurry force-pushed the 11-13-chore_rivetkit_implement_new_hibernating_ws_protocol branch from 04be372 to 9d6c4c8 Compare November 13, 2025 22:35
@NathanFlurry NathanFlurry force-pushed the 11-11-fix_guard_dont_send_open_msg_for_hibernating_ws_hibernating_ws_keepalive branch from fc8f1fc to e5ca471 Compare November 13, 2025 22:35
@claude
Copy link

claude bot commented Nov 13, 2025

PR Review: Hibernating WebSocket Protocol Implementation

Critical Issues

1. Import Error in websocket-tunnel-adapter.ts (Line 7)

Severity: High - Compilation Error

import { wrappingLte } from "./utils";

This import is incorrect. The function wrappingLte does not exist in utils.ts. The correct function name is wrappingLteU16.

Line 124 uses this non-existent function:

if (wrappingLte(messageIndex, previousIndex)) {

Fix: Change the import to:

import { wrappingLteU16 } from "./utils";

And update line 124 to:

if (wrappingLteU16(messageIndex, previousIndex)) {

2. Missing request Parameter in tunnel.ts (Line 125-126)

Severity: High - Runtime Error

In tunnel.ts:125-126, #createWebSocket is called without the required request parameter:

this.#createWebSocket(
    actorId,
    requestId,
    idToStr(requestId),
    true,
    messageIndex,
);

But the function signature at line 139-146 requires a request parameter:

#createWebSocket(
    actorId: string,
    requestId: RequestId,
    requestIdStr: string,
    hibernatable: boolean,
    messageIndex: number,
    request: Request,  // <-- Missing in the call above
): WebSocketTunnelAdapter

Impact: This will cause runtime errors when trying to restore hibernating WebSockets.

Fix: You need to either:

  1. Store the request metadata during hibernation and restore it, or
  2. Make the request parameter optional for restoration scenarios and build a dummy request

3. Incomplete Migration in actor-driver.ts

Severity: Medium - Dead Code

The code contains several issues related to the protocol migration:

  1. Lines 81-86: Commented-out hibernatable WebSocket acknowledgment queue declarations, but the queue is still used throughout the file (lines 339-349, 649, 656, 770)

  2. Lines 138-296: The entire getActorHibernationConfig callback is defined but this API has been replaced with canWebSocketHibernate and restoreHibernativeRequests in the new protocol. This is dead code.

  3. Lines 316-319, 339-349, 691, 840: References to #wsAckFlushInterval and #flushHibernatableWebSocketAcks() but these properties are not declared in the class (they're in commented code).

  4. Line 345: Calls this.#runner.sendWebsocketMessageAck(requestId, index) but the runner API has been renamed to sendHibernatableWebSocketMessageAck (as seen in mod.ts:1444-1447).

Fix: Either:

  • Remove all hibernatable WebSocket code from actor-driver.ts if it's handled elsewhere, or
  • Properly implement the new protocol with restoreHibernativeRequests and canWebSocketHibernate callbacks

Code Quality Issues

4. Loop Variable Declaration Issue in tunnel.ts (Line 116)

Severity: Low - Style

for (const i = 0; i < requestIds.length; i++) {

This should use let instead of const since i is being reassigned:

for (let i = 0; i < requestIds.length; i++) {

5. Missing Implementation in actor-driver.ts

Severity: High - Incomplete Implementation

The new RunnerConfig interface requires two new callbacks:

  • restoreHibernativeRequests (line 23-26 in mod.ts)
  • canWebSocketHibernate (line 28-32 in mod.ts)

These are NOT implemented in the EngineActorDriver constructor (lines 113-297). The engine runner config only defines the old getActorHibernationConfig callback.

Fix: Remove the old callback and implement the new ones:

canWebSocketHibernate: (actorId: string, requestId: ArrayBuffer, request: Request): boolean => {
    // Implementation
},
restoreHibernativeRequests: (actorId: string, requestIds: protocol.RequestId[]): number[] => {
    // Implementation
},

6. Inconsistent Error Handling

Severity: Medium

In websocket-tunnel-adapter.ts:141-156, when a message index is out of sequence, the code:

  1. Logs a warning
  2. Closes the WebSocket
  3. But then still returns true (indicating the message was handled)

This seems inconsistent. If the connection is being closed due to an invalid state, should the return value indicate an error?


Performance Considerations

7. U16 Wrapping Functions

The new wrapping comparison functions in utils.ts (lines 74-109) look correct and have comprehensive test coverage. Good work on the test suite!

8. Event Handler Timing

The refactored WebSocket handling in tunnel.ts:700-705 now calls config.websocket() before sending the open confirmation. This is better for error handling but note that any exceptions in the user's websocket handler will prevent the connection from opening.


Test Coverage

The addition of tests/utils.test.ts with comprehensive tests for the wrapping comparison functions is excellent. The tests cover edge cases well including:

  • Boundary conditions at u16 max
  • Wrapping behavior
  • Transitivity across wrap boundaries

Summary

This PR cannot be merged in its current state due to critical compilation and runtime errors:

  1. Fix the import error (wrappingLtewrappingLteU16)
  2. Fix the missing request parameter in restoreHibernatingRequests
  3. Complete the migration in actor-driver.ts by implementing the new callbacks and removing old code
  4. Fix the loop variable declaration

The conceptual changes look good - moving to a cleaner hibernation protocol with explicit restoration. Once the implementation is complete, this should be a solid improvement.

@claude
Copy link

claude bot commented Nov 13, 2025

Code Review: Hibernating WebSocket Protocol Implementation

Summary

This PR implements a new hibernating WebSocket protocol for RivetKit. The changes refactor how WebSocket connections are managed, adding support for hibernatable connections that can survive actor restarts by persisting and restoring connection state.

Positive Observations

Code Quality

  • Well-structured refactoring: The separation of concerns between Tunnel and WebSocketTunnelAdapter is clean
  • Excellent test coverage: The wrapping arithmetic utilities have comprehensive unit tests with edge cases
  • Good documentation: Comments explain complex logic like message index sequencing and hibernation behavior
  • Proper error handling: The code handles duplicate connections, out-of-sequence messages, and WebSocket lifecycle edge cases

Implementation Strengths

  • Defensive programming: Check for duplicate WebSocket opens (tunnel.ts:665-674)
  • Message ordering: Proper sequence validation with wrapping arithmetic for u16 indices
  • Lifecycle management: Clean separation between normal close and close-with-retry semantics

Issues & Recommendations

Critical Issues

1. Bug: Missing import in tunnel.ts

Location: tunnel.ts:7

import { wrappingLte } from "./utils";

Issue: The import references wrappingLte but utils.ts only exports wrappingLteU16. This will cause a runtime error.
Fix: Change to import { wrappingLteU16 as wrappingLte } from "./utils"; or use wrappingLteU16 directly

2. Bug: Unused imports in tunnel.ts ⚠️

Location: tunnel.ts:12-15

import {
	SET_CAN_HIBERNATE_SYMBOL,
	SET_REQUEST_SYMBOL,
	WebSocketTunnelAdapter,
} from "./websocket-tunnel-adapter";

Issue: SET_CAN_HIBERNATE_SYMBOL and SET_REQUEST_SYMBOL are imported but never used. These symbols don't appear to be exported from websocket-tunnel-adapter.ts
Fix: Remove unused imports

3. Bug: Missing request parameter in restoreHibernatingRequests

Location: tunnel.ts:120-126

this.#createWebSocket(
	actorId,
	requestId,
	idToStr(requestId),
	true,
	messageIndex,
);

Issue: #createWebSocket requires 6 parameters including request: Request (line 145), but only 5 are provided. This will cause a TypeScript error.
Fix: Need to restore the request metadata (path, headers) from hibernation state and pass a reconstructed Request object

4. Bug: Constructor signature mismatch

Location: tunnel.ts:148-187 calls WebSocketTunnelAdapter constructor
The constructor call doesn't match the signature in websocket-tunnel-adapter.ts:50-74

  • Constructor expects: tunnel, actorId, requestId, hibernatable, request, messageIndex, persistCallback, sendCallback, closeCallback
  • Constructor is called with: requestIdStr, hibernatable, request, sendCallback, closeCallback

Fix: Update the constructor call to match the new signature

Moderate Issues

5. Missing persistence callback implementation ⚠️

Location: tunnel.ts:148-187
Issue: The #createWebSocket method doesn't implement the persistHibernatableWebSocketMessageIndex callback that the WebSocketTunnelAdapter requires. This callback is critical for hibernation to work.
Fix: Add a callback that persists the message index via the runner config

6. Incomplete HibernatingWebSocketMetadata usage ⚠️

Location: tunnel.ts:29-33

export interface HibernatingWebSocketMetadata {
	path: string;
	headers: Record<string, string>;
	messageIndex: number;
}

Issue: This interface is defined but never used in the code. It should be used to restore WebSocket state in restoreHibernatingRequests.

7. API breaking change without migration path ⚠️

Location: mod.ts:60-66

websocket: (
	runner: Runner,
	actorId: string,
	ws: any,
	requestId: protocol.RequestId,
	request: Request,
) => void;  // Changed from Promise<void>

Issue: The websocket handler changed from async to sync without backward compatibility. This will break existing implementations.
Recommendation: Document this breaking change or provide a migration guide

8. Typo in method name ⚠️

Location: mod.ts:79

restoreHibernativeRequests(

Issue: Should be restoreHibernatingRequests (not "Hibernative")
Fix: Rename to use consistent terminology

Minor Issues

9. Inconsistent error message 📝

Location: tunnel.ts:538

this.log?.debug({ msg: "catught runner shutdown error" });

Issue: Typo "catught" should be "caught"

10. TODO comment needs resolution 📝

Location: tunnel.ts:192

// TODO: assign request to ws so we can pass this through

Issue: This TODO suggests incomplete implementation
Recommendation: Either implement or create a tracking issue

11. Missing validation 📝

Location: websocket-tunnel-adapter.ts:104-204
Issue: The _handleMessage method doesn't validate that messageIndex is within u16 range (0-65535)
Fix: Add validation: if (messageIndex < 0 || messageIndex > 65535) throw new Error(...)

12. Wrapping arithmetic edge case 📝

Location: websocket-tunnel-adapter.ts:124

if (wrappingLte(messageIndex, previousIndex)) {

Issue: Uses undefined wrappingLte - should use wrappingLteU16
Impact: Runtime error if this code path is reached

Performance Considerations

13. Potential memory leak 🔍

Location: websocket-tunnel-adapter.ts:243-318
Issue: The #fireEvent method doesn't appear to have limits on #eventListeners Map growth
Recommendation: Consider adding cleanup logic for removed WebSockets

14. Efficient buffer operations

The SharedArrayBuffer handling in websocket-tunnel-adapter.ts:436-476 is well-implemented with proper slicing to avoid security issues.

Security Considerations

15. Path validation

Location: tunnel.ts:828-830

if (!path.startsWith("/")) {
	throw new Error("path must start with leading slash");
}

Good validation to prevent path traversal issues.

16. Index boundary validation

Location: tunnel.ts:771-772

if (index < 0 || index > 65535)
	throw new Error("invalid websocket ack index");

Proper u16 range validation prevents overflow attacks.

Test Coverage

17. Excellent wrapping arithmetic tests

Location: tests/utils.test.ts

  • Comprehensive edge case testing (boundaries, wrapping, transitivity)
  • Tests cover the critical u16 wrapping logic
  • Good use of property-based testing concepts

18. Missing integration tests ⚠️

Issue: No tests for the actual hibernation flow (persist → restore → message delivery)
Recommendation: Add integration tests that:

  • Simulate actor restart with hibernating WebSocket
  • Verify message indices are restored correctly
  • Test duplicate message handling after restart

Architecture Concerns

19. Circular dependency risk 🔍

Location: tunnel.ts imports from websocket-tunnel-adapter.ts, which imports from tunnel.ts
Issue: WebSocketTunnelAdapter has #tunnel: Tunnel but Tunnel creates WebSocketTunnelAdapter
Impact: Potential initialization issues
Recommendation: Consider using dependency injection or extracting shared interfaces

20. Logging conventions 📝

Per CLAUDE.md, log messages should be lowercase unless mentioning code symbols. The code follows this correctly:

this.log?.info({ msg: "restored hibernatable requests", ... })

✅ Good adherence to conventions

Action Items

Must Fix Before Merge

  1. Fix wrappingLte import error (Issue [SVC-2555] Set up issue templates #1)
  2. Fix missing request parameter in restoreHibernatingRequests (Issue [SVC-2504] Fix 5 GB upload limit for local development from Cloudflare #3)
  3. Fix WebSocketTunnelAdapter constructor call (Issue [SVC-2483] Remove hardcoded uses of rivet.gg #4)
  4. Implement persistence callback (Issue [SVC-2358] Enable Redis memory overcommit #5)
  5. Fix typo in method name restoreHibernativeRequests (Issue Remove fallible uses of as in api services #8)

Should Fix

  1. Remove unused imports (Issue [SVC-2479] Send cluster events to PostHog #2)
  2. Implement or document HibernatingWebSocketMetadata usage (Issue Remove servers not in salt #6)
  3. Document breaking API change for websocket handler (Issue [SVC-2404] Improve internal documentation #7)
  4. Fix typo "catught" (Issue [SOC-15] Fix error when running user-delete #9)

Nice to Have

  1. Add integration tests for hibernation flow (Issue [SOC-21] Re-enable game-user-recommend #18)
  2. Resolve or track TODO comments (Issue [SVC-2479] Send cluster events to PostHog #10)
  3. Add message index validation in _handleMessage (Issue [SOC-15] Fix error when running user-delete #11)

Conclusion

This is a well-architected feature with strong test coverage for core utilities. However, there are several critical bugs that will prevent the code from compiling or running correctly. The main issues stem from incomplete refactoring - particularly around the WebSocketTunnelAdapter constructor signature changes and the missing request parameter restoration logic.

Once the critical issues are addressed, this will be a solid implementation of the hibernating WebSocket protocol. The defensive programming, error handling, and edge case coverage demonstrate high code quality.

Recommendation: Address critical issues before merge, particularly the constructor signature mismatches and missing request restoration logic.

@NathanFlurry NathanFlurry force-pushed the 11-13-chore_rivetkit_implement_new_hibernating_ws_protocol branch from 9d6c4c8 to 449ac28 Compare November 13, 2025 22:42
@NathanFlurry NathanFlurry force-pushed the 11-11-fix_guard_dont_send_open_msg_for_hibernating_ws_hibernating_ws_keepalive branch from e5ca471 to 5b5466a Compare November 13, 2025 22:42
@claude
Copy link

claude bot commented Nov 13, 2025

PR Review: Implement New Hibernating WebSocket Protocol

I've reviewed the changes and found several critical bugs that will prevent compilation, along with some recommendations for improvement.

Critical Bugs (Must Fix)

1. Import Name Mismatch in websocket-tunnel-adapter.ts

Location: engine/sdks/typescript/runner/src/websocket-tunnel-adapter.ts:7

import { wrappingLte } from "./utils";

Issue: The function is named wrappingLteU16 in utils.ts, not wrappingLte. This will cause a compilation error.

Fix: Change the import to:

import { wrappingLteU16 } from "./utils";

And update the usage on line 124:

if (wrappingLteU16(messageIndex, previousIndex)) {

2. Missing Request Parameter in tunnel.ts

Location: engine/sdks/typescript/runner/src/tunnel.ts:116-126

for (const i = 0; i < requestIds.length; i++) {
    const requestId = requestIds[i];
    const messageIndex = messageIndices[i];

    this.#createWebSocket(
        actorId,
        requestId,
        idToStr(requestId),
        true,
        messageIndex,
    );  // Missing 6th parameter: request
}

Issue: #createWebSocket requires 6 parameters (see line 139-145), but only 5 are provided. The request: Request parameter is missing.

Fix: You need to either:

  1. Store hibernating WebSocket metadata (path, headers) during hibernation and reconstruct the Request object when restoring, or
  2. Make the request parameter optional in #createWebSocket and handle the undefined case

3. Commented Out Field Declarations Still Being Used

Location: rivetkit-typescript/packages/rivetkit/src/drivers/engine/actor-driver.ts:81-86

// // WebSocket message acknowledgment debouncing for hibernatable websockets
// #hibernatableWebSocketAckQueue: Map<
// 	string,
// 	{ requestIdBuf: ArrayBuffer; messageIndex: number }
// > = new Map();
// #wsAckFlushInterval?: NodeJS.Timeout;

Issue: These fields are commented out but are being referenced throughout the file (lines 316, 339, 344, 348, 649, 653, 770, 834-836). This will cause compilation errors.

Fix: Uncomment these field declarations.

4. Invalid for Loop Declaration

Location: engine/sdks/typescript/runner/src/tunnel.ts:116

for (const i = 0; i < requestIds.length; i++) {

Issue: Using const in a traditional for loop is invalid - the loop variable must be mutable.

Fix: Change to:

for (let i = 0; i < requestIds.length; i++) {

5. Missing Implementation in actor-driver.ts

Location: rivetkit-typescript/packages/rivetkit/src/drivers/engine/actor-driver.ts

Issue: The PR changes RunnerConfig to require two new methods:

  • restoreHibernativeRequests(actorId: string, requestId: RequestId[]): number[]
  • canWebSocketHibernate(actorId: string, requestId: ArrayBuffer, request: Request): boolean

But the implementation in actor-driver.ts still uses the old API:

  • getActorHibernationConfig (lines 138-296)

Fix: The actor-driver.ts needs to be updated to implement the new API instead of the old one.

Code Quality Issues

6. Removed Return Statement

Location: engine/sdks/typescript/runner/src/tunnel.ts:193

// TODO: assign request to ws so we can pass this through

Issue: The #createWebSocket method is declared to return WebSocketTunnelAdapter but there's no return statement. This will cause compilation errors or undefined behavior.

Fix: Add a return statement:

// Store adapter
this.#actorWebSockets.set(requestIdStr, adapter);

return adapter;

7. WebSocket Constructor Signature Mismatch

Location: engine/sdks/typescript/runner/src/tunnel.ts:148-151

The old constructor signature appears to be:

new WebSocketTunnelAdapter(
    requestIdStr,
    hibernatable,
    request,
    // ... callbacks
)

But based on the new code in websocket-tunnel-adapter.ts (lines 50-74), the signature should be:

new WebSocketTunnelAdapter(
    tunnel,
    actorId,
    requestId,
    hibernatable,
    request,
    messageIndex,
    persistHibernatableWebSocketMessageIndex,
    sendCallback,
    closeCallback,
)

This needs to be verified and fixed.

Design Concerns

8. Message Index Persistence Strategy

Location: engine/sdks/typescript/runner/src/websocket-tunnel-adapter.ts:28

The #persistHibernatableWebSocketMessageIndex callback is defined but never called or passed to the constructor in the current implementation. This is a critical gap in the hibernation protocol - without persisting message indices, you can't properly restore WebSocket state.

9. Async to Sync Change Without Clear Justification

Location: engine/sdks/typescript/runner/src/mod.ts:60-66

websocket: (
    runner: Runner,
    actorId: string,
    ws: any,
    requestId: protocol.RequestId,
    request: Request,
) => void;  // Changed from Promise<void> to void

Concern: This breaking change from async to sync might cause issues if implementations need to perform async operations during WebSocket setup. Consider whether this is intentional and document the reasoning.

Positive Aspects

  1. Excellent test coverage for the wrapping arithmetic functions in utils.test.ts - comprehensive edge case testing including wrap boundaries and transitivity
  2. Good error handling for message index validation (lines 122-142 in websocket-tunnel-adapter.ts)
  3. Clear documentation in comments explaining the hibernation protocol
  4. Proper separation of concerns between Tunnel and WebSocketTunnelAdapter

Recommendations

  1. Run TypeScript compilation before submitting - these bugs would be caught immediately
  2. Consider adding integration tests for the hibernation/restoration flow
  3. Document the migration path from the old getActorHibernationConfig API to the new canWebSocketHibernate/restoreHibernativeRequests API
  4. Add JSDoc comments to the new public methods explaining the hibernation protocol

Summary

This PR implements an important feature but has several compilation-blocking bugs that need to be fixed before it can be merged. The core logic appears sound, but the implementation is incomplete in several areas.

Recommendation: Address the 5 critical bugs before merging. The code quality issues should also be fixed but are less critical.

@claude
Copy link

claude bot commented Nov 13, 2025

PR Review: chore(rivetkit): implement new hibernating ws protocol

Summary

This PR refactors the WebSocket tunnel implementation to support a new hibernating WebSocket (HWS) protocol. The changes affect the TypeScript runner SDK and RivetKit engine driver, implementing message sequencing, hibernation support, and improved lifecycle management.

Code Quality & Best Practices

✅ Strengths

  1. Clear separation of concerns: The refactoring properly separates WebSocket creation, message handling, and hibernation logic
  2. Better error handling: Improved handling of edge cases like duplicate opens, message index skipping, and out-of-sequence messages
  3. Type safety: Good use of TypeScript types and interfaces throughout
  4. Documentation: Added helpful JSDoc comments explaining complex behavior (e.g., restoreHibernativeRequests)

⚠️ Issues & Concerns

1. Typo in naming (Minor)

  • restoreHibernativeRequests should be restoreHibernatingRequests (line 78 in mod.ts)
  • Also appears in interface definition and implementation

2. Breaking API changes (Major)

The websocket handler signature changed from async to sync:

// Before
websocket?: (...) => Promise<void>;

// After  
websocket: (...) => void;

This is a breaking change that could cause issues for existing implementations. Consider:

  • Documenting this breaking change in release notes
  • Providing migration guidance for users
  • Or maintaining backward compatibility with a wrapper

3. Missing error handling in loop (Minor)

Line 116 in tunnel.ts:

for (const i = 0; i < requestIds.length; i++) {

This should use let instead of const for the loop variable.

4. Inconsistent function signature (Medium)

#createWebSocket is called with different numbers of parameters:

  • Line 118-125: Called with 5 params (during restore)
  • Line 691-698: Called with 6 params (during open)

The function definition expects 6 parameters, so the restore call is missing the request parameter. This would cause a runtime error.

Fix needed:

// Line 118-125 should build a request before calling createWebSocket
const request = buildRequestForWebSocket(???, ???); // Need path and headers
const adapter = this.#createWebSocket(
    actorId,
    requestId,
    requestIdStr,
    true,
    messageIndex,
    request, // Missing!
);

5. Removed validation (Medium)

The check for websocketHandler existence was removed from #handleWebSocketOpen. Now if config.websocket is undefined, it will throw at runtime instead of sending a graceful close message. Consider adding back the validation or documenting that websocket is now required.

6. Wrapping comparison function name mismatch (Critical)

Line 25 in websocket-tunnel-adapter.ts:

import { wrappingLte } from "./utils";

But utils.ts exports wrappingLteU16, not wrappingLte. This would cause an import error.

Fix needed:

import { wrappingLteU16 } from "./utils";
// And update usage on line 87:
if (wrappingLteU16(messageIndex, previousIndex)) {

7. Async function became sync (Medium)

#handleWebSocketMessage was changed from async to sync, but the message handling might still need to support async operations. Verify this won't cause issues with message processing.

Performance Considerations

✅ Good

  • Removed unnecessary async/await in message path (if intentional)
  • Better memory management with explicit cleanup of actor tracking

⚠️ Potential Issues

  • Message buffering removed: The previous event buffering system was removed entirely. This could cause events to be lost if listeners aren't attached immediately. Verify this is intentional and won't cause race conditions.

Security Concerns

✅ Good

  • Proper validation of message indices prevents replay attacks
  • Duplicate WebSocket detection prevents resource exhaustion
  • Close on out-of-sequence messages prevents state corruption

⚠️ Minor Concerns

  • Consider rate limiting on WebSocket open attempts per actor
  • The buildRequestForWebSocket creates a mock request - ensure this can't be exploited to bypass auth checks

Test Coverage

⚠️ Concerns

  • New test file added (utils.test.ts) for wrapping comparison functions - good!
  • No tests for the main changes: No new tests for hibernation logic, message sequencing, or the refactored WebSocket lifecycle
  • Missing edge case tests: Should test:
    • Message index wrapping at u16 boundary
    • Restoration with invalid message indices
    • Concurrent WebSocket open attempts
    • Hibernation/wake cycles

Additional Recommendations

  1. Add integration tests for the hibernating WebSocket protocol
  2. Document the protocol - add comments explaining the hibernation flow and message index semantics
  3. Error messages: Consider adding error codes to all close reasons for better debugging
  4. Metrics: Consider adding telemetry for hibernation events, message indices, and restore success/failure rates

Critical Fixes Required

  1. Fix import error: wrappingLtewrappingLteU16 (would break at runtime)
  2. Fix function call: Add missing request parameter to #createWebSocket in restoreHibernatingRequests
  3. Fix loop variable: Change const i to let i in for loop

Summary

This is a significant refactoring that improves the hibernating WebSocket implementation, but it has several critical issues that would prevent it from working correctly:

  • Blocking: Import error and missing function parameter
  • Breaking: API changes need documentation
  • Testing: Needs comprehensive test coverage

The architecture and approach look solid, but please address the critical issues before merging.


Review generated with Claude Code

@NathanFlurry NathanFlurry changed the base branch from 11-11-fix_guard_dont_send_open_msg_for_hibernating_ws_hibernating_ws_keepalive to graphite-base/3464 November 13, 2025 22:48
@NathanFlurry NathanFlurry force-pushed the 11-13-chore_rivetkit_implement_new_hibernating_ws_protocol branch from 449ac28 to 7fadc23 Compare November 13, 2025 22:49
@NathanFlurry NathanFlurry changed the base branch from graphite-base/3464 to main November 13, 2025 22:49
@claude
Copy link

claude bot commented Nov 13, 2025

Pull Request Review: Hibernating WebSocket Protocol Implementation

Overview

This PR implements a new hibernating WebSocket protocol for RivetKit, focusing on improving WebSocket state management and message sequencing for actors that can hibernate and wake up.

Code Quality & Best Practices

Strengths ✅

  1. Comprehensive test coverage - Excellent tests for wrapping arithmetic in utils.test.ts covering edge cases, boundary conditions, and transitivity
  2. Clear separation of concerns - The refactoring properly separates WebSocket lifecycle management from message handling
  3. Good documentation - Well-commented code explaining hibernation logic and edge cases
  4. Error handling - Proper handling of out-of-sequence messages and duplicate connections

Issues & Concerns ⚠️

Critical: Missing Import in websocket-tunnel-adapter.ts

import { wrappingLte } from "./utils";

The code imports wrappingLte but utils.ts only exports wrappingLteU16. This will cause a runtime error. Should be:

import { wrappingLteU16 as wrappingLte } from "./utils";
// or update the usage to call wrappingLteU16 directly

Bug: Incorrect Loop Syntax in tunnel.ts:116

for (const i = 0; i < requestIds.length; i++) {

Should be:

for (let i = 0; i < requestIds.length; i++) {

Using const in a for-loop with reassignment will fail.

Type Safety Issue in websocket-tunnel-adapter.ts:64

public messageIndex: number,

The messageIndex is public and mutable, but it's critical state for hibernation. Consider making it readonly and providing a controlled setter method to prevent external mutation.

Missing Implementation in actor-driver.ts

Lines 82-86 show commented-out code for #hibernatableWebSocketAckQueue, but the code at lines 316-319, 338-348, and elsewhere references this.#hibernatableWebSocketAckQueue which is not defined. This will cause runtime errors.

Lines 138-296 reference getActorHibernationConfig but the new API uses canWebSocketHibernate and restoreHibernativeRequests. The actor-driver.ts file wasn't fully updated to match the new protocol.

Incomplete Refactoring in tunnel.ts

Line 146 has missing parameter in #createWebSocket:

this.#createWebSocket(
    actorId,
    requestId,
    requestIdStr,
    true,
    messageIndex,
    // Missing 'request' parameter here!
);

But the method signature at line 145 requires a request: Request parameter.

Line 192 has a TODO comment that should be addressed:

// TODO: assign request to ws so we can pass this through

API Breaking Changes Without Migration Path

The websocket callback changed from async returning Promise<void> to synchronous returning void. This is a breaking change that could affect existing code. Consider:

  1. Documenting this breaking change
  2. Providing a migration guide
  3. Supporting both versions temporarily with deprecation warnings

The HibernationConfig interface changed:

  • Removed enabled field
  • Changed lastMsgIndex from number | undefined to number
    These changes need documentation.

Unused Exports in websocket-tunnel-adapter.ts

Lines 13-14 import symbols that don't appear to be defined:

import {
    SET_CAN_HIBERNATE_SYMBOL,
    SET_REQUEST_SYMBOL,
    WebSocketTunnelAdapter,
} from "./websocket-tunnel-adapter";

These symbols aren't exported from the file.

Performance Considerations

Positive Changes ✅

  1. Reduced buffering overhead - Removing the buffered events pattern (lines 104-318 in old websocket-tunnel-adapter.ts) eliminates memory overhead and simplifies the event flow
  2. Better message sequencing - Using wrapping arithmetic for u16 indices properly handles wraparound without expensive checks

Potential Issues ⚠️

  1. Message index persistence - Line 163 in websocket-tunnel-adapter.ts calls #persistHibernatableWebSocketMessageIndex on every message. Depending on the persistence implementation, this could be expensive. Consider batching or debouncing.
  2. Missing ack queue - The removal of #hibernatableWebSocketAckQueue in actor-driver.ts without a replacement could impact message acknowledgment performance.

Security Concerns

Minor Issues ⚠️

  1. Path validation - Line 828 in tunnel.ts validates path starts with "/", but doesn't validate for path traversal attacks or other malicious patterns
  2. Error information leakage - Line 614 in actor-driver.ts exposes the full error in close reason: wsContext.close(1011, \${err}`)`. Consider sanitizing error messages to avoid leaking sensitive information.

Test Coverage

Strengths ✅

The wrapping arithmetic tests are excellent with:

  • Edge case coverage (u16 boundaries)
  • Transitivity validation
  • Inverse relationship verification

Missing Coverage ⚠️

  1. No tests for WebSocket hibernation/restoration flow
  2. No tests for out-of-sequence message handling
  3. No tests for the new restoreHibernativeRequests API
  4. No integration tests for the new protocol

Recommendations

High Priority 🔴

  1. Fix the import error in websocket-tunnel-adapter.ts (line 7)
  2. Fix the loop syntax in tunnel.ts (line 116)
  3. Complete the refactoring in actor-driver.ts - remove or implement the hibernatable WebSocket ack queue
  4. Add the missing request parameter in tunnel.ts:146
  5. Document breaking changes in the PR description or CHANGELOG

Medium Priority 🟡

  1. Add integration tests for hibernation flow
  2. Consider adding a deprecation path for the async websocket callback
  3. Review and optimize message index persistence frequency
  4. Make messageIndex in WebSocketTunnelAdapter readonly with controlled mutation

Low Priority 🟢

  1. Clean up unused imports/exports
  2. Add more inline documentation for the hibernation protocol
  3. Consider adding metrics/telemetry for hibernation events

Summary

This is a substantial refactoring that improves the architecture of hibernating WebSocket handling. However, there are critical bugs that need to be fixed before merging:

  • Missing/incorrect imports
  • Syntax errors in loops
  • Incomplete refactoring in actor-driver.ts

Once these are addressed, the PR will significantly improve WebSocket reliability for hibernating actors.

@claude
Copy link

claude bot commented Nov 13, 2025

Pull Request Review: Hibernating WebSocket Protocol Implementation

Summary

This PR implements a new hibernating WebSocket protocol for RivetKit. The changes refactor the WebSocket handling to better support WebSocket hibernation with message index tracking, improved event handling, and proper sequence validation.

Critical Issues

1. Import Error in websocket-tunnel-adapter.ts

Location: engine/sdks/typescript/runner/src/websocket-tunnel-adapter.ts:6

import { wrappingLte } from "./utils";

Issue: The function is named wrappingLteU16 in utils.ts, not wrappingLte. This will cause a runtime error.

Fix: Change to:

import { wrappingLteU16 } from "./utils";

And update the usage on line ~752 from wrappingLte(messageIndex, previousIndex) to wrappingLteU16(messageIndex, previousIndex).

2. Missing Parameter in #createWebSocket Call

Location: engine/sdks/typescript/runner/src/tunnel.ts:120-126

The #createWebSocket method signature requires 6 parameters:

#createWebSocket(
  actorId: string,
  requestId: RequestId,
  requestIdStr: string,
  hibernatable: boolean,
  messageIndex: number,
  request: Request,  // <-- Missing in restoreHibernatingRequests call
): WebSocketTunnelAdapter

But the call in restoreHibernatingRequests only passes 5 parameters. You need to construct a Request object for hibernating WebSockets or refactor the signature to make request optional when restoring.

3. Type Signature Change Not Fully Implemented

Location: rivetkit-typescript/packages/rivetkit/src/drivers/engine/actor-driver.ts:561

The #runnerWebSocket method signature changed from async returning Promise<void> to synchronous returning void:

#runnerWebSocket(
  _runner: Runner,
  actorId: string,
  websocketRaw: any,
  requestIdBuf: ArrayBuffer,
  request: Request,
): Promise<void> {  // <-- Still Promise<void>, should be void

This doesn't match the updated RunnerConfig.websocket signature in mod.ts:60 which now returns void instead of Promise<void>. Either remove async and change return type to void, or reconcile the types.

Code Quality Issues

4. Commented-Out Code

Location: rivetkit-typescript/packages/rivetkit/src/drivers/engine/actor-driver.ts:81-85, 651-670

There are multiple blocks of commented-out code related to hibernatable WebSocket ack queue handling:

// #hibernatableWebSocketAckQueue: Map<
// 	string,
// 	{ requestIdBuf: ArrayBuffer; messageIndex: number }
// > = new Map();
// #wsAckFlushInterval?: NodeJS.Timeout;

And later:

if (currentEntry) {
  // TODO:
} else {
  // ...
}

Issue: This indicates incomplete migration. Either remove the commented code if it's no longer needed, or complete the implementation if these features are still required.

5. Missing Implementation Comments

Location: engine/sdks/typescript/runner/src/tunnel.ts:192

// TODO: assign request to ws so we can pass this through

This TODO suggests incomplete functionality. Please either implement this or create a tracking issue.

Performance Considerations

6. Wrapping Comparison Edge Case

Location: engine/sdks/typescript/runner/src/utils.ts:68-109

The wrapping comparison functions look correct for u16 values, and the comprehensive test coverage is excellent. However, verify that JavaScript number precision doesn't cause issues at the boundaries since JavaScript uses 64-bit floats.

7. Event Handler Registration Timing

Location: engine/sdks/typescript/runner/src/tunnel.ts:682-713

The refactored code now calls config.websocket() before sending the ToServerWebSocketOpen message and calling adapter._handleOpen(). This is good - it ensures event handlers are registered before the open event fires. However, if config.websocket() throws, the WebSocket is properly cleaned up. Good work on this pattern!

Security Concerns

8. WebSocket Message Sequence Validation

Location: engine/sdks/typescript/runner/src/tunnel.ts:741-796 (in websocket-tunnel-adapter.ts)

The duplicate and out-of-sequence message handling looks solid:

// Ignore duplicate old messages
if (wrappingLteU16(messageIndex, previousIndex)) {
  this.#log?.info({ /* ... */ });
  return true;
}

// Close message if skipped message in sequence
if (messageIndex !== previousIndex + 1) {
  // ... close with code 1008
}

This properly prevents message injection or replay attacks. Once the import issue (#1) is fixed, this should work correctly.

9. Path Validation for WebSocket Requests

Location: engine/sdks/typescript/runner/src/tunnel.ts:828

if (!path.startsWith("/")) {
  throw new Error("path must start with leading slash");
}

Good validation! Consider also validating against path traversal attempts (e.g., /../ sequences).

Best Practices

10. Consistent Logging

The structured logging throughout is excellent and follows the project's conventions (lowercase messages, structured fields). Example:

this.#log?.info({
  msg: "restored hibernatable requests",
  actorId,
  requestIds: requestIds.length,
});

11. Removed Event Buffering

The removal of the event buffering mechanism (#bufferedEvents and #flushBufferedEvents) is a significant change. The old code buffered events if no listeners were attached. Make sure this doesn't break scenarios where listeners are attached after events fire.

Recommendation: Add integration tests to verify WebSocket event ordering with the new synchronous handler pattern.

Test Coverage

12. Excellent Test Coverage for Wrapping Functions

The test file utils.test.ts has comprehensive coverage for the wrapping arithmetic functions, including:

  • Normal cases
  • Equality cases
  • Wraparound scenarios
  • Boundary conditions
  • Transitivity checks

This is exemplary test coverage.

13. Missing Integration Tests

While the unit tests are excellent, there don't appear to be integration tests for:

  • WebSocket hibernation/restoration flow
  • Message sequence validation during hibernation
  • Error handling when restoreHibernativeRequests returns wrong count

Consider adding these for the critical path.

OpenAPI Changes

14. Removed Gateway Endpoints

Location: rivetkit-openapi/openapi.json

The PR removes several /gateway/{actorId}/* endpoints including /health, /action/{action}, and /request/{path}. This is a significant API change.

Question: Is this intentional as part of the hibernating WebSocket protocol migration? If so, ensure:

  1. These endpoints are truly no longer needed
  2. Any existing clients are migrated
  3. This is documented in release notes

Summary

Blockers (must fix before merge):

  1. Fix import: wrappingLtewrappingLteU16
  2. Fix missing request parameter in restoreHibernatingRequests
  3. Reconcile #runnerWebSocket return type (void vs Promise<void>)

High Priority:
4. Complete or remove commented-out hibernatable WebSocket ack queue code
5. Add integration tests for hibernation flow
6. Clarify API endpoint removals

Nice to Have:
7. Complete TODO comments or file tracking issues
8. Add path traversal validation
9. Test event handler attachment timing

Overall, this is a well-structured refactor with excellent test coverage for the core wrapping arithmetic. The main issues are TypeScript errors that will prevent compilation and some incomplete migration artifacts. Once the blockers are addressed, this should significantly improve WebSocket hibernation reliability.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants