Skip to content

Conversation

@abcxff
Copy link
Contributor

@abcxff abcxff commented Nov 8, 2025

Fixes RVT-5269

@vercel
Copy link

vercel bot commented Nov 8, 2025

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Preview Comments Updated (UTC)
rivet-site Ready Ready Preview Comment Nov 11, 2025 11:44pm
rivetkit-serverless Ready Ready Preview Comment Nov 11, 2025 11:44pm
2 Skipped Deployments
Project Deployment Preview Comments Updated (UTC)
rivet-cloud Ignored Ignored Nov 11, 2025 11:44pm
rivet-inspector Ignored Ignored Preview Nov 11, 2025 11:44pm

Copy link
Contributor Author

abcxff commented Nov 8, 2025


How to use the Graphite Merge Queue

Add the label merge-queue to this PR to add it to the merge queue.

You must have a Graphite account in order to use the merge queue. Sign up using this link.

An organization admin has enabled the Graphite Merge Queue in this repository.

Please do not merge from GitHub as this will restart CI on PRs being processed by the merge queue.

This stack of pull requests is managed by Graphite. Learn more about stacking.

@abcxff abcxff changed the title feat: pegboard serverless pool feat(pegboard): move pegboard serverless to wf Nov 8, 2025
@abcxff abcxff requested a review from MasterPtato November 8, 2025 01:25
@abcxff abcxff marked this pull request as ready for review November 8, 2025 01:25
@linear
Copy link

linear bot commented Nov 8, 2025

@pkg-pr-new
Copy link

pkg-pr-new bot commented Nov 8, 2025

More templates

@rivetkit/actor

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/actor@3416

@rivetkit/cloudflare-workers

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/cloudflare-workers@3416

@rivetkit/core

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/core@3416

@rivetkit/db

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/db@3416

@rivetkit/framework-base

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/framework-base@3416

@rivetkit/next-js

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/next-js@3416

@rivetkit/react

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/react@3416

rivetkit

pnpm add https://pkg.pr.new/rivet-dev/rivet/rivetkit@3416

@rivetkit/sql-loader

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/sql-loader@3416

@rivetkit/engine-runner

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/engine-runner@3416

@rivetkit/engine-runner-protocol

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/engine-runner-protocol@3416

commit: be0a1d2

@claude
Copy link

claude bot commented Nov 8, 2025

PR Review: Move Pegboard Serverless to Workflow

Summary

This PR migrates pegboard serverless functionality from a separate package into the main pegboard package as workflows, and removes the unused internal package. The serverless runner lifecycle is now managed via the gasoline workflow engine.

Critical Issues

1. Corrupted Error Artifact Generation (BLOCKER)

The PR includes 88 malformed error artifact files with corrupted names containing autocomplete markers like raCompletionMarker and partial error codes.

Examples:

  • serverless_runner.invalid_configraCompletionMarker.json
  • seraCompletionMarkerrunner_config.invalid.json
  • serverless_runner.cnot_found.json

Root Cause: The error definitions in errors.rs are correct, but the artifact generation script has a bug creating incremental artifacts.

Action Required:

  • Delete all 86 malformed error artifact files
  • Fix the error artifact generation script
  • Regenerate only valid error artifacts

Code Quality Issues

2. Unnecessary field name in pool.rs:35
Line has redundant field initialization: config: config
Should be: config

3. Debug comment in runtime.rs:1
Leftover comment: runner wf see how signal fail handling

4. Activity Retry Strategy (runner.rs:115-116)
Using MAX values for timeout and retries is concerning. Infinite retries could cause zombie workflows. Is this intentional? Please add a comment explaining the rationale or set reasonable limits.

5. TODO about lifecycle state (runner.rs:229-234)
Known issue about lifecycle state management when connections close unexpectedly. Consider creating a GitHub issue to track this.

Positive Changes

Architecture Simplification

  • Consolidates serverless into main pegboard package
  • Reduces workspace complexity (removed 2 packages)
  • Improves code cohesion

Workflow-Based State Management

  • Provides durable state persistence
  • Better observability
  • Automatic retry and recovery

Security

  • Headers properly validated preventing injection attacks
  • Tokens handled securely

ActivityCtx Enhancements
Added useful methods: find_workflow, get_workflows, state accessor, update_workflow_tags

Test Coverage

No test files found for new serverless workflows. Consider adding:

  • Unit tests for serverless configuration parsing
  • Integration tests for pool scaling logic
  • Tests for graceful draining behavior

Recommendations

  1. CRITICAL: Fix error artifact generation bug
  2. HIGH: Document or fix infinite timeout/retry strategy
  3. MEDIUM: Add tests for serverless workflows
  4. LOW: Clean up debug comment and use shortened field init

Final Verdict

NEEDS CHANGES - Good architecture and code quality, but corrupted error artifacts must be fixed before merging. Once resolved, this is a solid refactoring.

@MasterPtato MasterPtato force-pushed the 11-05-feat_pegboard_serverless_pool branch from ddb0414 to be0a1d2 Compare November 11, 2025 23:41
Comment on lines +532 to 538
ctx.v(2)
.signal(crate::workflows::serverless::pool::Bump {})
.to_workflow::<crate::workflows::serverless::pool::Workflow>()
.tag("namespace_id", input.namespace_id)
.tag("runner_name", input.runner_name_selector.clone())
.send()
.await?;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs to handle WorkflowNotFound gracefully

Comment on lines +55 to 61
ctx.v(2)
.signal(crate::workflows::serverless::pool::Bump {})
.to_workflow::<crate::workflows::serverless::pool::Workflow>()
.tag("namespace_id", input.namespace_id)
.tag("runner_name", res.runner_name_selector.clone())
.send()
.await?;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs to handle WorkflowNotFound gracefully

Comment on lines +619 to +627
if allocate_res.serverless {
ctx.v(2)
.signal(crate::workflows::serverless::pool::Bump {})
.to_workflow::<crate::workflows::serverless::pool::Workflow>()
.tag("namespace_id", input.namespace_id)
.tag("runner_name", input.runner_name_selector.clone())
.send()
.await?;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs to handle WorkflowNotFound gracefully

Comment on lines +576 to +584
if allocate_res.serverless {
ctx.v(2)
.signal(crate::workflows::serverless::pool::Bump {})
.to_workflow::<crate::workflows::serverless::pool::Workflow>()
.tag("namespace_id", input.namespace_id)
.tag("runner_name", input.runner_name_selector.clone())
.send()
.await?;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs to handle WorkflowNotFound gracefully

Comment on lines +20 to +25
// Service::new(
// "pegboard_serverless",
// // There should only be one of these, since it's auto-scaling requests
// ServiceKind::Singleton,
// |config, pools| Box::pin(pegboard_serverless::start(config, pools)),
// ),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove comment

Comment on lines +48 to 53
ctx.signal(pegboard::workflows::serverless::pool::BumpConfig {})
.to_workflow::<pegboard::workflows::serverless::pool::Workflow>()
.tag("runner_name", body.runner_name)
.tag("namespace_id", body.namespace_id)
.send()
.await?;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs to handle WorkflowNotFound gracefully

Comment on lines +100 to +101
#[timeout = u64::MAX]
#[max_retries = usize::MAX]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was wrong about using this for outbound req retries, the issue is the backoff will never reset and will get longer and longer after each request.

Ideally we only have an increasing backoff for actual errors and not the request stopping normally or failing after drain.

This will require a loop with state tracking last retry and retry count and a resetting mechanism (see pb actor reschedule_actor fn for reference)

Tracking with https://linear.app/rivet-gg/issue/RVT-5274/add-backoff-to-pb-serverless

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants