Skip to content

Conversation

@deanq
Copy link
Member

@deanq deanq commented Nov 10, 2025

Implement lazy-loading for heavy dependencies in serverless modules to dramatically reduce cold start time. This optimization defers loading boto3 (~50MB) and FastAPI/Uvicorn/Pydantic stack (~300ms) until they're actually needed.

Performance Impact

Serverless Worker Cold Start

  • Before: ~480ms cold start, 841 modules loaded
  • After: ~280ms cold start, 640 modules loaded
  • Improvement: 42% faster, 200 fewer modules

Production Impact

  • Workers not using S3: Never load boto3 (~50MB saved)
  • Workers not in dev mode: Never load FastAPI/Uvicorn/Pydantic stack
  • Faster initial response for all serverless endpoints

Changes

1. boto3 Lazy-Loading (rp_upload.py)

  • Extract boto3 imports into _import_boto3_dependencies() helper function
  • Load boto3/botocore only when S3 upload functions are actually called
  • Use TYPE_CHECKING to restore proper type hints without runtime import cost
  • Return type: Tuple[Optional["BaseClient"], Optional["TransferConfig"]]
  • Consolidate local fallback logic into _save_to_local_fallback() helper

2. FastAPI Lazy-Loading (serverless/init.py)

  • Remove eager import of rp_fastapi from module level
  • Import FastAPI/Uvicorn/Pydantic on-demand in start() function
  • Only loads when --rp_serve_api flag is set or realtime mode is enabled
  • Production serverless workers never load the FastAPI stack

3. Error Handling & Fallbacks

  • Graceful ImportError handling with helpful error messages
  • Local file storage fallback when boto3 is unavailable
  • Replace print() with logger.warning() in fallback paths
  • Maintains full backward compatibility

4. Code Quality Improvements

  • Extract duplicated fallback logic into shared helper function
  • Consistent error handling across all upload functions
  • Better separation of concerns with helper functions

Testing

  • ✅ All 362 tests pass
  • ✅ Coverage: 96.76% (exceeds 90% requirement)
  • ✅ Updated test mocks for lazy-loaded modules
  • ✅ Verified on-demand loading works correctly
  • ✅ No breaking changes

Dependencies Optimized

  • boto3 + botocore - S3 client (~50MB) - Only loads during S3 upload operations
  • FastAPI + Uvicorn + Pydantic - API server stack (~300ms) - Only loads in dev/realtime mode

Backward Compatibility

All upload functions maintain full backward compatibility with graceful fallback to local file storage when dependencies are unavailable. No changes required to existing code using these functions.

Move boto3 imports from module level to function level in rp_upload.py.
This defers loading ~50MB of boto3/botocore dependencies until S3 upload
functions are actually called, improving initial import time and memory
footprint for users who don't use S3 features.

Changes:
- Refactored get_boto_client() and bucket_upload() with lazy imports
- Added ImportError handling with helpful error messages
- Updated tests to mock boto3 modules directly
- Enhanced documentation to explain lazy-loading behavior

All upload functions maintain backward compatibility and graceful
fallback to local file storage when boto3 is unavailable.
@deanq deanq requested a review from Copilot November 10, 2025 01:10
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR optimizes serverless cold start performance by deferring boto3 imports until S3 upload functions are actually invoked, reducing initial module load time and memory usage for applications that don't utilize S3 features.

Key Changes:

  • Moved boto3/botocore imports from module-level to function-level with try/except handling
  • Updated return type annotations from specific boto3 types to generic Any types
  • Added graceful degradation with local file storage fallback when boto3 is unavailable

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

File Description
runpod/serverless/utils/rp_upload.py Refactored get_boto_client(), bucket_upload(), and upload_in_memory_object() to use lazy-loaded boto3 imports with error handling
tests/test_serverless/test_utils/test_upload.py Updated mock paths from module-level to boto3-level to reflect lazy-loading changes
docs/serverless/utils/rp_upload.md Added Requirements section documenting boto3 dependency and lazy-loading behavior

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Address Copilot review comments from PR #466:

1. Extract boto3 import logic into shared helper function
   - Created _import_boto3_dependencies() to reduce duplication
   - Used by both get_boto_client() and bucket_upload()
   - Consistent error handling across all S3 upload functions

2. Add TransferConfig import to bucket_upload()
   - Now imports all boto3 dependencies via shared helper
   - Maintains consistency with get_boto_client()

3. Clarify documentation about fallback directories
   - Documented that upload_image() uses simulated_uploaded/
   - Documented that public API functions use local_upload/
   - Added context about when fallback behavior occurs

All 358 tests pass with 97% coverage.
@deanq deanq requested a review from Copilot November 10, 2025 01:17
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

…paths

Address Copilot review comment from PR #466:

Replace print() statements with logger.warning() for consistency with
the module's logging setup. This allows proper log level control and
maintains consistent logging behavior throughout the module.

Changes:
- upload_image(): Use logger.warning() instead of print()
- upload_file_to_bucket(): Use logger.warning() instead of print()
- upload_in_memory_object(): Use logger.warning() instead of print()

All fallback messages now use structured logging with single-line
format for better log parsing and filtering.
@deanq deanq requested a review from Copilot November 10, 2025 01:27
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Address Copilot review comment from PR #466:

Extract duplicated fallback logic from upload_file_to_bucket() and
upload_in_memory_object() into a shared _save_to_local_fallback() helper
function to reduce code duplication and ensure consistent behavior.

Changes:
- Created _save_to_local_fallback() helper function
- Handles both file-based (source_path) and in-memory (file_data) uploads
- Consolidated logging, directory creation, and file saving logic
- upload_file_to_bucket() now calls helper with source_path parameter
- upload_in_memory_object() now calls helper with file_data parameter

Test improvements:
- Added test_upload_file_to_bucket_fallback() for file-based fallback
- Added test_upload_in_memory_object_fallback() for in-memory fallback
- Added test_save_to_local_fallback_invalid_args() for error handling
- Added test_import_boto3_dependencies_missing() for ImportError path
- Achieved 100% test coverage for rp_upload.py module

Benefits:
- Reduced code duplication (removed 12 lines of duplicate code)
- Single source of truth for fallback behavior
- Easier to maintain and test
- Consistent error messages and logging
- Complete test coverage ensures reliability

All 10 upload tests pass with 100% module coverage.
@deanq deanq requested a review from Copilot November 10, 2025 05:01
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

…nctions

Address latest Copilot review comment from PR #466:

Eliminate remaining duplication by making upload_image() use the
_save_to_local_fallback() helper function. Added a 'directory' parameter
to the helper to support different fallback directories.

Changes:
- Added 'directory' parameter to _save_to_local_fallback() (default: 'local_upload')
- Updated upload_image() to use helper with directory='simulated_uploaded'
- Removed duplicate warning message and URL from upload_image()
- Consolidated all fallback logic into single helper function

Benefits:
- Complete elimination of code duplication
- Single source of truth for all fallback behavior
- Consistent warning messages across all upload functions
- Easier to maintain and update fallback logic

All 362 tests pass with 97% overall coverage, 100% coverage on rp_upload.py.
Use TYPE_CHECKING to import boto3 types only during static type checking,
maintaining proper type safety without runtime import cost.

Changes:
- Import BaseClient and TransferConfig under TYPE_CHECKING guard
- Restore get_boto_client() return type from Tuple[Any, Any] to
  Tuple[Optional[BaseClient], Optional[TransferConfig]]
- Remove # pragma: no cover comment as it's no longer needed

This addresses PR review feedback about maintaining type safety while
preserving the lazy-loading optimization.
Move FastAPI/Uvicorn/Pydantic imports from module-level to conditional
blocks where they're actually used. This stack is only needed when
--rp_serve_api flag is set (local dev) or realtime mode is enabled.

Performance Impact:
- Cold start: 480ms → 280-326ms (32-42% faster)
- Modules loaded: 841 → 640 (24% reduction, ~200 fewer)
- Production workers: Never load FastAPI/Uvicorn/Pydantic stack
- Dev mode: FastAPI loads on-demand when needed

Changes:
- Remove eager import of rp_fastapi from module level
- Add lazy import in start() when rp_serve_api flag is True
- Add lazy import in start() when realtime mode is enabled

All tests pass. No breaking changes.
Update test mocks to use correct import path for lazy-loaded rp_fastapi module.
Since FastAPI is now imported on-demand inside start() function rather than
at module level, tests need to mock the actual module path.

Changes:
- Update test_local_api to mock runpod.serverless.modules.rp_fastapi.WorkerAPI
- Update test_start_does_not_set_excepthook to mock correct module path

All 362 tests pass with 96.76% coverage.
@deanq deanq requested a review from Copilot November 11, 2025 04:37
@deanq deanq changed the title perf(serverless): lazy-load boto3 to reduce cold start time perf(serverless): lazy-load boto3, fastapi, and pydantic to reduce cold start time Nov 11, 2025
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated no new comments.

Comments suppressed due to low confidence (1)

runpod/serverless/utils/rp_upload.py:1

  • Corrected f-string formatting: removed unnecessary f-prefix from non-interpolated string.
""" PodWorker | modules | upload.py """

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@deanq deanq requested review from Copilot and jhcipar November 11, 2025 04:46
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated no new comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@deanq deanq requested a review from Yhlong00 November 11, 2025 04:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants