@lmcdonough

🚀 Add Automated PR Preview Deployments

Implements isolated staging environments for every pull request.

What This Does

  • Auto-deploys backend on PR open/update to RPi5 via Tailscale
  • Isolated per PR: Separate containers, networks, volumes, ports
  • Bot comments on PR with clickable URLs and testing instructions
  • Auto-cleanup when PR closes/merges

Architecture

  • Backend port: 4000 + PR_NUMBER
  • Postgres port: 5432 + PR_NUMBER
  • Docker namespace: pr-{number} (complete isolation)
  • Images tagged: ghcr.io/owner/repo:pr-{number}
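
As a rough illustration of the scheme above, the per-PR values can be derived with plain shell arithmetic. This is a minimal sketch, not the workflow's actual code: the function name, the variable names, and the `owner/repo` image path are placeholders, and PR number 201 is only an example value.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: derive the per-PR settings from a PR number.
# The base ports come from the Architecture list; all names are illustrative.
pr_preview_settings() {
  local pr_number="$1"
  echo "BACKEND_PORT=$((4000 + pr_number))"
  echo "POSTGRES_PORT=$((5432 + pr_number))"
  echo "PROJECT_NAME=pr-${pr_number}"
  echo "IMAGE_TAG=ghcr.io/owner/repo:pr-${pr_number}"
}

pr_preview_settings 201
```

Because every value is a pure function of the PR number, two open PRs cannot collide on ports or Docker project names as long as their numbers differ.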

Files Added

  • .github/workflows/deploy-pr-preview.yml - Deployment automation
  • .github/workflows/cleanup-pr-preview.yml - Cleanup automation
  • docker-compose.pr-preview.yaml - Multi-tenant template
  • docs/PR-PREVIEW.md - Usage documentation

Access

  1. Connect to Tailscale
  2. Check PR comment for URLs
  3. Test your changes in isolation

…ployment

- Update workflow dispatch comments for clarity
- Remove unused frontend branch input from workflow
- Enhance concurrency comments for better understanding
- Adjust permissions for GitHub Container Registry
- Improve deployment context calculations and logging
- Update Docker Compose configuration for better readability and organization
- Ensure health checks and environment variables are clearly defined
- Streamline deployment steps and comments for clarity
@lmcdonough self-assigned this on Oct 23, 2025
@lmcdonough added the "feature work" label (Specifically implementing a new feature) on Oct 23, 2025
@lmcdonough added the "infrastructure" label (DevOps related) on Oct 23, 2025
- Implement multi-layer cache fallback chain (PR → branch → main)
- Scope cache writes to PR-specific namespace for better isolation
- Expected improvements:
  - First PR build: 6-9 min (down from 10-15 min)
  - Subsequent PR builds: 2-5 min (down from 10-15 min)
- Uses scoped GitHub Actions cache to reuse compiled dependencies
- Maintains conditional force rebuild capability
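
The fallback chain can be pictured as an ordered list of `--cache-from` sources plus a single PR-scoped `--cache-to`. The sketch below only prints the flags instead of running a build. The function name and the scope naming (`pr-<number>`, the branch name, `main`) are assumptions made for illustration; the `type=gha` cache backend and its `scope` parameter are the buildx features the commit refers to.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: print buildx cache flags for a PR build.
# Reads fall back PR -> branch -> main; writes go only to the PR scope.
cache_flags() {
  local pr_number="$1" branch="$2"
  local scope
  for scope in "pr-${pr_number}" "${branch}" "main"; do
    echo "--cache-from=type=gha,scope=${scope}"
  done
  echo "--cache-to=type=gha,scope=pr-${pr_number},mode=max"
}

cache_flags 201 feature-x
```

Keeping the write scope PR-specific means a PR build can reuse layers cached for its branch or for main without overwriting them.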

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
- Builds ARM64 image from main branch daily at 2 AM UTC
- Runs after main branch changes to dependencies or code
- Reuses previous cache to minimize rebuild time (2-4 min typical)
- Writes to scope=main for PR preview workflows to utilize
- Expected impact: Reduces first PR build from 20-25 min to 6-9 min

Benefits:
- 60-70% faster first-time PR preview builds
- Automatic cache refresh keeps dependencies up-to-date
- Minimal GitHub Actions cost (~100-150 min/month)
- Net savings: 150-200 min/month for active development

…ng strategies, and deployment steps for ARM64 images on RPi5
lmcdonough and others added 5 commits October 30, 2025 20:37
…s and optimizing variable usage for clarity and maintainability.
…ing cache keys, and enhancing deployment steps for RPi5.
…on steps

The dtolnay/rust-toolchain action requires an explicit 'toolchain' input parameter.
Added 'toolchain: stable' to all Rust toolchain installation steps in both
deploy-pr-preview.yml and build-test-push.yml workflows to resolve the
"'toolchain' is a required input" error that was causing lint jobs to fail.

Fixes three critical issues in the deploy-to-rpi5 job:

1. **Heredoc variable expansion**: Stopped relying on expansion inside a
   quoted heredoc ('ENDSSH') and now pass variables via the SSH
   environment. With a quoted delimiter the runner's shell leaves ${VAR}
   references unexpanded, so they were evaluated on the remote server,
   where none of them were defined, and every variable came out empty.

2. **PROJECT_NAME availability**: Now explicitly passed as an environment
   variable to the SSH session. Previously undefined in the remote context,
   causing docker compose commands to fail.

3. **Error handling**: Changed from 'set -e' to 'set -eo pipefail' to
   properly catch errors in piped commands (like docker login). The previous
   setting would not catch failures in the left side of pipes.
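
The difference in item 3 can be reproduced locally. This is a small sketch, not part of the workflow: `false | cat` stands in for any pipeline whose left-hand command fails, and each variant runs in its own `bash -c` so a failure does not stop the demonstration.

```bash
#!/usr/bin/env bash

# With 'set -e' alone, the pipeline's status is that of the last command (cat),
# so the failure of 'false' goes unnoticed and the script keeps running.
without=$(bash -c 'set -e; false | cat; echo "continued"' || echo "aborted")

# With 'pipefail', the pipeline reports the failing command's status and
# 'set -e' stops the script before the echo.
with=$(bash -c 'set -eo pipefail; false | cat; echo "continued"' || echo "aborted")

echo "set -e alone:     ${without}"
echo "set -eo pipefail: ${with}"
```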

Technical changes:
- Pass all variables via SSH command prefix instead of heredoc exports
- Use ${VAR} syntax throughout heredoc for consistency
- Add GITHUB_TOKEN, GITHUB_ACTOR, RPI5_USERNAME, and SERVICE_STARTUP_WAIT_SECONDS
  to SSH environment variables
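
The heredoc behaviour behind this fix can be shown without SSH. In the sketch below a local `bash -s` stands in for the remote shell, and `PROJECT_NAME` with the value `pr-201` is an example; the real workflow passes more variables.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Runner-side value; deliberately NOT exported, so child shells cannot see it.
PROJECT_NAME="pr-201"

# Quoted delimiter: ${PROJECT_NAME} is sent as literal text, and the receiving
# shell has no such variable, so it expands to an empty string there.
missing=$(bash -s <<'ENDSSH'
echo "project=${PROJECT_NAME:-}"
ENDSSH
)

# Same quoted heredoc, but the value is supplied as a command prefix.
passed=$(PROJECT_NAME="${PROJECT_NAME}" bash -s <<'ENDSSH'
echo "project=${PROJECT_NAME:-}"
ENDSSH
)

echo "without prefix: ${missing}"
echo "with prefix:    ${passed}"
```

With real SSH the assignment would sit inside the remote command string (for example `ssh user@host "PROJECT_NAME=pr-201 bash -s"`), since a prefix placed before `ssh` itself only affects the local process.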

Replace manual Default implementation with derive attribute as suggested
by clippy::derivable_impls lint. The manual implementation was simply
returning Self::InProgress, which can be expressed more idiomatically
using #[derive(Default)] with #[default] on the InProgress variant.

Changes:
- Add Default to derive macro for Status enum
- Add #[default] attribute to InProgress variant
- Remove manual impl std::default::Default for Status

This resolves the clippy error that was failing the Lint & Format job
in CI with -D warnings enabled.

This empty commit tests the fixed PR preview workflow to verify:
- deploy-to-rpi5 job now runs (no longer skipped)
- Full stack deploys to neo (postgres + backend + frontend)
- PR comment posts with access URLs

Change all secrets from required: true to required: false in the reusable
workflow. This is necessary because:

1. Secrets are resolved from the pr-preview environment at job execution time
2. When calling from another repo (frontend → backend), GitHub requires
   secrets marked as required: true to be passed from the caller
3. The frontend repo doesn't have these secrets - they're centralized in
   the backend repo's pr-preview environment
4. Setting required: false allows cross-repo calls to succeed while secrets
   are still available from the environment when jobs execute

This maintains the centralized secrets approach while enabling both same-repo
(backend PR) and cross-repo (frontend PR) workflow calls to succeed.

Fixes: Frontend workflow error "Secret X is required, but not provided"

Root cause: The build-arm64-image job uses 'if: always() && ...' which
causes it to have a non-standard result status. When deploy-to-rpi5
depends on it with just 'needs: build-arm64-image', the default behavior
is to only run if the needed job has a simple 'success' status. Jobs using
always() don't match this, so the deploy job was being skipped.

Solution: Add the same always() pattern to deploy-to-rpi5:
  if: |
    always() &&
    !cancelled() &&
    needs.build-arm64-image.result == 'success'

This explicitly checks the build result and runs whenever the build
succeeds, regardless of the build job's conditional execution pattern.

This is a minimal change that follows the existing pattern used for
build-arm64-image and ensures deploy always runs when build succeeds.

Root cause: The deploy job was checking out ref: main to get the
docker-compose.pr-preview.yaml file, but this file doesn't exist on main
yet - it only exists in the PR branch.

Solution: Change the checkout ref from hardcoded 'main' to use the
backend_branch output from build-arm64-image job:
  ref: ${{ needs.build-arm64-image.outputs.backend_branch }}

This ensures:
- Backend PRs: Uses the PR branch (where compose file exists)
- Frontend PRs: Uses main branch (where compose file will exist after merge)

Minimal change that follows existing pattern of using job outputs.

Fixes: "scp: stat local backend-compose/docker-compose.pr-preview.yaml:
No such file or directory"

Root cause: The schema preparation step was missing two environment
variables that docker-compose.pr-preview.yaml requires:
- PR_FRONTEND_CONTAINER_PORT
- FRONTEND_IMAGE

This caused docker compose to fail with "invalid proto:" error because
the variables defaulted to blank strings.

Solution: Add the missing variables to the schema preparation environment
file to match the deploy step's environment file. Both steps now have
identical variable sets.

Changes:
- Line 817: Added FRONTEND_IMAGE from build outputs
- Line 823: Added PR_FRONTEND_CONTAINER_PORT from ports outputs

Minimal change following existing pattern. Frontend workflow already has
these variables in its deploy step, so no changes needed there.
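
One way to surface this class of failure earlier is to check the required variables before docker compose is invoked. The guard below is a hypothetical sketch, not something the workflow is known to contain: `require_vars` is an invented name, and the image path and port value are example data.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical guard: report every required variable that is unset or empty.
require_vars() {
  local name missing=0
  for name in "$@"; do
    # ${!name:-} is bash indirect expansion with an empty default.
    if [ -z "${!name:-}" ]; then
      echo "missing: ${name}" >&2
      missing=1
    fi
  done
  return "${missing}"
}

# Example values only; the real ones come from the build and ports job outputs.
FRONTEND_IMAGE="ghcr.io/owner/repo:pr-201"
PR_FRONTEND_CONTAINER_PORT="3000"

if require_vars FRONTEND_IMAGE PR_FRONTEND_CONTAINER_PORT; then
  echo "environment complete"
fi
```

Running the same check in both the schema preparation and deploy steps would make a mismatch between their variable sets fail with a named variable rather than a compose parsing error.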

Fixes: "The PR_FRONTEND_CONTAINER_PORT variable is not set" and
"invalid proto:" errors

Root cause: The backend service expects uppercase log level values
(OFF, ERROR, WARN, INFO, DEBUG, TRACE) but the workflow was passing
lowercase 'info', causing the backend to fail at startup with:
  error: invalid value 'info' for '--log-level-filter <LOG_LEVEL_FILTER>'

Solution: Changed BACKEND_LOG_FILTER_LEVEL from 'info' to 'INFO' in
both environment file creation locations (schema prep and deploy steps).

Changes:
- Line 832: info -> INFO (schema preparation)
- Line 951: info -> INFO (deployment)

Minimal change - only case correction, no logic changes.

Fixes: Backend startup failure due to invalid log level argument
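
A case mismatch like this can also be caught before the value reaches the environment file. The following is a minimal sketch under that assumption: `normalize_log_level` is a hypothetical helper, and the accepted set is the one listed in the commit message above.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: upper-case a log level and reject anything outside
# the set named in the commit message (OFF, ERROR, WARN, INFO, DEBUG, TRACE).
normalize_log_level() {
  local level
  level=$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]')
  case "${level}" in
    OFF|ERROR|WARN|INFO|DEBUG|TRACE) printf '%s\n' "${level}" ;;
    *) echo "invalid log level: $1" >&2; return 1 ;;
  esac
}

BACKEND_LOG_FILTER_LEVEL=$(normalize_log_level "info")
echo "BACKEND_LOG_FILTER_LEVEL=${BACKEND_LOG_FILTER_LEVEL}"
```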

@github-actions

github-actions bot commented Nov 7, 2025

🚀 PR Preview Environment Deployed!

🔗 Access URLs

Service       URL
Frontend      http://neo.rove-barbel.ts.net:3201
Backend API   http://neo.rove-barbel.ts.net:4201
Health Check  http://neo.rove-barbel.ts.net:4201/health

📊 Environment Details

🔐 Access Requirements

  1. Connect to Tailscale (required)
  2. Access frontend: http://neo.rove-barbel.ts.net:3201
  3. Access backend: http://neo.rove-barbel.ts.net:4201

🧪 Testing

# Health check
curl http://neo.rove-barbel.ts.net:4201/health

# API test  
curl http://neo.rove-barbel.ts.net:4201/api/v1/users

🧹 Cleanup

Environment auto-cleaned when PR closes/merges


Deployed: 2025-11-07T18:20:56.092Z
Architecture: Native ARM64 build on Neo + Multi-tier caching

@jhodapp added this to the 1.0.0-beta2 milestone on Nov 10, 2025
Labels

feature work: Specifically implementing a new feature
infrastructure: DevOps related

Projects

Status: 🏗 In progress

Development

Successfully merging this pull request may close these issues.

Add a staging environment for previewing and testing ahead of a new deployment

3 participants