Commit 5081dce
🤖 perf: optimize Ollama CI caching to <5s startup
Key improvements:
- Combined binary, library, and model caching into a single cache entry. Previously: separate caches for the binary and the models. Now: /usr/local/bin/ollama + /usr/local/lib/ollama + /usr/share/ollama.
- Fixed the model cache path from ~/.ollama/models to /usr/share/ollama. Models are stored in the system ollama user's home, not the runner's home.
- Separated installation from server startup. The install step runs only on a cache miss and includes the model pull. The startup step always runs but completes in <5s with cached models.
- Optimized readiness checks. Install: 10s timeout, 0.5s polling (cache miss only). Startup: 5s timeout, 0.2s polling (every run, with cache hit).
- Added a cache key based on the workflow file hash. The cache is invalidated whenever the workflow changes, ensuring a fresh install when needed.

Expected timing:
- First run (cache miss): ~60s (download + install + model pull)
- Subsequent runs (cache hit): <5s (just server startup)
- Cache size: ~13GB (gpt-oss:20b model)

Testing: verified locally that Ollama starts in <1s with cached models.
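Both readiness checks follow one pattern: poll inside `timeout` on a short interval, and fail the step if the budget runs out. A minimal sketch of that pattern, using a temp file as a hypothetical stand-in for the `/api/tags` endpoint so it runs without Ollama or a network:

```shell
#!/bin/sh
# Readiness-poll sketch: same timeout/interval structure as the workflow's
# `timeout 10 sh -c 'until curl -sf ...; do sleep 0.5; done'`, but the
# readiness signal is a file (hypothetical stand-in for the HTTP endpoint).
READY_FILE="$(mktemp -u)"            # name only; file does not exist yet
( sleep 1; touch "$READY_FILE" ) &   # "server" becomes ready after ~1s

# 10s budget, 0.5s polling interval (the cache-miss install path)
if timeout 10 sh -c "until [ -f '$READY_FILE' ]; do sleep 0.5; done"; then
  echo "ready"
else
  echo "failed to start"
  exit 1
fi
rm -f "$READY_FILE"
```

On a cache hit the workflow shrinks both knobs (5s budget, 0.2s interval), since only server startup, not a 13GB model pull, stands between start and ready.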
1 parent 4cd2491 commit 5081dce
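The new cache key is `${{ runner.os }}-ollama-complete-v2-${{ hashFiles('.github/workflows/ci.yml') }}`, so any edit to the workflow file changes the key. A rough local sketch of that idea, under two stated assumptions: GitHub's `hashFiles()` actually hashes each matched file and then hashes the concatenated digests, so a plain `sha256sum` of one file is only an approximation, and the file here is a throwaway example, not the real workflow:

```shell
#!/bin/sh
# Sketch: derive a cache key from OS + content hash of the workflow file.
# Changing one byte of the file yields a different key, which is how the
# workflow invalidates the Ollama cache on workflow edits.
WORKFLOW="$(mktemp)"            # throwaway stand-in for .github/workflows/ci.yml
printf 'jobs: {}\n' > "$WORKFLOW"

HASH=$(sha256sum "$WORKFLOW" | cut -d' ' -f1)
echo "Linux-ollama-complete-v2-$HASH"
rm -f "$WORKFLOW"
```

The `restore-keys` prefix `...-ollama-complete-v2-` still allows a partial restore from an older key, so a workflow edit reuses the previous cache as a starting point instead of downloading everything from scratch.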

File tree

1 file changed: +34 −23 lines


.github/workflows/ci.yml

Lines changed: 34 additions & 23 deletions
```diff
@@ -99,39 +99,50 @@ jobs:
 
       - uses: ./.github/actions/setup-cmux
 
-      - name: Cache Ollama binary
-        id: cache-ollama-binary
+      - name: Cache Ollama installation
+        id: cache-ollama
         uses: actions/cache@v4
         with:
-          path: /usr/local/bin/ollama
-          key: ${{ runner.os }}-ollama-binary-v1
+          path: |
+            /usr/local/bin/ollama
+            /usr/local/lib/ollama
+            /usr/share/ollama
+          key: ${{ runner.os }}-ollama-complete-v2-${{ hashFiles('.github/workflows/ci.yml') }}
           restore-keys: |
-            ${{ runner.os }}-ollama-binary-
-
-      - name: Cache Ollama models
-        id: cache-ollama-models
-        uses: actions/cache@v4
-        with:
-          path: ~/.ollama/models
-          key: ${{ runner.os }}-ollama-models-v1
-          restore-keys: |
-            ${{ runner.os }}-ollama-models-
+            ${{ runner.os }}-ollama-complete-v2-
 
       - name: Install Ollama
-        if: steps.cache-ollama-binary.outputs.cache-hit != 'true'
+        if: steps.cache-ollama.outputs.cache-hit != 'true'
         run: |
+          echo "Cache miss - installing Ollama and pulling model..."
           curl -fsSL https://ollama.com/install.sh | sh
-
-      - name: Start Ollama and pull models
-        run: |
+
           # Start Ollama service in background
           ollama serve &
-          # Wait for Ollama to be ready
-          timeout 30 sh -c 'until curl -s http://localhost:11434/api/tags > /dev/null 2>&1; do sleep 1; done'
-          echo "Ollama is ready"
-          # Pull the gpt-oss:20b model for tests (cached after first run)
+          OLLAMA_PID=$!
+
+          # Wait for Ollama to be ready (fast check with shorter timeout)
+          timeout 10 sh -c 'until curl -sf http://localhost:11434/api/tags > /dev/null 2>&1; do sleep 0.5; done' || {
+            echo "Ollama failed to start"
+            exit 1
+          }
+
+          echo "Ollama started, pulling gpt-oss:20b model..."
           ollama pull gpt-oss:20b
-          echo "Model pulled successfully"
+
+          # Stop Ollama to complete installation
+          kill $OLLAMA_PID 2>/dev/null || true
+          wait $OLLAMA_PID 2>/dev/null || true
+
+          echo "Ollama installation and model pull complete"
+
+      - name: Start Ollama server
+        run: |
+          echo "Starting Ollama server (models cached: ${{ steps.cache-ollama.outputs.cache-hit }})"
+          ollama serve &
+          # Fast readiness check - model is already cached
+          timeout 5 sh -c 'until curl -sf http://localhost:11434/api/tags > /dev/null 2>&1; do sleep 0.2; done'
+          echo "Ollama ready in under 5s"
 
       - name: Build worker files
         run: make build-main
```
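The install step's cleanup (`kill`/`wait` with `|| true`) earns its two lines: `wait` reaps the killed `ollama serve` so it cannot linger, and `|| true` keeps the step green if the server already exited. A self-contained sketch of the same lifecycle, with `sleep 30` as a stand-in for `ollama serve`:

```shell
#!/bin/sh
# Background-process lifecycle from the install step, with `sleep 30`
# standing in for `ollama serve`.
sleep 30 &
OLLAMA_PID=$!

echo "working while the server runs"

# Stop the background process; `|| true` keeps the step green if it
# already exited, and `wait` reaps it so nothing outlives the step.
kill $OLLAMA_PID 2>/dev/null || true
wait $OLLAMA_PID 2>/dev/null || true
echo "cleanup complete"
```

Note that `wait` on a killed process returns 143 (128 + SIGTERM), which would otherwise fail a `run:` step executing under `set -e`-like semantics; hence the trailing `|| true`.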

0 commit comments
