
Commit 6d91eab

benthecarman and claude committed
Fix bitcoind shutdown hang during initial sync errors and polling
When using the bitcoind chain source, `Node::stop()` could hang indefinitely if called during:

1. Initial sync error backoff (up to 5 minutes of sleep)
2. Active polling operations in the continuous sync loop

This is a more severe variant of the background sync hang issue, as the bitcoind initial sync loop had NO stop signal checking at all.

## Root Cause

### Initial Sync Loop

The `BitcoindChainSource::continuously_sync_wallets()` initial sync loop (lines 149-277) performs synchronization on startup. When sync fails (e.g., "Connection refused" when bitcoind is unavailable), it enters an exponential backoff retry loop:

- Transient errors: Sleep for `backoff` seconds (10s, 20s, 40s, 80s, 160s, up to 300s)
- Persistent errors: Sleep for `MAX_BACKOFF_SECS` (300 seconds)

These sleeps had NO stop signal checking. When shutdown was requested:

1. Initial sync fails with connection error
2. Code sleeps for backoff period (up to 5 minutes)
3. User calls `stop()`
4. Stop signal sent but ignored: the loop is stuck sleeping
5. Shutdown times out after 5 seconds and aborts
6. Initial sync loop never exits cleanly

### Continuous Polling Loop

The continuous polling loop (lines 296-349) had the same issue as electrum/esplora: no biased select and no cancellation of in-progress operations.

## Solution

### Initial Sync Loop Fix

1. Added biased `tokio::select!` at loop start to check stop signal before beginning each sync attempt
2. Wrapped both error backoff sleeps in biased `tokio::select!` to allow immediate interruption when stop is requested

This ensures that even if bitcoind is down and the node is retrying with exponential backoff, shutdown completes immediately instead of waiting up to 5 minutes.

### Continuous Polling Loop Fix

Applied the same biased nested `tokio::select!` pattern used for electrum/esplora:

- Outer biased select prioritizes stop signal before polling intervals
- Inner nested selects allow cancellation of in-progress operations

## Testing

Existing unit tests pass. The shutdown will now complete in milliseconds instead of hanging for minutes when bitcoind is unreachable.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
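
The backoff sequence quoted in the commit message (10s, 20s, 40s, 80s, 160s, then capped at 300s) follows from the `std::cmp::min(backoff * 2, MAX_BACKOFF_SECS)` update visible in the diff. The sketch below reproduces that arithmetic with the standard library only. `backoff_schedule` is a hypothetical helper written for illustration and is not part of the patched file; the starting value of 10 is an assumption taken from the sequence listed in the message.

```rust
/// Cap taken from the diff: `const MAX_BACKOFF_SECS: u64 = 300;`
const MAX_BACKOFF_SECS: u64 = 300;

/// Hypothetical helper: returns the sleep durations (in seconds) for the
/// first `attempts` consecutive transient failures, doubling each time and
/// capping at `MAX_BACKOFF_SECS`.
fn backoff_schedule(initial_secs: u64, attempts: usize) -> Vec<u64> {
    let mut schedule = Vec::with_capacity(attempts);
    let mut backoff = initial_secs;
    for _ in 0..attempts {
        schedule.push(backoff);
        // Same update rule as in the diff.
        backoff = std::cmp::min(backoff * 2, MAX_BACKOFF_SECS);
    }
    schedule
}
```

Calling `backoff_schedule(10, 7)` yields `[10, 20, 40, 80, 160, 300, 300]`, so before this commit a stop request arriving late in the retry sequence could wait out a full 300-second sleep.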
1 parent 386eb74 commit 6d91eab

1 file changed: +74 -11 lines changed

src/chain/bitcoind.rs

Lines changed: 74 additions & 11 deletions
@@ -147,6 +147,19 @@ impl BitcoindChainSource {
         const MAX_BACKOFF_SECS: u64 = 300;
 
         loop {
+            // Check for stop signal at the beginning of each iteration
+            tokio::select! {
+                biased;
+                _ = stop_sync_receiver.changed() => {
+                    log_trace!(
+                        self.logger,
+                        "Stopping initial chain sync.",
+                    );
+                    return;
+                }
+                _ = async {} => {}
+            }
+
             let channel_manager_best_block_hash = channel_manager.current_best_block().block_hash;
             let sweeper_best_block_hash = output_sweeper.current_best_block().block_hash;
             let onchain_wallet_best_block_hash =
@@ -226,7 +239,18 @@ impl BitcoindChainSource {
                             e,
                             backoff
                         );
-                        tokio::time::sleep(Duration::from_secs(backoff)).await;
+                        // Sleep with stop signal check to allow immediate shutdown
+                        tokio::select! {
+                            biased;
+                            _ = stop_sync_receiver.changed() => {
+                                log_trace!(
+                                    self.logger,
+                                    "Stopping initial chain sync during backoff.",
+                                );
+                                return;
+                            }
+                            _ = tokio::time::sleep(Duration::from_secs(backoff)) => {}
+                        }
                         backoff = std::cmp::min(backoff * 2, MAX_BACKOFF_SECS);
                     } else {
                         log_error!(
@@ -235,7 +259,18 @@ impl BitcoindChainSource {
                             e,
                             MAX_BACKOFF_SECS
                         );
-                        tokio::time::sleep(Duration::from_secs(MAX_BACKOFF_SECS)).await;
+                        // Sleep with stop signal check to allow immediate shutdown
+                        tokio::select! {
+                            biased;
+                            _ = stop_sync_receiver.changed() => {
+                                log_trace!(
+                                    self.logger,
+                                    "Stopping initial chain sync during backoff.",
+                                );
+                                return;
+                            }
+                            _ = tokio::time::sleep(Duration::from_secs(MAX_BACKOFF_SECS)) => {}
+                        }
                     }
                 },
             }
@@ -260,6 +295,8 @@ impl BitcoindChainSource {
         let mut last_best_block_hash = None;
         loop {
             tokio::select! {
+                biased;
+
                 _ = stop_sync_receiver.changed() => {
                     log_trace!(
                         self.logger,
@@ -268,17 +305,43 @@ impl BitcoindChainSource {
                     return;
                 }
                 _ = chain_polling_interval.tick() => {
-                    let _ = self.poll_and_update_listeners(
-                        Arc::clone(&channel_manager),
-                        Arc::clone(&chain_monitor),
-                        Arc::clone(&output_sweeper)
-                    ).await;
+                    tokio::select! {
+                        biased;
+                        _ = stop_sync_receiver.changed() => {
+                            log_trace!(
+                                self.logger,
+                                "Stopping polling for new chain data.",
+                            );
+                            return;
+                        }
+                        res = self.poll_and_update_listeners(
+                            Arc::clone(&channel_manager),
+                            Arc::clone(&chain_monitor),
+                            Arc::clone(&output_sweeper)
+                        ) => {
+                            let _ = res;
+                        }
+                    }
                 }
                 _ = fee_rate_update_interval.tick() => {
-                    if last_best_block_hash != Some(channel_manager.current_best_block().block_hash) {
-                        let update_res = self.update_fee_rate_estimates().await;
-                        if update_res.is_ok() {
-                            last_best_block_hash = Some(channel_manager.current_best_block().block_hash);
+                    tokio::select! {
+                        biased;
+                        _ = stop_sync_receiver.changed() => {
+                            log_trace!(
+                                self.logger,
+                                "Stopping polling for new chain data.",
+                            );
+                            return;
+                        }
+                        res = async {
+                            if last_best_block_hash != Some(channel_manager.current_best_block().block_hash) {
+                                let update_res = self.update_fee_rate_estimates().await;
+                                if update_res.is_ok() {
+                                    last_best_block_hash = Some(channel_manager.current_best_block().block_hash);
+                                }
+                            }
+                        } => {
+                            let _ = res;
                         }
                     }
                 }
