-
Notifications
You must be signed in to change notification settings - Fork 421
Parallelize ChannelMonitor loading from async KVStores
#4147
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Parallelize ChannelMonitor loading from async KVStores
#4147
Conversation
|
👋 Hi! This PR is now in draft status. |
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #4147 +/- ##
==========================================
- Coverage 89.33% 89.29% -0.04%
==========================================
Files 180 180
Lines 138055 138188 +133
Branches 138055 138188 +133
==========================================
+ Hits 123326 123393 +67
- Misses 12122 12191 +69
+ Partials 2607 2604 -3
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
4a773c9 to
31fbd16
Compare
|
This doesn't actually work. monitor loading is (at least for a semi-local disk) CPU-bound, so we really need to spawn each monitor load task rather than having one task. |
64412a5 to
2c807b9
Compare
2c807b9 to
0a2547a
Compare
|
This requires GATs which requires an MSRV bump (to 1.64), but we're planning on doing that soon anyway. |
0a2547a to
a4a778c
Compare
|
🔔 1st Reminder Hey @jkczyz! This PR has been waiting for your review. |
a4a778c to
3346cba
Compare
|
🔔 2nd Reminder Hey @jkczyz! This PR has been waiting for your review. |
|
🔔 3rd Reminder Hey @jkczyz! This PR has been waiting for your review. |
|
🔔 4th Reminder Hey @jkczyz! This PR has been waiting for your review. |
|
🔔 5th Reminder Hey @jkczyz! This PR has been waiting for your review. |
|
🔔 6th Reminder Hey @jkczyz! This PR has been waiting for your review. |
|
🔔 7th Reminder Hey @jkczyz! This PR has been waiting for your review. |
|
🔔 8th Reminder Hey @jkczyz! This PR has been waiting for your review. |
|
🔔 9th Reminder Hey @jkczyz! This PR has been waiting for your review. |
|
🔔 10th Reminder Hey @jkczyz! This PR has been waiting for your review. |
|
🔔 11th Reminder Hey @jkczyz! This PR has been waiting for your review. |
|
🔔 12th Reminder Hey @jkczyz! This PR has been waiting for your review. |
|
🔔 13th Reminder Hey @jkczyz! This PR has been waiting for your review. |
Now that it has the same MSRV as everything else in the workspace, it doesn't need to live on its own.
Now that our MSRV is above 1.68 we can use the `pin!` macro to avoid having to `Box` various futures, avoiding some allocations, especially in `lightning-net-tokio`, which happens in a tight loop.
Now that our MSRV is 1.75, we can return `impl Trait` from trait methods. Here we use this to clean up `KVStore` methods, dropping the `Pin<Box<dyn ...>>` we had to use to have trait methods return a concrete type. Sadly, there's two places where we can't drop a `Box::pin` until we switch to edition 2024.
Now that our MSRV is 1.75, we can return `impl Trait` from trait methods. Here we use this to clean up `lightning-block-sync` trait methods, dropping the `Pin<Box<dyn ...>>` we had to use to have trait methods return a concrete type.
Now that our MSRV is 1.75, we can return `impl Trait` from trait methods. Here we use this to clean up `lightning` crate trait methods, dropping the `Pin<Box<dyn ...>>`/`AsyncResult` we had to use to have trait methods return a concrete type.
Now that we have an MSRV that supports returning `impl Trait` in trait methods, we can use it to avoid the `Box<dyn ...>` we had spewed all over our BOLT 11 invoice serialization.
3346cba to
ba55b75
Compare
Reading `ChannelMonitor`s on startup is one of the slowest parts of
LDK initialization. Now that we have an async `KVStore`, there's no
need for that, we can simply paralellize their loading, which we do
here.
Sadly, because Rust futures are pretty unergonomic, we have to add
some `unsafe {}` here, but arguing its fine is relatively
straightforward.
ba55b75 to
5e45e0e
Compare
|
Rebased on #4175, which avoids some further conflicts. |
`tokio::spawn` can be use both to spawn a forever-running background task or to spawn a task which gets `poll`ed independently and eventually returns a result which the callsite wants. In LDK, we have only ever needed the first, and thus didn't bother defining a return type for `FutureSpawner::spawn`. However, in the next commit we'll start using `FutureSpawner` in a context where we actually do want the spawned future's result. Thus, here, we add a result output to `FutureSpawner::spawn`, mirroring the `tokio::spawn` API.
`MonitorUpdatingPersister::read_all_channel_monitors_with_updates` was made to do the IO operations in parallel in a previous commit, however in practice this doesn't provide material parallelism for large routing nodes. Because deserializing `ChannelMonitor`s is the bulk of the work (when IO operations are sufficiently fast), we end up blocked in single-threaded work nearly the entire time. Here, we add an alternative option - a new `read_all_channel_monitors_with_updates_parallel` method which uses the `FutureSpawner` to cause the deserialization operations to proceed in parallel.
When reading `ChannelMonitor`s from a `MonitorUpdatingPersister` on startup, we have to make sure to load any `ChannelMonitorUpdate`s and re-apply them as well. For users of async persistence who don't have any `ChannelMonitorUpdate`s (e.g. because they set `maximum_pending_updates` to 0 or, in the future, we avoid persisting updates for small `ChannelMonitor`s), this means two round-trips to the storage backend, one to load the `ChannelMonitor` and one to try to read the next `ChannelMonitorUpdate` only to have it fail. Instead, here, we use `KVStore::list` to fetch the list of stored `ChannelMonitorUpdate`s, which for async `KVStore` users allows us to parallelize the list of update fetching and the `ChannelMonitor` loading itself. Then we know exactly when to stop reading `ChannelMonitorUpdate`s, including reading none if there are none to read. This also avoids relying on `KVStore::read` correctly returning `NotFound` in order to correctly discover when to stop reading `ChannelMonitorUpdate`s.
When reading `ChannelMonitor`s from a `MonitorUpdatingPersister` on startup, we have to make sure to load any `ChannelMonitorUpdate`s and re-apply them as well. Now that we know which `ChannelMonitorUpdate`s to load from `list`ing the entries from the `KVStore` we can parallelize the reads themselves, which we do here. Now, loading all `ChannelMonitor`s from an async `KVStore` requires only three full RTTs - one to list the set of `ChannelMonitor`s, one to both fetch the `ChanelMonitor` and list the set of `ChannelMonitorUpdate`s, and one to fetch all the `ChannelMonitorUpdate`s (with the last one skipped when there are no `ChannelMonitorUpdate`s to read).
5e45e0e to
0503f6b
Compare
|
Marking draft until the parent is marged. |
Based on #4146 just cause I don't want to resolve conflicts.