-
Notifications
You must be signed in to change notification settings - Fork 112
subservers: fail LiT startup when integrated sub-server boot fails #1183
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,42 @@ | ||
| # Release Notes | ||
|
|
||
| - [Lightning Terminal](#lightning-terminal) | ||
| - [Bug Fixes](#bug-fixes) | ||
| - [Functional Changes/Additions](#functional-changesadditions) | ||
| - [Technical and Architectural Updates](#technical-and-architectural-updates) | ||
| - [Integrated Binary Updates](#integrated-binary-updates) | ||
| - [LND](#lnd) | ||
| - [Loop](#loop) | ||
| - [Pool](#pool) | ||
| - [Faraday](#faraday) | ||
| - [Taproot Assets](#taproot-assets) | ||
| - [Contributors](#contributors-alphabetical-order) | ||
|
|
||
| ## Lightning Terminal | ||
|
|
||
| ### Bug Fixes | ||
|
|
||
| ### Functional Changes/Additions | ||
|
|
||
| * [PR](https://github.com/lightninglabs/lightning-terminal/pull/1183): LiT now | ||
| fails fast if any integrated sub-server cannot start. Startup errors from | ||
| integrated sub-servers are propagated directly to LiT, which aborts launch | ||
| and records the failure status. | ||
|
|
||
| ### Technical and Architectural Updates | ||
|
|
||
| ## RPC Updates | ||
|
|
||
| ## Integrated Binary Updates | ||
|
|
||
| ### LND | ||
|
|
||
| ### Loop | ||
|
|
||
| ### Pool | ||
|
|
||
| ### Faraday | ||
|
|
||
| ### Taproot Assets | ||
|
|
||
| # Contributors (Alphabetical Order) |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -4,6 +4,7 @@ import ( | |
| "context" | ||
| "fmt" | ||
| "io/ioutil" | ||
| "strings" | ||
| "sync" | ||
| "time" | ||
|
|
||
|
|
@@ -104,13 +105,16 @@ func (s *Manager) GetServer(name string) (SubServer, bool) { | |
| } | ||
|
|
||
| // StartIntegratedServers starts all the manager's sub-servers that should be | ||
| // started in integrated mode. | ||
| // started in integrated mode. An error is returned if any integrated sub-server | ||
| // fails to start. | ||
| func (s *Manager) StartIntegratedServers(lndClient lnrpc.LightningClient, | ||
| lndGrpc *lndclient.GrpcLndServices, withMacaroonService bool) { | ||
| lndGrpc *lndclient.GrpcLndServices, withMacaroonService bool) error { | ||
|
|
||
| s.mu.Lock() | ||
| defer s.mu.Unlock() | ||
|
|
||
| var errs []string | ||
|
|
||
| for _, ss := range s.servers { | ||
| if ss.Remote() { | ||
| continue | ||
|
|
@@ -126,11 +130,19 @@ func (s *Manager) StartIntegratedServers(lndClient lnrpc.LightningClient, | |
| ) | ||
| if err != nil { | ||
| s.statusServer.SetErrored(ss.Name(), err.Error()) | ||
| errs = append(errs, fmt.Sprintf("%s: %v", ss.Name(), err)) | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think we want to return an error imminently here, and not proceed to start all sub-servers before we return the error. The motivation for that is that we want |
||
| continue | ||
| } | ||
|
|
||
| s.statusServer.SetRunning(ss.Name()) | ||
| } | ||
|
|
||
| if len(errs) > 0 { | ||
| return fmt.Errorf("failed starting integrated sub-server(s): %s", | ||
| strings.Join(errs, "; ")) | ||
| } | ||
|
|
||
| return nil | ||
| } | ||
|
|
||
| // ConnectRemoteSubServers creates connections to all the manager's sub-servers | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -768,9 +768,12 @@ func (g *LightningTerminal) start(ctx context.Context) error { | |
| // Both connection types are ready now, let's start our sub-servers if | ||
| // they should be started locally as an integrated service. | ||
| createDefaultMacaroons := !g.cfg.statelessInitMode | ||
| g.subServerMgr.StartIntegratedServers( | ||
| err = g.subServerMgr.StartIntegratedServers( | ||
| g.basicClient, g.lndClient, createDefaultMacaroons, | ||
| ) | ||
| if err != nil { | ||
| return fmt.Errorf("could not start integrated sub-servers: %w", err) | ||
| } | ||
|
Comment on lines
+774
to
+776
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Just returning an error here will currently actually not shutdown Instead we want to make sure |
||
|
|
||
| err = g.startInternalSubServers(ctx, !g.cfg.statelessInitMode) | ||
| if err != nil { | ||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We only want errors in the
tapdsub-server to cause the startup process to error, and not any sub-server. The motivation for why this should specifically targettapdright now, is that we've realized thattapdis critically coupled tolndin the scenario that the user has taproot-asset channels, in the sense thatlndcan't function iftapdregisters as the aux-unit, but then shuts down.Therefore, I suggest that you add a
mapof "critical" sub-server names as a param to theStartIntegratedServersfunction. If a sub-server then errors during the startup, you'll check if that sub-server name exists in the passedmap, and if so return an error.We actually originally designed this function specifically to not error the whole startup process if any of the integrated sub-servers fails to start. The motivation behind that is that we do not want bugs in one sub-server, to disable users from using the other sub-servers until a fix for the erroring sub-server has been released. I.e. a bug that causes
poolto error on startup, shouldn't disable people from performing loops until it's been fixed. Of course users can alawysdisablepoolin that scenario, but for UX purposes we do not want require users to understand that it'spoolthat's actually failing the startup process, and then additionally understand how theydisablepool, as this adds friction.