Commit 6f393b7

Merge pull request #5399 from input-output-hk/bolt12/p2p-docs
Added P2P configuration and BF docs
2 parents 408d8ae + 7c84ea0 commit 6f393b7

File tree

3 files changed

+253
-21
lines changed

README.rst

Lines changed: 5 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -93,6 +93,7 @@ The general synopsis is as follows:
9393
[--shelley-kes-key FILEPATH]
9494
[--shelley-vrf-key FILEPATH]
9595
[--shelley-operational-certificate FILEPATH]
96+
[--start-as-non-producing-node]
9697
[--host-addr IPV4-ADDRESS]
9798
[--host-ipv6-addr IPV6-ADDRESS]
9899
[--port PORT]
@@ -115,6 +116,10 @@ The general synopsis is as follows:
115116

116117
* ``--shelley-operational-certificate`` - Optional path to the Shelley operational certificate.
117118

119+
* ``--start-as-non-producing-node`` - Optional flag to disable block production on node
120+
start. If credential flags are passed, the node normally starts producing blocks; with
121+
this flag the node will only start producing blocks on SIGHUP (see `here <doc/reference/dynamic-block-forging.md>`_ for more details).
122+
118123
* ``--socket-path`` - Path to the socket file.
119124

120125
* ``--host-addr`` - Optionally specify your node's IPv4 address.

doc/getting-started/understanding-config-files.md

Lines changed: 213 additions & 21 deletions
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,6 @@
11
# Understanding your configuration files and how to use them
22

3-
#### The topology.json file
3+
## The topology.json file
44

55
Tells your node which nodes in the network it should talk to. A minimal version of this file looks like this:
66

@@ -21,23 +21,25 @@ Tells your node to which nodes in the network it should talk to. A minimal versi
2121

2222
Your __block-producing__ node must __ONLY__ talk to your __relay nodes__, and the relay nodes should talk to other relay nodes in the network. Go to https://explorer.cardano.org/relays/topology.json to find out IP addresses and ports of peers. The `topology.json` found at this link is updated once a week.
2323

24-
#### The P2P topology.json file
24+
## The P2P topology.json file
2525

2626
The P2P topology file specifies how to obtain the _root peers_ (or _bootstrapping
27-
peers_). These are root peers for the gossip which is a future enhancement of
28-
`ouroboros-network`.
29-
30-
* _local roots_: the node will try, in a best effort way, to have specified
31-
number of hot connections towards each local root peer group (by hot we mean
32-
connections taking active role in the consensus algorithm). Local roots
33-
should include local relays or local block producer node, all any other peers
34-
with which the node shall keep a connection;
27+
peers_).
28+
29+
* The term _local roots_ refers to a group of peer nodes with which a node will aim to
30+
maintain a specific number of active, or "hot" connections. These hot connections are
31+
those that play an active role in the consensus algorithm. Conversely, "warm"
32+
connections refer to those not yet actively participating in the consensus algorithm.
33+
34+
Local roots should comprise local relays or a local block producer node, and any other
35+
peers that the node needs to maintain a connection with. These connections are
36+
typically kept private.
3537
* _public roots_: additional bootstrapping nodes. They are either read from
3638
the configuration file directly, or from the chain. The configured ones
3739
will be used to pass a recent snapshot of peers needed before the node catches up
3840
with a recent enough chain to construct root peers by itself.
3941

40-
The node does not guarantee to have a connection with each public root or,
42+
The node does not guarantee to have a connection with each public root,
4143
unlike for local ones, but by being present in the set it gets a chance to have
4244
an outbound connection towards that peer.
4345

@@ -98,12 +100,31 @@ A minimal version of this file looks like this:
98100

99101
* Local roots groups shall be non-overlapping.
100102

103+
* The advertise parameter instructs a node about the acceptability of sharing its address
104+
through Peer Sharing (which we'll explain in more detail in a subsequent section). In
105+
essence, if a node has activated Peer Sharing, it can receive requests from other nodes
106+
seeking peers. However, it will only disclose those peers for which it has both local
107+
and remote permissions.
108+
109+
Local permission corresponds to the value of the advertise parameter. On the other
110+
hand, 'remote permission' is tied to the `PeerSharing` value associated with the
111+
remote address, which is ascertained after the initial handshake between nodes.
112+
113+
* The number of local roots should not exceed `TargetNumberOfKnownPeers`.
114+
If it does, it is clamped to that limit.
115+
101116
Your __block-producing__ node must __ONLY__ talk to your __relay nodes__, and the relay node should talk to other relay nodes in the network.
102117

103-
You __can__ tell the node that the topology configuration file changed by sending a SIGHUP
104-
signal to the `cardano-node` process, e.g. `pkill -HUP cardano-node`. After receiving the
105-
signal, `cardano-node` will re-read the file and restart all DNS resolution. Please
106-
**note** that this only applies to the topology configuration file!
118+
You have the option to notify the node of any changes to the topology configuration file
119+
by sending a SIGHUP signal to the `cardano-node` process. This can be done, for example,
120+
with the command `pkill -HUP cardano-node`. Upon receiving the signal, the `cardano-node`
121+
will re-read the configuration file and restart all DNS resolutions.
122+
123+
Please be aware that this procedure is specific to the topology configuration file, not
124+
the node configuration file. Additionally, the SIGHUP signal will prompt the system to
125+
re-read the block forging credentials file paths and attempt to fetch them to initiate
126+
block forging. If this process fails, block forging will be disabled. To re-enable block
127+
forging, ensure that the necessary files are present.
107128

108129
One can disable ledger peers by setting the `useLedgerAfterSlot` to a negative
109130
value.
@@ -119,7 +140,7 @@ node will connect to relays registered on the chain, and churn through them by
119140
randomly picking new peers (weighted by stake distribution) and forgetting 20%
120141
least performing ones.
121142

122-
#### The genesis.json file
143+
## The genesis.json file
123144

124145
The genesis file is generated with the `cardano-cli` by reading a `genesis.spec.json` file, which is out of scope for this document.
125146
But it is important because it is used to set:
@@ -237,23 +258,23 @@ Here is a brief description of each parameter. You can learn more in the [spec](
237258
| securityParam | Security parameter k |
238259

239260

240-
#### The config.json file
261+
## The config.json file
241262

242263
The default `config.json` file that we downloaded is shown below.
243264

244265
This file has __4__ sections that allow you to have full control on what your node does and how the information is presented.
245266

246267
__NOTE Due to how the config.json file is generated, fields on the real file are shown in a different (less coherent) order. Here we present them in a more structured way__
247268

248-
#### Basic Node Configuration.
269+
### Basic Node Configuration.
249270

250271
The first section covers the basic node configuration parameters. Make sure you have `TPraos` as the protocol, the correct path to the `mainnet-shelley-genesis.json` file, and `RequiresMagic` for use in a testnet.
251272

252273
"Protocol": "TPraos",
253274
"GenesisFile": "mainnet-shelley-genesis.json",
254275
"RequiresNetworkMagic": "RequiresMagic",
255276

256-
#### Update parameters
277+
### Update parameters
257278

258279
This protocol version number gets used by block producing nodes as part of the system for agreeing on and synchronising protocol updates. You just need to be aware of the latest version supported by the network. You don't need to change anything here.
259280

@@ -262,7 +283,7 @@ This protocol version number gets used by block producing nodes as part of the s
262283
"LastKnownBlockVersion-Minor": 0,
263284

264285

265-
#### Tracing
286+
### Tracing
266287

267288
`Tracers` tell your node what information you are interested in when logging. They act like switches that you can turn ON or OFF according to the type and quantity of information that you are interested in. This provides fairly coarse-grained control, but it is relatively efficient at filtering out unwanted trace output.
268289

@@ -347,7 +368,7 @@ Also enable the EKG backend if you want to use the EKG or Prometheus monitoring
347368
},
348369
```
349370

350-
#### Fine grained logging control
371+
### Fine grained logging control
351372

352373
It is also possible to have more fine grained control over filtering of trace output, and to match and route trace output to particular backends. This is less efficient than the coarse trace filters above but provides much more precise control. `options`:
353374

@@ -370,3 +391,174 @@ It is also possible to have more fine grained control over filtering of trace ou
370391
}
371392
}
372393
```
394+
395+
### Peer-to-Peer Parameters & Tracers
396+
397+
To run a node in P2P mode set `EnableP2P` to `true` (_the default is `false`_) in the
398+
configuration file. You will also need to specify the topology in a new format which is
399+
described above.
400+
401+
There are a few new tracers and configuration options which you can set (listed below by
402+
component):
403+
404+
#### Outbound Governor
405+
406+
The outbound governor is responsible for satisfying targets of root peers, known (_cold_,
407+
_warm_ and _hot_), established (_warm_ & _hot_) and active peers (synonym for _hot_ peers)
408+
and local root peers. The primary way to configure them is by setting the following
409+
options:
410+
411+
* `TargetNumberOfRootPeers` (_default value: `100`_) - a minimal number of root peers
412+
(unlike other targets this one is one sided, e.g. a node might have more root peers)
413+
* `TargetNumberOfKnownPeers` (_default value: `100`_) - a target of known peers (must be
414+
greater than or equal to `TargetNumberOfRootPeers`)
415+
* `TargetNumberOfEstablishedPeers` (_default value: `50`_) - a target of all established
416+
peers (including local roots, ledger peers)
417+
* `TargetNumberOfActivePeers` (_default value: `20`_) - a target for _hot_ peers which
418+
engage in the consensus protocol
419+
420+
Let us note two more targets. In the topology file you may include local root peers.
421+
This is a list of groups of peers, each group comes with its own valency. The outbound
422+
governor will maintain a connection with every local root peer, and will enforce that at
423+
least the specified number of them (the valency) are _hot_. Thus the
424+
`TargetNumberOfKnownPeers` , `TargetNumberOfEstablishedPeers` and
425+
`TargetNumberOfActivePeers` must be large enough to accommodate local root peers.
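
The constraints described above can be sketched as a small consistency check. This is a minimal illustration in Python, not part of `cardano-node`: the function name `check_targets` and the `(group_size, valency)` representation of local root groups are assumptions made for this example, and the rules encode one reading of the text (every local root needs a known and established slot, and each group's valency needs hot slots).

```python
# Default targets as quoted in the list above.
DEFAULT_TARGETS = {
    "TargetNumberOfRootPeers": 100,
    "TargetNumberOfKnownPeers": 100,
    "TargetNumberOfEstablishedPeers": 50,
    "TargetNumberOfActivePeers": 20,
}


def check_targets(targets, local_root_groups):
    """Return a list of human-readable problems; an empty list means none found.

    `local_root_groups` is a hypothetical list of (group_size, valency) pairs,
    one pair per local root group in the topology file.
    """
    problems = []
    total_local = sum(size for size, _ in local_root_groups)
    total_hot = sum(valency for _, valency in local_root_groups)

    # Known peers must be at least as many as root peers.
    if targets["TargetNumberOfKnownPeers"] < targets["TargetNumberOfRootPeers"]:
        problems.append(
            "TargetNumberOfKnownPeers is smaller than TargetNumberOfRootPeers"
        )

    # A connection is kept with every local root, so each one needs a
    # known slot and an established slot.
    for name in ("TargetNumberOfKnownPeers", "TargetNumberOfEstablishedPeers"):
        if targets[name] < total_local:
            problems.append(f"{name} cannot accommodate {total_local} local roots")

    # At least `valency` peers of each group must be hot.
    if targets["TargetNumberOfActivePeers"] < total_hot:
        problems.append(
            f"TargetNumberOfActivePeers cannot accommodate {total_hot} hot local roots"
        )
    return problems
```

For example, `check_targets(DEFAULT_TARGETS, [(2, 1), (3, 2)])` reports no problems, because five local roots and three required hot connections fit comfortably within the defaults.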
426+
427+
The following traces can be enabled:
428+
429+
* `TracePeerSelection` (_by default on_) - tracks selection of upstream peers done by the
430+
_outbound-governor_. **Warm peers** are ones with which we have an open connection but
431+
don't engage in consensus protocol, **hot peers** are peers which engage in consensus
432+
protocol (via `chain-sync`, `block-fetch` and `tx-submission` mini-protocols), **cold
433+
peers** are ones which we know about but the node doesn't have an established
434+
connection. Note that the notions of _hot_, _warm_ and _cold_ are only related to usage
435+
of initiator sides of mini-protocols in a connection (which can be either inbound or
436+
outbound).
437+
* `TracePeerSelectionCounters` (_by default on_) - traces how many cold / warm / hot /
438+
local root peers the node has, it's also available via ekg.
439+
* `TracePeerStateActions` (_by default on_) - includes traces from a component which
440+
executes peer promotion / demotions between cold / warm & hot states.
441+
* `TracePublicRootPeers` (_by default off_) - traces information about root / ledger peers
442+
(e.g. ip addresses or dns names of ledger peers, dns resolution)
443+
* `DebugPeerSelectionInitiator` and `DebugPeerSelectionInitiatorResponder` (_by default
444+
off_) - debug tracers which log information about the current state of the _outbound
445+
governor_.
446+
447+
At this point [haddock
448+
documentation](https://input-output-hk.github.io/ouroboros-network/ouroboros-network/Ouroboros-Network-PeerSelection-Governor.html)
449+
of the outbound governor is available.
450+
451+
#### Peer Sharing
452+
453+
Peer Sharing is a novel feature that provides an additional method for the Outbound
454+
Governor to reach its targets for known peers. With Peer Sharing, the node can request
455+
peer information from other nodes with which it has an established connection.
456+
457+
**IMPORTANT:** _Peer Sharing_ is an experimental feature that is turned off by default.
458+
Please be aware that until the availability of genesis & eclipse evasion, this feature may
459+
leave a node vulnerable to eclipse attacks.
460+
461+
The main method for configuring Peer Sharing involves setting the following option:
462+
463+
- `PeerSharing` (default value: `NoPeerSharing`) - This option can take 3 possible values:
464+
* `NoPeerSharing`: Peer Sharing is disabled, which means the node won't request peer
465+
information from any other node, and will not respond to such requests from others
466+
(the mini-protocol won't even start);
467+
* `PeerSharingPrivate`: Peer Sharing is enabled, meaning the node will query other
468+
nodes for peers. However, during the handshake process, it will inform other nodes
469+
not to share its address.
470+
* `PeerSharingPublic`: Peer Sharing is enabled and the node will notify other nodes
471+
that it is permissible to share its address.
472+
473+
The `PeerSharing` flag interacts with `PeerAdvertise` (`advertise` flag in the topology
474+
file) values as follows:
475+
476+
`AdvertisePeer` (`advertise: true`) is local to the configuration of a specific node. A
477+
node might be willing to share those peers it has set as `PeerAdvertise`. Conversely,
478+
`PeerSharing` is about whether the peer (itself) is willing to participate in
479+
`PeerSharing` or allows others to share its address.
480+
481+
`PeerSharing` takes precedence over `AdvertisePeer`. Consider the following example:
482+
483+
A Block Producer (BP) has the `NoPeerSharing` flag value (which means it won't participate
484+
in Peer Sharing or run the mini-protocol). A Relay node has the BP set as a local peer
485+
configured as `AdvertisePeer` (likely a misconfiguration). When the handshake between the
486+
BP and the Relay occurs, the Relay will see that the BP doesn't want to participate in
487+
Peer Sharing. As a result, it won't engage in peer sharing with it or share its details
488+
with others.
489+
490+
The `combinePeerInformation` function determines the sharing interaction semantics between
491+
the two flags. Please take a look to better understand how the two values combine.
492+
493+
```haskell
494+
-- Combine a 'PeerSharing' value and a 'PeerAdvertise' value into a
495+
-- resulting 'PeerSharing' that can be used to decide if we should
496+
-- share or not the given Peer. According to the following rules:
497+
--
498+
-- - If no PeerSharing value is known then there's nothing we can assess
499+
-- - If a peer is not participating in Peer Sharing ignore all other information
500+
-- - If a peer said it wasn't okay to share its address, respect that no matter what.
501+
-- - If a peer was privately configured with DoNotAdvertisePeer respect that no matter
502+
-- what.
503+
--
504+
combinePeerInformation :: PeerSharing -> PeerAdvertise -> PeerSharing
505+
combinePeerInformation NoPeerSharing _ = NoPeerSharing
506+
combinePeerInformation PeerSharingPrivate _ = PeerSharingPrivate
507+
combinePeerInformation PeerSharingPublic DoNotAdvertisePeer = PeerSharingPrivate
508+
combinePeerInformation _ _ = PeerSharingPublic
509+
```
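
To see every combination at a glance, the same rules can be transliterated into Python and enumerated. This is only an illustrative sketch that mirrors the Haskell clauses above; the Python names (`combine_peer_information`, `truth_table`, the enum members) are introduced for this example, and `AdvertisePeer` follows the name used in the prose rather than a confirmed constructor name.

```python
from enum import Enum
from itertools import product


class PeerSharing(Enum):
    NO_PEER_SHARING = "NoPeerSharing"
    PRIVATE = "PeerSharingPrivate"
    PUBLIC = "PeerSharingPublic"


class PeerAdvertise(Enum):
    DO_NOT_ADVERTISE = "DoNotAdvertisePeer"
    ADVERTISE = "AdvertisePeer"  # name as used in the prose above


def combine_peer_information(sharing, advertise):
    """Mirror the four Haskell clauses, in the same order."""
    if sharing is PeerSharing.NO_PEER_SHARING:
        return PeerSharing.NO_PEER_SHARING
    if sharing is PeerSharing.PRIVATE:
        return PeerSharing.PRIVATE
    if advertise is PeerAdvertise.DO_NOT_ADVERTISE:
        return PeerSharing.PRIVATE
    return PeerSharing.PUBLIC


def truth_table():
    """Map every (PeerSharing, PeerAdvertise) pair to the combined value."""
    return {
        (s.value, a.value): combine_peer_information(s, a).value
        for s, a in product(PeerSharing, PeerAdvertise)
    }


if __name__ == "__main__":
    for (sharing, advertise), result in truth_table().items():
        print(f"{sharing:20} {advertise:20} -> {result}")
```

Of the six combinations, only a peer that announced `PeerSharingPublic` and is locally configured to be advertised ends up shareable, which matches the statement that `PeerSharing` takes precedence.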
510+
511+
#### Inbound Governor
512+
513+
The inbound governor maintains the responder side of all mini-protocols. Unlike the
514+
outbound governor, it is a purely responsive component which reacts to actions of the remote
515+
peer (its outbound governor).
516+
517+
* `TraceInboundGovernor` (_by default on_) - traces information about inbound connection,
518+
e.g. we track if the remote side is using our node as _warm_ or _hot peer_, traces when
519+
we restart a responder.
520+
* `TraceInboundGovernorCounters` (_by default on_) - traces number of peers which use the
521+
node as `cold`, `warm` or `hot` (which we call `remote cold`, `remote warm` or `remote
522+
hot`). Note that we only know if a peer is in the remote cold state if we connected to
523+
that peer and it's not using the connection. This information is also available via
524+
ekg.
525+
* `TraceInboundGovernorTransitions` (_by default on_) - a debug tracer which traces
526+
transitions between remote cold, remote warm and remote hot states.
527+
528+
The inbound governor is documented in [The Shelley Networking
529+
Protocol](https://input-output-hk.github.io/ouroboros-network/pdfs/network-spec) (section
530+
4.5).
531+
532+
#### Connection Manager
533+
534+
The connection manager tracks the state of all TCP connections, and enforces various timeouts,
535+
e.g. when the connection is not used by either of the sides. The following traces are
536+
available:
537+
538+
* `TraceConnectionManager` (_by default on_) - traces information about new inbound or
539+
outbound connection, connection errors.
540+
* `TraceConnectionManagerCounters` (_by default on_) - traces the number of inbound,
541+
outbound, duplex (connections which negotiated P2P mode and can use a connection in full
542+
duplex mode), full duplex (connections which run mini-protocols in both directions, e.g.
543+
at least _warm_ and _remote warm_ at the same time), unidirectional connections
544+
(connections with non p2p nodes, or p2p nodes which configured themselves as initiator
545+
only nodes).
546+
* `TraceConnectionManagerTransitions` (_by default on_) - a low-level tracer which traces
547+
connection state changes in the connection manager state machine.
548+
549+
The connection manager is documented in [The Shelley Networking
550+
Protocol](https://input-output-hk.github.io/ouroboros-network/pdfs/network-spec) (section
551+
4).
552+
553+
#### Ledger Peers
554+
555+
Ledger peers are the relays registered on the chain. Currently we use the square of the stake
556+
distribution to randomly pick new ledger peers. You can enable `TraceLedgerPeers` (_by
557+
default off_) to log actions taken by this component.
558+
559+
#### Server
560+
561+
The accept loop. You can enable `TraceServer` to log its actions or errors it encounters
562+
(_by default it is off_, however we suggest turning it on).
563+
564+
**Please note that this version contains no breaking changes**
Lines changed: 35 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,35 @@
1+
# Peer-to-Peer Block Production and Block Forging Configuration
2+
3+
For redundancy purposes, Stake Pool Operators (SPOs) often operate backup block production
4+
nodes. However, with the introduction of peer-to-peer (P2P) nodes, the previous approach
5+
of using firewall rules to prevent relays from connecting to the backup node, and thus
6+
stop duplicate blocks from being propagated, will no longer be effective.
7+
8+
In the P2P environment, relays can repurpose inbound connections from the block producer,
9+
leading to potential complications. To address this issue, we've introduced a way for a
10+
block producing node to be started (and stopped) without immediately producing blocks. It
11+
will only begin (or stop) block production when it receives a SIGHUP signal.
12+
13+
## Enabling and Disabling Block Forging
14+
15+
Block Forging can be toggled on and off using the SIGHUP signal. Sending such a signal
16+
triggers the node to read the block forging credential files. Note that this will also
17+
trigger the re-reading of the topology configuration file, so connections might be lost.
18+
19+
As these credential files are provided through CLI flags, they cannot be removed without
20+
restarting the node. To disable block forging, you need to move, rename, or delete the
21+
file at the specified path (for the credential flags), and then send the SIGHUP signal.
22+
The code will then recognize that the specified files are missing and disable block
23+
forging, recording the appropriate log messages.
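
The disable procedure described above (move the credential files aside, then signal the node) could be scripted roughly as follows. This is a hedged sketch for POSIX systems: the helper name `disable_block_forging`, the `.disabled` suffix and the injectable `send` parameter are assumptions made for illustration and are not part of `cardano-node` or `cardano-cli`.

```python
import os
import signal
from pathlib import Path


def disable_block_forging(pid, credential_paths, suffix=".disabled", send=os.kill):
    """Rename the credential files out of the way, then send SIGHUP to `pid`.

    Returns the list of new paths. Files that do not exist are skipped.
    `send` defaults to os.kill and can be replaced, e.g. for a dry run.
    """
    moved = []
    for path in map(Path, credential_paths):
        if path.exists():
            target = path.with_name(path.name + suffix)
            path.rename(target)  # the node will no longer find the file
            moved.append(target)
    # On SIGHUP the node re-reads the credential paths, finds the files
    # missing and disables block forging, as described above.
    send(pid, signal.SIGHUP)
    return moved
```

Renaming rather than deleting keeps the credentials available, so forging can be re-enabled later by restoring the original file names and sending SIGHUP again. Keep in mind that the same signal also causes the topology file to be re-read.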
24+
25+
## Starting as a Non-Producing Node
26+
27+
If you wish to start a block producing node (i.e., passing the credentials in the
28+
respective flags) without it acting as a block producer immediately, you can use the
29+
`--start-as-non-producing-node` flag. This will run the node with credentials as a
30+
standard node. However, upon receiving the SIGHUP signal, it will read the credential
31+
files and start block forging.
32+
33+
This configuration allows for safer management of block production in a P2P environment,
34+
reducing the risk of duplicate blocks and improving overall network stability.
35+
