# How to run the Leios October demo

See https://github.com/IntersectMBO/ouroboros-consensus/issues/1701 for context.

**INFO**.
If you don't see any data in the 'Extracted and Merged Data Summary' table, then check the log files in the run's temporary directory.
This is where you might see messages about, eg, the missing `genesis-*.json` files, bad syntax in the `demoSchedule.json` file, etc.

# Details about the demo components

## The topology

For this first iteration, the demo topology is a simple linear chain.

```mermaid
flowchart TD
    MockedUpstreamPeer --> Node0 --> MockedDownstreamPeer
```

**INFO**.
In this iteration of the demo, the mocked downstream peer (see section below) is simply another node, ie Node1.

## The Praos traffic and Leios traffic

In this iteration of the demo, the Praos data and traffic are very simple.

- The Praos data is a simple chain provided by the Performance&Tracing team.
- The mocked upstream peer serves each Praos block when the mocked wall-clock reaches the onset of its slot.
- The Leios data is ten 12.5 megabyte EBs.
  They use the minimal number of txs necessary to accumulate 12.5 megabytes, which minimizes the CPU&heap overhead of the patched-in Leios logic, since this iteration of the demo is primarily intended to focus on networking.
- The mocked upstream peer serves those EBs just prior to the onset of one of the Praos blocks' slots, akin to a (relatively minor) ATK-LeiosProtocolBurst attack.
  Thus, the patched nodes are under significant Leios load when that Praos block begins diffusing.
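To make the burst concrete, here is a small sketch of the offer timing; the slot length, target slot, and lead time are illustrative assumptions, not values from the demo's actual schedule.

```python
# Sketch (illustrative numbers): offer all ten EBs in the fraction of a
# slot just before the onset of the targeted Praos block's slot, so the
# Leios load peaks exactly as that block starts diffusing.
def burst_offer_slots(target_praos_slot, n_ebs=10, lead=0.1):
    # Spread the offers across the last `lead` of a slot before the onset.
    return [target_praos_slot - lead * (i + 1) / n_ebs for i in range(n_ebs)]

offers = burst_offer_slots(target_praos_slot=100)
assert all(99.8 < s < 100.0 for s in offers)  # all just prior to onset
```
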
## The demo tool

The `cabal run exe:leiosdemo202510 -- generate ...` command generates a SQLite database with the following schema.

```sql
CREATE TABLE ebPoints (
    ebSlot      INTEGER NOT NULL,
    ebHashBytes BLOB    NOT NULL,
    ebId        INTEGER NOT NULL,
    PRIMARY KEY (ebSlot, ebHashBytes)
) WITHOUT ROWID;
CREATE TABLE ebTxs (
    ebId        INTEGER NOT NULL,  -- foreign key ebPoints.ebId
    txOffset    INTEGER NOT NULL,
    txHashBytes BLOB    NOT NULL,  -- raw bytes
    txBytesSize INTEGER NOT NULL,
    txBytes     BLOB,              -- valid CBOR
    PRIMARY KEY (ebId, txOffset)
) WITHOUT ROWID;
```
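The schema can be exercised directly, eg with Python's built-in `sqlite3` module; the EB point and tx rows inserted below are made-up illustrative values.

```python
import sqlite3

# Create the demo tool's schema in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ebPoints (
    ebSlot      INTEGER NOT NULL,
    ebHashBytes BLOB    NOT NULL,
    ebId        INTEGER NOT NULL,
    PRIMARY KEY (ebSlot, ebHashBytes)
) WITHOUT ROWID;
CREATE TABLE ebTxs (
    ebId        INTEGER NOT NULL,  -- foreign key ebPoints.ebId
    txOffset    INTEGER NOT NULL,
    txHashBytes BLOB    NOT NULL,  -- raw bytes
    txBytesSize INTEGER NOT NULL,
    txBytes     BLOB,              -- valid CBOR
    PRIMARY KEY (ebId, txOffset)
) WITHOUT ROWID;
""")

# One EB at slot 42 (ebId 0) containing two txs.
conn.execute("INSERT INTO ebPoints VALUES (?, ?, ?)", (42, b"\x01" * 32, 0))
conn.executemany(
    "INSERT INTO ebTxs VALUES (?, ?, ?, ?, ?)",
    [(0, 0, b"\xaa" * 32, 100, b"\x00" * 100),
     (0, 1, b"\xbb" * 32, 200, b"\x00" * 200)],
)

# Read an EB's txs back in txOffset order, as a single-pass reader would.
rows = conn.execute(
    "SELECT txOffset, txBytesSize FROM ebTxs WHERE ebId = ? ORDER BY txOffset",
    (0,),
).fetchall()
print(rows)  # [(0, 100), (1, 200)]
```
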
The contents of the generated database are determined by the given `manifest.json` file.
For now, see the `demoManifest.json` file for the primary schema: each "`txRecipe`" is simply the byte size of the transaction.

The `generate` subcommand also generates a default `schedule.json`.
Each EB will have two array elements in the schedule.
The first number in an array element is a fractional slot, which determines when the mocked upstream peer will offer the payload.
The rest of the array element is `MsgLeiosBlockOffer` if the EB's byte size is listed or `MsgLeiosBlockTxsOffer` if `null` is listed.
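For illustration, a sketch of interpreting such schedule entries; the concrete JSON literal below is a hypothetical example of the layout just described, not an excerpt from a generated `schedule.json`.

```python
import json

# Hypothetical schedule fragment: one EB's pair of entries. A listed byte
# size means MsgLeiosBlockOffer; a null means MsgLeiosBlockTxsOffer.
schedule = json.loads('[[10.25, 12500000], [10.75, null]]')

def classify(entry):
    # First number is the fractional slot at which the offer is sent.
    slot, size = entry[0], entry[1]
    msg = "MsgLeiosBlockOffer" if size is not None else "MsgLeiosBlockTxsOffer"
    return slot, msg

print([classify(e) for e in schedule])
# [(10.25, 'MsgLeiosBlockOffer'), (10.75, 'MsgLeiosBlockTxsOffer')]
```
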
The secondary schema of the manifest allows for EBs to overlap (which isn't necessary for this demo, despite the patched node fully supporting it).
Overlap is created by an alternative "`txRecipe`", an object `{"share": "XYZ", "startIncl": 90, "stopExcl": 105}` where `"nickname": "XYZ"` was included in a preceding _source_ EB recipe.
The `"startIncl"` and `"stopExcl"` are inclusive and exclusive indices into the source EB (aka a left-closed right-open interval); `"stopExcl"` is optional and defaults to the length of the source EB.
With this `"share"` syntax, it is possible for an EB to include the same tx multiple times.
That would not be a well-formed EB, and the prototype's behavior in response to such an EB is undefined---it's fine for the prototype to simply assume all the Leios EBs and txs in their closures are well-formed.
(TODO check for this one, since it's easy to check for---just in the patched node itself, or also in `generate`?)
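A sketch of how a `"share"` recipe resolves, plus the duplicate-tx well-formedness check the TODO contemplates; the recipe handling and EB representation are illustrative, not the `generate` implementation.

```python
# Resolve a txRecipe against previously-defined source EBs.
def resolve_recipe(recipe, source_ebs):
    if isinstance(recipe, int):               # primary schema: a fresh tx of that byte size
        return [("fresh", recipe)]
    source = source_ebs[recipe["share"]]      # EB registered under "nickname"
    start = recipe["startIncl"]
    stop = recipe.get("stopExcl", len(source))  # defaults to the source EB's length
    return source[start:stop]                 # left-closed, right-open interval

def well_formed(txs):
    # An EB must not include the same tx twice.
    return len(txs) == len(set(txs))

# Source EB "XYZ" with 120 (mock) txs; share txs 90..104 into a new EB.
source_ebs = {"XYZ": [("tx", i) for i in range(120)]}
shared = resolve_recipe({"share": "XYZ", "startIncl": 90, "stopExcl": 105}, source_ebs)
print(len(shared))                  # 15
print(well_formed(shared))          # True
print(well_formed(shared + shared)) # False: same txs repeated
```
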
## The mocked upstream peer

The mocked upstream peer is a patched variant of `immdb-server`.

- It runs an incomplete variant of LeiosNotify and LeiosFetch: just EBs and EB closures, nothing else (no EB announcements, no votes, no range requests).
- It serves the EBs present in the given `--leios-db`; it sends Leios notifications offering the data according to the given `--leios-schedule`.
  See the demo tool section above for how to generate those files.
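The schedule-driven behavior can be sketched as follows; the `(slot, message)` pairs are hypothetical stand-ins for `--leios-schedule` entries, and a real peer would sleep until each fractional slot rather than filter a snapshot.

```python
# Sketch: emit the notifications whose fractional slot the mocked
# wall-clock has reached, in schedule order.
def due_notifications(schedule, now_slot):
    return [msg for slot, msg in sorted(schedule) if slot <= now_slot]

schedule = [(1.5, "MsgLeiosBlockOffer"), (0.5, "MsgLeiosBlockTxsOffer"),
            (3.0, "MsgLeiosBlockOffer")]
print(due_notifications(schedule, now_slot=2.0))
# ['MsgLeiosBlockTxsOffer', 'MsgLeiosBlockOffer']
```
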
## The patched node/node-under-test

The patched node is a patched variant of `cardano-node`.
All of the material changes were made in the `ouroboros-consensus` repo; the `cardano-node` changes are merely for integration.

- It runs the same incomplete variant of LeiosNotify and LeiosFetch as the mocked upstream peer.
- The Leios fetch request logic is a fully fledged first draft, with four primary shortcomings.
  - It only handles EBs and EB closures, not votes and not range requests.
  - It retains a number of heap objects in proportion with the number of txs in EBs it has acquired.
    The real node---and so subsequent iterations of this prototype---must instead keep that data on disk.
    This first draft was intended to do so, but we struggled to invent the fetch logic algorithm with the constraint that some of its state was on disk; that's currently presumed to be possible, but has been deferred to a later iteration of the prototype.
  - It never discards any information.
    The real node---and so subsequent iterations of this prototype---must instead discard EBs and EB closures once they're old enough, unless they are needed for the immutable chain.
  - Once it decides to fetch a set of txs from an upstream peer for the sake of some EB closure, it does not necessarily compose those into an optimal set of requests for that peer.
    We had not identified the potential for an optimizing algorithm here until writing this first prototype, so it just does something straightforward and naive for now (which might be sufficient even for the real node---we'll have to investigate later).
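The kind of straightforward request composition meant above can be sketched as a greedy packer; the max-bytes limit and data shapes are illustrative, not the prototype's actual logic.

```python
# Illustrative magic number: cap on the payload bytes named by one request.
MAX_REQUEST_BYTES = 65_536

def compose_requests(wanted_txs):
    # wanted_txs: list of (tx_id, byte_size) pairs wanted from one peer.
    # Greedily pack them into batches that respect MAX_REQUEST_BYTES.
    requests, batch, batch_bytes = [], [], 0
    for tx_id, size in wanted_txs:
        if batch and batch_bytes + size > MAX_REQUEST_BYTES:
            requests.append(batch)
            batch, batch_bytes = [], 0
        batch.append(tx_id)
        batch_bytes += size
    if batch:
        requests.append(batch)
    return requests

reqs = compose_requests([(i, 30_000) for i in range(5)])
print([len(r) for r in reqs])  # [2, 2, 1]
```
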
There are no other changes.
In particular, that means the `ouroboros-network` mux does not deprioritize Leios traffic.
That change is an example of what this first prototype might demonstrate the need for.
There are many such changes, from small to large.
Some examples include the following.

- The prototype uses SQLite3 with entirely default settings.
  Maybe Write-Ahead Log mode would be much preferable, etc.
- The prototype uses a mutex to completely isolate every SQLite3 invocation---that's probably excessive, but was useful for some debugging during initial development (see the Engineering Notes appendix).
- The prototype chooses several _magic numbers_ for resource utilization limits (eg max bytes per request, max outstanding bytes per peer, fetch decision logic rate-limiting, txCache disk-bandwidth rate-limiting, etc).
  These all ultimately need to be tuned for the intended behaviors on `mainnet`.
- The prototype does not deduplicate the storage of EBs' closures when they share txs.
  This decision makes the LeiosFetch server a trivial single pass instead of a join.
  However, it "wastes" disk space and disk bandwidth.
  It's left to future work to decide whether that's a worthwhile trade-off.
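For example, switching SQLite from its default rollback journal to Write-Ahead Logging is a one-line PRAGMA (a sketch against a throwaway file; whether WAL actually helps the prototype's access patterns is an open question).

```python
import os
import sqlite3
import tempfile

# WAL mode requires a file-backed database, so use a temporary one.
with tempfile.TemporaryDirectory() as tmp:
    conn = sqlite3.connect(os.path.join(tmp, "sketch.db"))
    # Returns the journal mode now in effect.
    mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    print(mode)  # wal
    conn.close()
```
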
## The mocked downstream node

For simplicity, this is simply another instance of the patched node.
In the future, it could be comparatively lightweight and moreover could replay an arbitrary schedule of downstream requests, dual to the mocked upstream peer's arbitrary schedule of upstream notifications.

# Appendix: Engineering Notes

This section summarizes some lessons learned during the development of this prototype.

- Hypothesis: A SQLite connection will continue to hold SQLite's internal EXCLUSIVE lock _even after the transaction is COMMITed_ when the write transaction involved a prepared statement that was accidentally not finalized.
  That hypothesis was inferred from a painstaking debugging session, but I haven't yet confirmed it in isolation.
  The bugfix unsurprisingly amounted to using `bracket` for all prepare/finalize pairs and all BEGIN/COMMIT pairs; thankfully our DB patterns seem to accommodate such bracketing.
- The SQLite query plan optimizer might need more information in order to be reliable.
  Therefore at least one join (the one that copies out of `txCache` for the EbTxs identified in an in-memory table) was replaced with application-level iteration.
  It's not yet clear whether a one-time ANALYZE call might suffice, for example.
  Even if it did, it's also not yet clear how much bandwidth usage/latency/jitter/etc might be reduced.
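Both notes above can be transliterated into a small sketch: a `bracket`-style wrapper that pairs every BEGIN with a guaranteed COMMIT/ROLLBACK, and application-level iteration of primary-key point queries in place of the planner-dependent join. The `txCache` layout and row contents here are illustrative.

```python
import sqlite3
from contextlib import contextmanager

@contextmanager
def transaction(conn):
    # Bracket BEGIN with COMMIT/ROLLBACK no matter how the body exits.
    conn.execute("BEGIN")
    try:
        yield
        conn.execute("COMMIT")
    except BaseException:
        conn.execute("ROLLBACK")
        raise

conn = sqlite3.connect(":memory:", isolation_level=None)  # manual transactions
conn.execute("CREATE TABLE txCache (txHashBytes BLOB PRIMARY KEY, txBytes BLOB)")
with transaction(conn):
    conn.executemany("INSERT INTO txCache VALUES (?, ?)",
                     [(bytes([i]) * 32, bytes(10 * i)) for i in range(1, 4)])

# Application-level iteration: one point query per wanted tx hash, each a
# guaranteed primary-key lookup, instead of joining against an in-memory table.
wanted = [bytes([1]) * 32, bytes([3]) * 32]  # hashes named by some EB
copied = [conn.execute("SELECT txBytes FROM txCache WHERE txHashBytes = ?",
                       (h,)).fetchone()[0] for h in wanted]
print([len(b) for b in copied])  # [10, 30]
```
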