
@kevaundray

Issue Addressed

Which issue # does this PR address?

Proposed Changes

Please list or describe the changes introduced by this PR.

Additional Info

Please provide any additional information. For example, future considerations
or information useful for reviewers.

chong-he and others added 29 commits November 12, 2025 05:26
* sigp#7974


  Display `0` if optimistic sync and `1` for normal sync


Co-Authored-By: Tan Chee Keong <tanck@sigmaprime.io>

Co-Authored-By: chonghe <44791194+chong-he@users.noreply.github.com>
* sigp#7829


  


Co-Authored-By: Tan Chee Keong <tanck@sigmaprime.io>

Co-Authored-By: chonghe <44791194+chong-he@users.noreply.github.com>
Co-Authored-By: antondlr <anton@sigmaprime.io>
Fix an issue detected by @jimmygchen that occurs when checkpoint sync is aborted midway and then later restarted.

The characteristic error is something like:

> Nov 13 00:51:35.832 ERROR Database write failed                         error: Hdiff(LessThanStart(Slot(1728288), Slot(1728320))), action: "reverting blob DB changes"
> Nov 13 00:51:35.833 WARN  Hot DB pruning failed                         error: DBError(HotColdDBError(Rollback))

This issue has existed since v7.1.0.


  Delete snapshot/diff in the case where `hot_storage_strategy` fails.
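A minimal sketch of the cleanup-on-error shape, using stand-in types rather than Lighthouse's real storage API (`hot_storage_strategy` below is a placeholder):

```rust
use std::collections::BTreeMap;

type Slot = u64;

/// Stand-in for the hot DB: just a map of slot -> serialized diff bytes.
struct HotDb {
    diffs: BTreeMap<Slot, Vec<u8>>,
}

/// Placeholder for the real strategy, which can fail midway
/// (e.g. after an aborted and restarted checkpoint sync).
fn hot_storage_strategy(_db: &HotDb, _slot: Slot) -> Result<(), String> {
    Err("Hdiff(LessThanStart(..))".to_string())
}

fn store_diff_then_apply(db: &mut HotDb, slot: Slot, diff: Vec<u8>) -> Result<(), String> {
    db.diffs.insert(slot, diff);
    if let Err(e) = hot_storage_strategy(db, slot) {
        // Remove the partially written snapshot/diff so a later restart does
        // not hit an inconsistent hot DB (the failure described above).
        db.diffs.remove(&slot);
        return Err(e);
    }
    Ok(())
}
```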


Co-Authored-By: Michael Sproul <michael@sigmaprime.io>
Fix the span on execution payload verification (newPayload) by creating a new span rather than using the parent span. Using the parent span was incorrectly associating the time spent verifying the payload with `from_signature_verified_components`.
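Roughly, the difference looks like this (a hedged sketch using `tracing` directly; the real span names and call sites in Lighthouse differ):

```rust
use tracing::info_span;

fn verify_execution_payload() {
    // Before: the work ran inside the caller's (parent) span, so the time
    // spent here was attributed to `from_signature_verified_components`.
    //
    // After: open a dedicated span so the newPayload verification time is
    // reported on its own.
    let span = info_span!("execution_payload_verification");
    let _guard = span.enter();

    // ... call out to the execution engine (newPayload) here ...
}
```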


Co-Authored-By: Michael Sproul <michael@sigmaprime.io>
…8413)

Addressed this comment here: sigp#6837 (comment)

Lighthouse can only checkpoint sync from a server that can serve blob sidecars, which means the server needs to be custodying at least 50% of the columns (a semi-supernode).

This PR lifts that constraint, as the blob sidecar endpoint is being deprecated in Fulu and we plan to fetch the checkpoint data columns from peers (sigp#6837).


  


Co-Authored-By: Jimmy Chen <jchen.tc@gmail.com>
…igp#8391)

Take 2 of sigp#8390.

Fixes the race condition properly instead of propagating the error. I think this is a better alternative, and the change doesn't look too bad.


  * Lift node id loading or generation from `NetworkService` startup to the `ClientBuilder`, so that it can be used to compute custody columns for the beacon chain without waiting for network bootstrap.

I've considered and implemented a few alternatives:
1. Passing `node_id` to the beacon chain builder and computing columns when creating `CustodyContext`. This approach isn't good for separation of concerns and isn't great for testability.
2. Passing `ordered_custody_groups` to the beacon chain. `CustodyContext` only uses this to compute ordered custody columns, so we might as well lift this logic out and avoid error handling in `CustodyContext` construction. Fewer tests to update.


Co-Authored-By: Jimmy Chen <jchen.tc@gmail.com>
N/A


  The difference should be computed as expected minus received; we were computing the inverse (received minus expected).

Thanks to Yassine for finding the issue.
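In other words (a minimal sketch with illustrative names, not the PR's actual code):

```rust
// The shortfall should be expected minus received; previously the operands
// were swapped. `saturating_sub` avoids underflow if we receive more than expected.
fn missing_count(expected: u64, received: u64) -> u64 {
    expected.saturating_sub(received)
}
```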


Co-Authored-By: Pawan Dhananjay <pawandhananjay@gmail.com>
Once sigp#8271 is merged, CI will only cover tests for `RECENT_FORKS` (prev, current, next)

To make sure functionality isn't broken for prior forks, we run tests for these forks nightly. They can also be triggered manually.

Tested via manual trigger here: https://github.com/jimmygchen/lighthouse/actions/runs/18896690117

<img width="826" height="696" alt="image" src="https://github.com/user-attachments/assets/afdfb03b-a037-4094-9f1b-7466c0800f6b" />


  


Co-Authored-By: Jimmy Chen <jchen.tc@gmail.com>

Co-Authored-By: Jimmy Chen <jimmy@sigmaprime.io>
The merge queue is failing due to Markdown lint changes:

https://github.com/sigp/lighthouse/actions/runs/19491272535/job/55783746002

This PR fixes the lint. I'm targeting the release branch so we can get things merged for release tomorrow, and we'll merge back down to `unstable`.


  


Co-Authored-By: Jimmy Chen <jchen.tc@gmail.com>

Co-Authored-By: Michael Sproul <michael@sigmaprime.io>
We no longer want to require checkpoint sync starts to include the required custody data columns; instead, we fetch them over p2p.


Closes sigp#6837


  The checkpoint sync slot can:
1. Be the first slot in the epoch, such that the epoch of the block == the start checkpoint epoch
2. Be in an epoch prior to the start checkpoint epoch

In both cases, backfill sync already fetches that epoch's worth of blocks with the current code. This PR modifies the backfill import filter function to allow re-importing the oldest block slot in the DB.

I feel this solution is sufficient unless I'm missing something. ~~I have not tested this yet!~~ Michael has tested this and it works.
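The change amounts to relaxing the filter's boundary condition; a hedged sketch with illustrative names, not the actual Lighthouse function:

```rust
type Slot = u64;

/// Illustrative backfill import filter. Previously only blocks strictly older
/// than the oldest block in the DB were accepted; allowing equality lets
/// backfill re-import the oldest block slot (the checkpoint block's slot).
fn should_import_backfill_block(block_slot: Slot, oldest_block_slot: Slot) -> bool {
    block_slot <= oldest_block_slot
}
```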


Co-Authored-By: dapplion <35266934+dapplion@users.noreply.github.com>

Co-Authored-By: Michael Sproul <michael@sigmaprime.io>
This hot fix release includes the following fixes:
* sigp#8388
* sigp#8406
* sigp#8391
* sigp#8413


  


Co-Authored-By: Jimmy Chen <jchen.tc@gmail.com>
Co-Authored-By: Age Manning <Age@AgeManning.com>
This is a `tracing`-driven optimisation. While investigating why Lighthouse is slow to send `newPayload`, I found a suspicious 13ms of computation on the hot path in `gossip_block_into_execution_pending_block_slashable`:

<img width="1998" height="1022" alt="headercalc" src="https://github.com/user-attachments/assets/e4f88c1a-da23-47b4-b533-cf5479a1c55c" />

Looking at the current implementation, we can see that the _only_ thing that happens prior to calling into `from_gossip_verified_block` is the calculation of a `header`. We first call `SignatureVerifiedBlock::from_gossip_verified_block_check_slashable`:

https://github.com/sigp/lighthouse/blob/261322c3e3ee467c9454fa160a00866439cbc62f/beacon_node/beacon_chain/src/block_verification.rs#L1075-L1076

Which is where the `header` is calculated prior to calling `from_gossip_verified_block`:

https://github.com/sigp/lighthouse/blob/261322c3e3ee467c9454fa160a00866439cbc62f/beacon_node/beacon_chain/src/block_verification.rs#L1224-L1226

Notice that the `header` is _only_ used in the case of an error, yet we spend time computing it every time!


  This PR moves the calculation of the header (which involves hashing the whole beacon block, including the execution payload), into the error case. We take a cheap clone of the `Arc`'d beacon block on the hot path, and use this for calculating the header _only_ in the case an error actually occurs. This shaves 10-20ms off our pre-newPayload delays, and 10-20ms off every block processing 🎉
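The shape of the change is roughly as follows (a simplified sketch with stand-in types, not the actual `block_verification.rs` code):

```rust
use std::sync::Arc;

struct BeaconBlock {
    bytes: Vec<u8>,
}

struct BlockHeader {
    root: [u8; 32],
}

struct VerifiedBlock {
    block: Arc<BeaconBlock>,
}

/// Stand-in for the expensive step: hashing the whole block (including the
/// execution payload) to build its header.
fn compute_header(block: &BeaconBlock) -> BlockHeader {
    let mut root = [0u8; 32];
    for (i, byte) in block.bytes.iter().enumerate() {
        root[i % 32] ^= *byte; // placeholder "hash"
    }
    BlockHeader { root }
}

fn verify_block(block: Arc<BeaconBlock>) -> Result<VerifiedBlock, (BlockHeader, String)> {
    // Hot path: only a cheap `Arc` clone, no hashing.
    let block_for_errors = block.clone();

    // The header is computed only if verification actually fails.
    inner_verification(block).map_err(|e| (compute_header(&block_for_errors), e))
}

fn inner_verification(block: Arc<BeaconBlock>) -> Result<VerifiedBlock, String> {
    // Placeholder for the real signature/slashability checks.
    Ok(VerifiedBlock { block })
}
```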


Co-Authored-By: Michael Sproul <michael@sigmaprime.io>
Currently, whenever we build the `Dockerfile` for local development using kurtosis, it recompiles everything even if no changes are made. This takes about 120 seconds on my laptop (it might be faster on other machines).


  Conservatively, I created a new `Dockerfile.dev` so that the original file is kept the same, even though it's pretty similar.

This uses `--mount=type=cache`, saving the target and registry folders across builds.

**Usage**

```sh
docker build -f Dockerfile.dev -t lighthouse:dev .
```


Co-Authored-By: Kevaundray Wedderburn <kevtheappdev@gmail.com>
Update `reqwest` to 0.12 so we only depend on a single version. This should slightly improve compile times and reduce binary bloat.


Co-Authored-By: Michael Sproul <michael@sigmaprime.io>
When developing locally with kurtosis a typical dev workflow is:

loop:
- Build local lighthouse docker image
- Run kurtosis
- Observe bug
- Fix code

The docker build step would download and build all crates. The Docker docs suggest an optimisation that caches build artifacts; see https://docs.docker.com/build/cache/optimize/#use-cache-mounts

I have tested it, and it's like building Lighthouse outside of a Docker environment 🤤 The Docker build time after changing one line in the top-level beacon_node crate is 50 seconds on my local machine ❤️

The release path is unaffected. Do you have any concerns that this could affect the output of the release binaries? This is too good an improvement to keep in a separate Dockerfile.


  


Co-Authored-By: dapplion <35266934+dapplion@users.noreply.github.com>
Since the following PR was merged, we don't need `--checkpoint-blobs`, even prior to Fulu:

- sigp#8417

This PR removes the mandatory check for blobs prior to Fulu, enabling simpler manual checkpoint sync.


Co-Authored-By: Michael Sproul <michael@sigmaprime.io>

Co-Authored-By: Jimmy Chen <jimmy@sigmaprime.io>
Co-Authored-By: Tan Chee Keong <tanck@sigmaprime.io>
Consolidate our property-testing around `proptest`. This PR was written with Copilot and manually tweaked.
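For readers unfamiliar with the crate, a minimal `proptest` property test looks like this (a generic example, not code from this PR):

```rust
use proptest::prelude::*;

proptest! {
    // Property: reversing a vector twice yields the original vector.
    // `v` is drawn from randomly generated byte vectors of length 0..256.
    #[test]
    fn reverse_twice_is_identity(v in proptest::collection::vec(any::<u8>(), 0..256)) {
        let mut w = v.clone();
        w.reverse();
        w.reverse();
        prop_assert_eq!(v, w);
    }
}
```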


Co-Authored-By: Michael Sproul <michael@sproul.xyz>

Co-Authored-By: Michael Sproul <michael@sigmaprime.io>
Fixes sigp#7785


  - [x] Update all integration tests with more than one file to follow the `main` pattern (see the sketch after this list).
- [x] `crypto/eth2_key_derivation/tests`
- [x] `crypto/eth2_keystore/tests`
- [x] `crypto/eth2_wallet/tests`
- [x] `slasher/tests`
- [x] `common/eth2_interop_keypairs/tests`
- [x] `beacon_node/lighthouse_network/tests`
- [x] Set `debug_assertions` to false in `.vscode/settings.json`.
- [x] Document how to make rust-analyzer work on integration test files, in `book/src/contributing_setup.md`.
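For reference, a sketch of the `main` pattern (module names are hypothetical): Cargo builds every top-level file in `tests/` as its own test crate, so the pattern keeps a single top-level file and pulls everything else in as modules from subdirectories.

```rust
// tests/main.rs — the only file Cargo auto-discovers as an integration-test
// target, so only one test binary is built and linked.
//
// The submodules live in subdirectories (e.g. tests/keystore/mod.rs), which
// Cargo does not treat as separate test crates. Module names are illustrative.
mod keystore;
mod wallet;
```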

---

Tracking a `rust-analyzer.toml` with settings like the ones provided in `.vscode/settings.json` would be nicer, but this is not possible yet. For now, that config should be a good enough indicator for devs using editors other than VSCode.


Co-Authored-By: Daniel Ramirez-Chiquillo <hi@danielrachi.com>

Co-Authored-By: Michael Sproul <michael@sigmaprime.io>
Use the recently published `context_deserialize` crate and remove the in-tree version from Lighthouse.


Co-Authored-By: Mac L <mjladson@pm.me>

Co-Authored-By: Michael Sproul <michael@sigmaprime.io>
…ckerHub (sigp#7614)

This pull request introduces workflows and updates to ensure reproducible builds for the Lighthouse project. It adds two GitHub Actions workflows for building and testing reproducible Docker images and binaries, updates the `Makefile` to streamline reproducible build configurations, and modifies the `Dockerfile.reproducible` to align with the new build process. Additionally, it removes the `reproducible` profile from `Cargo.toml`.


  ### New GitHub Actions Workflows:

* [`.github/workflows/docker-reproducible.yml`](diffhunk://#diff-222af23bee616920b04f5b92a83eb5106fce08abd885cd3a3b15b8beb5e789c3R1-R145): Adds a workflow to build and push reproducible multi-architecture Docker images for releases, including support for dry runs without pushing an image.

### Build Configuration Updates:

* [`Makefile`](diffhunk://#diff-76ed074a9305c04054cdebb9e9aad2d818052b07091de1f20cad0bbac34ffb52L85-R143): Refactors reproducible build targets, centralizes environment variables for reproducibility, and updates Docker build arguments for `x86_64` and `aarch64` architectures.
* [`Dockerfile.reproducible`](diffhunk://#diff-587298ff141278ce3be7c54a559f9f31472cc5b384e285e2105b3dee319ba31dL1-R24): Updates the base Rust image to version 1.86, removes hardcoded reproducibility settings, and delegates build logic to the `Makefile`.
* Switch to using jemalloc-sys from the Debian repos instead of building it from source. The Debian version is [reproducible](https://tests.reproducible-builds.org/debian/rb-pkg/trixie/amd64/jemalloc.html), which is [hard to achieve](NixOS/nixpkgs#380852) when building from source.

### Profile Removal:

* [`Cargo.toml`](diffhunk://#diff-2e9d962a08321605940b5a657135052fbcef87b5e360662bb527c96d9a615542L289-L295): Removes the `reproducible` profile, simplifying build configurations and relying on external tooling for reproducibility.


Co-Authored-By: Moe Mahhouk <mohammed-mahhouk@hotmail.com>

Co-Authored-By: chonghe <44791194+chong-he@users.noreply.github.com>

Co-Authored-By: Michael Sproul <michaelsproul@users.noreply.github.com>
Co-Authored-By: sashass1315 <sashass1315@gmail.com>

Co-Authored-By: Michael Sproul <michaelsproul@users.noreply.github.com>
…#8451)

Instrument beacon node startup and parallelise pubkey cache initialisation.

I instrumented beacon node startup and noticed that the pubkey cache takes a long time to initialise, mostly due to decompressing all the validator pubkeys.

This PR uses rayon to parallelise the decompression on initial checkpoint sync. The pubkeys are stored uncompressed, so the decompression time is not a problem on subsequent restarts. On restarts, we still deserialize pubkeys, but the timing is quite minimal on Sepolia, so I didn't investigate further.

`validator_pubkey_cache_new` timing on Sepolia:
* before: 109.64ms
* with parallelization: 21ms

on Hoodi:
* before: times out with Kurtosis after 120s
* with parallelization: 12.77s to import keys

**UPDATE**: downloading checkpoint state + genesis state takes about 2 minutes on my laptop, so it seems like the BN managed to start the http server just before timing out (after the optimisation).

<img width="1380" height="625" alt="image" src="https://github.com/user-attachments/assets/4c548c14-57dd-4b47-af9a-115b15791940" />


  


Co-Authored-By: Jimmy Chen <jchen.tc@gmail.com>
Co-Authored-By: Jimmy Chen <jchen.tc@gmail.com>

Co-Authored-By: Jimmy Chen <jimmy@sigmaprime.io>
sigp#7727 introduced a bug in the logging: whenever a node failed the SSZ `get_validator_blocks_v3` endpoint, it would log `Beacon node does not support...`. However, the failure can be due to other reasons, such as a timeout, as found by @jimmygchen:

`WARN  Beacon node does not support SSZ in block production, falling back to JSON  slot: 5283379, error: HttpClient(url: https://ho-h-bn-cowl.spesi.io:15052/, kind: timeout, detail: operation timed out`


  This PR makes the error log more generic, so there is less confusion.

Additionally, as suggested by @michaelsproul, this PR refactors the `get_validator_blocks_v3` calls to try the SSZ endpoint on all beacon nodes first, and only fall back to JSON if all beacon nodes fail the SSZ endpoint.

It changes the logic from:
"SSZ -> JSON for primary beacon node, followed by SSZ -> JSON for second beacon node and so on" to
"SSZ for all beacon nodes -> JSON for all beacon nodes"

This has the advantage that if the primary beacon node is having issues and fails the SSZ request, we avoid retrying it with JSON (since it may well fail again); instead, we move on to the second beacon node.
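A simplified sketch of the new ordering, with generic closures standing in for the real beacon node client calls:

```rust
/// Illustrative two-pass fallback: try the SSZ endpoint on every beacon node
/// first, and only if all of them fail, retry every node with JSON.
fn get_block_with_fallback<T, E>(
    beacon_nodes: &[String],
    try_ssz: impl Fn(&str) -> Result<T, E>,
    try_json: impl Fn(&str) -> Result<T, E>,
) -> Option<T> {
    // Pass 1: SSZ for all beacon nodes.
    for node in beacon_nodes {
        if let Ok(block) = try_ssz(node) {
            return Some(block);
        }
    }
    // Pass 2: JSON for all beacon nodes, only reached if every SSZ call failed.
    for node in beacon_nodes {
        if let Ok(block) = try_json(node) {
            return Some(block);
        }
    }
    None
}
```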


Co-Authored-By: Tan Chee Keong <tanck@sigmaprime.io>

Co-Authored-By: chonghe <44791194+chong-he@users.noreply.github.com>