fix: add update fallback propagation #2063

sanity · 2025-11-07T17:50:17Z

Summary

add fallback target computation for UPDATE propagation
fetch contracts locally when subscribers missing
improve wasm executor logging and store error reporting

Testing

cargo test operations::update
cargo test wasm_runtime::tests::contract_metering
cargo test --test operations update_propagates_fallback

sanity · 2025-11-07T22:52:05Z

@claude Please review PR

claude · 2025-11-07T22:52:24Z

Claude finished @sanity's task —— View job

PR Review: fix: add update fallback propagation

I've reviewed the changes in this PR. Here's my comprehensive analysis:

Summary of Changes

This PR implements three main improvements:

Fallback target computation for UPDATE propagation - Ensures UPDATEs can propagate even when no subscribers exist
Contract fetching during SeekNode fallback - Allows peers to fetch contracts locally when forwarding fails
Improved logging - Better diagnostics for UPDATE operations and store errors

Detailed Review by File

`crates/core/src/operations/update.rs` (+218/-16)

Strengths:

Fallback propagation logic (lines 252-258, 429-435, 621-628): The new compute_update_fallback_targets method provides a good safety net when subscriber lists are empty. Uses k_closest_potentially_caching with k=3, which is reasonable.
Contract fetching fallback (lines 494-587): Excellent recovery mechanism when SeekNode can't find a forwarding target. Attempts to fetch the contract via GET operation and apply the update locally.
Enhanced error logging (lines 336-348): The detailed error logging with subscriber lists, candidates, and connection count will be very helpful for debugging network issues.
Self-subscription filtering (lines 787-816): The logic to allow the sender to remain in the broadcast list when appropriate (for local auto-subscribe) is well-considered.

Concerns & Questions:

Potential infinite recursion (lines 503-514): When SeekNode fails to find a forwarding target, it triggers a GET operation which might itself fail with NoCachingPeers. Is there a risk of circular dependencies or retry storms here?
```
let get_op = get::start_op(*key, true, false);
if let Err(fetch_err) = get::request_get(op_manager, get_op, fetch_skip).await {
    // Falls through to error - but could GET also trigger an UPDATE?
```
Fallback target selection (lines 852-884): The fallback uses k_closest_potentially_caching with k=3. Questions:
- Why k=3 specifically? This seems arbitrary and isn't documented.
- Should this be configurable or proportional to network size?
- What happens in very small networks (< 3 peers)?

Duplicat filtering logic (lines 787-816): The double filtering for allow_self seems complex and potentially error-prone:

.filter(|pk| {
    if &pk.peer == sender {
        allow_self
    } else {
        true
    }
})
.filter(|pk| {
    if let Some(self_peer) = &self_peer {
        if &pk.peer == self_peer {
            allow_self
        } else {
            true
        }
    } else {
        true
    }
})

This could be simplified to a single filter with clearer logic.

Missing validation (line 516): After fetching the contract in the SeekNode fallback path, there's a check if super::has_contract(op_manager, *key).await? but the GET operation already succeeded. Is this redundant? Or is there a race condition concern?
Error handling inconsistency: The SeekNode fallback path changes an error to a warning (line 494: tracing::warn! instead of tracing::error!) and then attempts recovery. This is good, but should be consistent - why does the RequestUpdate path (line 356) still use tracing::error! without attempting the same recovery?

Recommendations:

Add comments explaining why k=3 was chosen for fallback targets
Simplify the double-filter logic for self-subscription
Consider unifying the recovery strategy across RequestUpdate and SeekNode message types
Add a circuit breaker or max-retry limit to prevent potential retry storms in the GET fallback path

`crates/core/src/operations/put.rs` (+16/-0)

Strengths:

Good addition of trace-level logging for PUT operation state tracking
Helps with debugging upstream forwarding behavior

Concerns:
None significant - these are pure logging additions.

`crates/core/src/contract/executor/runtime.rs` (+5/-0)

Strengths:

Useful info-level logging when contract state is updated

Minor suggestion:

Consider adding the previous state size to make the log more informative:

tracing::info!(
    "Contract state updated for {key}, prev_size={}, new_size={}",
    current_state.as_ref().len(),
    new_state.as_ref().len()
);

`crates/core/src/wasm_runtime/store.rs` (+10/-9)

Strengths:

Appropriate replacement of eprintln! with tracing::error! for better observability
Maintains error context with file/line information

Concerns:
None - this is a straightforward logging improvement.

Test files (contract_metering.rs, mod.rs)

Strengths:

Replacing println! with info! is correct for production code

Concerns:
None - these are test-only changes.

Testing Recommendations

Based on the PR description, the following tests are mentioned:

cargo test operations::update
cargo test wasm_runtime::tests::contract_metering
cargo test --test operations update_propagates_fallback

Additional test scenarios to consider:

Test UPDATE propagation in a network with no subscribers but with peers (fallback scenario)
Test the SeekNode contract fetch fallback path with various network configurations
Test self-subscription behavior (sender should receive updates when appropriate)
Test UPDATE behavior in very small networks (1-2 peers)
Edge case: UPDATE arrives while contract is still initializing

Overall Assessment

Positive aspects:

Well-structured fallback logic that improves UPDATE reliability
Thoughtful consideration of edge cases (no subscribers, contract not found)
Excellent diagnostic logging improvements
The contract fetching fallback in SeekNode is particularly valuable

Areas for improvement:

Documentation of magic numbers (k=3)
Potential simplification of filtering logic
Clarification of retry/recursion behavior in fallback paths
Consider consistency in error recovery strategies across message types

Recommendation: This PR significantly improves UPDATE propagation reliability and observability. The changes are generally well-implemented. Address the concerns around potential infinite recursion, document the k=3 choice, and consider simplifying the filtering logic before merging.

sanity · 2025-11-08T01:59:11Z

Addressed Claude's review points:

Added UPDATE_FALLBACK_TARGET_LIMIT so the fallback fan-out is documented and easy to tune.
When RequestUpdate runs out of forward targets we now attempt the same GET fallback as SeekNode, including skipping ourselves in the GET skip-list and reusing the broadcast logic once the contract is fetched.
Simplified the allow-self filtering in get_broadcast_targets_update and documented why we double-check has_contract after a GET completes.

Tests: cargo test -p freenet operations::update

iduartgomez · 2025-11-09T12:40:53Z

crates/core/src/operations/update.rs

+                                .ring
+                                .closest_potentially_caching(key, skip_peers.as_slice())
+                                .or_else(|| {
+                                    op_manager


why does this make a difference? if we get no closest_potentially_caching why we would get anything calling call k_closest_potentially_caching?

iduartgomez · 2025-11-09T12:46:17Z

What is the end goal of this PR? If we are pushing an update to a peer which is not subscribed to it what will happen? Won't the update be ignored anyway? Are we caching preemptively on that peer receiving the update?

Is the goal so update op does not return an error so the client applications will be receiving a positive result for the update? If so, this seems a very complicated way of achieving that, just mark the op as completed and return a message so it will be picked by the client apps waiting for responses and be done with it.

sanity · 2025-11-09T16:32:54Z

[Codex] @iduartgomez you were right to be skeptical about the GET-based fallback. I reverted those changes so UPDATE now behaves like the rest of the routing ops again: if we cannot reach a peer that already has the contract, we surface RingError::NoCachingPeers instead of caching it locally. While touching the file I kept the richer diagnostics (subscriber/candidate logging and the allow-self filter) so we still gain better observability without changing semantics.

Could you take another look at this simplified version and confirm it matches the original intent? Tests: cargo test -p freenet operations::update.

iduartgomez · 2025-11-09T17:27:23Z

crates/core/src/client_events/mod.rs

Can you explain why order was changed here? What is fixed this way?

iduartgomez · 2025-11-09T17:28:54Z

let's rebase on main so we only see this pr changes

iduartgomez · 2025-11-09T17:34:42Z

@claude try to rebase on main, prefer main changes

claude · 2025-11-09T17:35:02Z

Claude finished @iduartgomez's task —— View job

Rebasing PR onto main

I'm unable to complete this request due to permission limitations.

What I tried

Check git status
Attempt to rebase onto main

Issue encountered

Git rebase operations require additional approval that I don't currently have access to. Additionally, rebasing typically requires a force-push to update the remote branch, which is outside my capabilities for security reasons.

How to proceed

You'll need to perform the rebase manually. Here are the commands:

# Fetch the latest main
git fetch origin main

# Rebase onto main, preferring main's changes in conflicts
git rebase origin/main -X theirs

# If there are conflicts that can't be auto-resolved:
# - Resolve them manually
# - Use: git add <resolved-files>
# - Continue with: git rebase --continue

# Force push the rebased branch (required after rebase)
git push origin stack/update-fallback --force-with-lease

The --force-with-lease flag is safer than --force as it will prevent overwriting work if someone else has pushed to the branch.

Alternative approach

If you prefer not to rebase, you could merge main into this branch instead:

git fetch origin main
git merge origin/main -X theirs
git push origin stack/update-fallback

This wouldn't require a force-push and I could help with that if you'd like.

sanity requested a review from iduartgomez November 8, 2025 01:40

iduartgomez reviewed Nov 9, 2025

View reviewed changes

Base automatically changed from stack/session-actor-race to main November 9, 2025 12:55

iduartgomez reviewed Nov 9, 2025

View reviewed changes

crates/core/src/client_events/mod.rs

Copy link

Collaborator

iduartgomez Nov 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain why order was changed here? What is fixed this way?

sanity added 4 commits November 9, 2025 19:35

fix: add update fallback propagation

66110e0

fix(update): keep local subscribers and add PUT ack tracing

da01c02

fix(update): remove GET fallback propagation

4a69387

style(update): use direct equality in allow-self check

e161690

iduartgomez force-pushed the stack/update-fallback branch from c3fdf36 to e161690 Compare November 9, 2025 18:36

iduartgomez approved these changes Nov 9, 2025

View reviewed changes

iduartgomez added this pull request to the merge queue Nov 9, 2025

Merged via the queue into main with commit a413514 Nov 9, 2025
11 checks passed

iduartgomez deleted the stack/update-fallback branch November 9, 2025 19:30

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

fix: add update fallback propagation #2063

fix: add update fallback propagation #2063

Uh oh!

sanity commented Nov 7, 2025

Uh oh!

sanity commented Nov 7, 2025

Uh oh!

claude bot commented Nov 7, 2025 •

edited

Loading

Uh oh!

sanity commented Nov 8, 2025

Uh oh!

iduartgomez Nov 9, 2025

Uh oh!

iduartgomez commented Nov 9, 2025

Uh oh!

sanity commented Nov 9, 2025

Uh oh!

iduartgomez Nov 9, 2025

Uh oh!

iduartgomez commented Nov 9, 2025

Uh oh!

iduartgomez commented Nov 9, 2025

Uh oh!

claude bot commented Nov 9, 2025 •

edited

Loading

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Uh oh!

fix: add update fallback propagation #2063

fix: add update fallback propagation #2063

Uh oh!

Conversation

sanity commented Nov 7, 2025

Summary

Testing

Uh oh!

sanity commented Nov 7, 2025

Uh oh!

claude bot commented Nov 7, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR Review: fix: add update fallback propagation

Summary of Changes

Detailed Review by File

crates/core/src/operations/update.rs (+218/-16)

crates/core/src/operations/put.rs (+16/-0)

crates/core/src/contract/executor/runtime.rs (+5/-0)

crates/core/src/wasm_runtime/store.rs (+10/-9)

Test files (contract_metering.rs, mod.rs)

Testing Recommendations

Overall Assessment

Uh oh!

sanity commented Nov 8, 2025

Uh oh!

iduartgomez Nov 9, 2025

Choose a reason for hiding this comment

Uh oh!

iduartgomez commented Nov 9, 2025

Uh oh!

sanity commented Nov 9, 2025

Uh oh!

iduartgomez Nov 9, 2025

Choose a reason for hiding this comment

Uh oh!

iduartgomez commented Nov 9, 2025

Uh oh!

iduartgomez commented Nov 9, 2025

Uh oh!

claude bot commented Nov 9, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Rebasing PR onto main

What I tried

Issue encountered

How to proceed

Alternative approach

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

claude bot commented Nov 7, 2025 •

edited

Loading

`crates/core/src/operations/update.rs` (+218/-16)

`crates/core/src/operations/put.rs` (+16/-0)

`crates/core/src/contract/executor/runtime.rs` (+5/-0)

`crates/core/src/wasm_runtime/store.rs` (+10/-9)

claude bot commented Nov 9, 2025 •

edited

Loading