@codeflash-ai codeflash-ai bot commented Nov 7, 2025

📄 14% (0.14x) speedup for xfp_int_from_xfp_bytes in electrum/plugins/coldcard/coldcard.py

⏱️ Runtime : 43.7 microseconds → 38.1 microseconds (best of 250 runs)

📝 Explanation and details

The optimization removes both keyword arguments (byteorder="little" and signed=False) from the int.from_bytes() call: byteorder is now passed positionally, and signed is dropped because False is already its default.

Key changes:

  • byteorder="little" → "little" (positional argument)
  • Removed signed=False (relies on default signed=False)
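
As a rough illustration of the change, the two call forms can be put side by side. The PR text describes the call but does not show the function body, so both definitions below (including the `_original` and `_optimized` suffixes and the `fp_bytes` parameter name) are assumptions made for this sketch, not the actual code in coldcard.py.

```python
# Hypothetical reconstruction: the function bodies are inferred from the
# "Key changes" list above, not copied from electrum/plugins/coldcard/coldcard.py.

def xfp_int_from_xfp_bytes_original(fp_bytes: bytes) -> int:
    # Keyword form: byteorder and signed are both passed by name.
    return int.from_bytes(fp_bytes, byteorder="little", signed=False)


def xfp_int_from_xfp_bytes_optimized(fp_bytes: bytes) -> int:
    # Positional byteorder; signed is left at its default of False.
    return int.from_bytes(fp_bytes, "little")


if __name__ == "__main__":
    sample = b"\x01\x02\x03\x04"
    # Both forms should print the same little-endian interpretation.
    print(hex(xfp_int_from_xfp_bytes_original(sample)))
    print(hex(xfp_int_from_xfp_bytes_optimized(sample)))
```

Note that signed is a keyword-only parameter of int.from_bytes(), so it cannot be passed positionally; dropping it is the only way to avoid the keyword.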

Why this is faster:
Python's argument parsing overhead is reduced when using positional arguments instead of keyword arguments. The interpreter doesn't need to:

  1. Parse and match keyword argument names
  2. Handle the additional dictionary lookup for keyword parameters
  3. Process the extra signed=False parameter (since False is the default)
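
The call-overhead claim can be checked locally with a small timeit comparison. This is a minimal sketch, not the harness Codeflash used: the function names, the repeat counts, and the 4-byte sample are all assumptions chosen for illustration, and absolute numbers will vary by machine and Python version.

```python
import timeit


def with_keywords(data: bytes) -> int:
    # Call form described as the original in this PR.
    return int.from_bytes(data, byteorder="little", signed=False)


def positional(data: bytes) -> int:
    # Call form described as the optimized version in this PR.
    return int.from_bytes(data, "little")


def best_ns_per_call(func, data: bytes, number: int = 100_000, repeat: int = 5) -> float:
    # Take the best of several repeats to reduce the effect of scheduling noise.
    timer = timeit.Timer(lambda: func(data))
    best_total_seconds = min(timer.repeat(repeat=repeat, number=number))
    return best_total_seconds / number * 1e9


if __name__ == "__main__":
    sample = b"\x01\x02\x03\x04"  # hypothetical 4-byte fingerprint
    print(f"keywords:   {best_ns_per_call(with_keywords, sample):.1f} ns/call")
    print(f"positional: {best_ns_per_call(positional, sample):.1f} ns/call")
```

Each measurement includes the cost of the wrapper function and the lambda, so only the difference between the two lines is meaningful, not the absolute values.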

Performance results:

  • 14% overall speedup (43.7μs → 38.1μs)
  • Line profiler shows 26% reduction in per-call time (1323.8ns → 975ns per hit)
  • Most test cases improve by 5-40%, with the best gains on smaller byte arrays (single bytes, empty bytes); a few cases, mostly large inputs in the first generated test file, measure slightly slower
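
The percentages in this report appear to follow two different formulas. The sketch below is inferred from the reported numbers, not taken from Codeflash documentation, and the function names are hypothetical: the per-test "faster"/"slower" figures match (original - optimized) / optimized, while the line-profiler "reduction" matches (original - optimized) / original.

```python
def speedup_pct(original: float, optimized: float) -> float:
    # Relative to the optimized runtime; a negative result means "slower".
    return (original - optimized) / optimized * 100


def reduction_pct(original: float, optimized: float) -> float:
    # Relative to the original runtime, as in "26% reduction in per-call time".
    return (original - optimized) / original * 100


if __name__ == "__main__":
    print(f"{speedup_pct(925, 666):.1f}")       # test_basic_zero, reported as 38.9% faster
    print(f"{speedup_pct(735, 768):.2f}")       # test_large_scale_100_bytes, reported as 4.30% slower
    print(f"{speedup_pct(43.7, 38.1):.1f}")     # overall runtime, reported as 14%
    print(f"{reduction_pct(1323.8, 975):.1f}")  # line profiler, reported as 26%
```

Under this reading, the headline 14% is the overall figure of roughly 14.7% with the fraction dropped, and the 26% line-profiler figure is not directly comparable to it because it uses a different denominator.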

Test case patterns:
The optimization is most effective for:

  • Small byte conversions (1-4 bytes): 15-40% faster
  • Edge cases with simple inputs: 20-35% faster
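  • Less consistent for very large byte arrays (>100 bytes): the first generated test file shows 2-15% slowdowns while the second shows 8-19% speedups on similar sizes, so the differences are likely measurement noise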

This is a micro-optimization that saves at most a few hundred nanoseconds per call. Because int.from_bytes() is already the most efficient way to perform this conversion, reducing call overhead is one of the few remaining optimization opportunities.

Correctness verification report:

| Test | Status |
|---|---|
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | 73 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 1 Passed |
| 📊 Tests Coverage | 100.0% |
🌀 Generated Regression Tests and Runtime
import pytest  # used for our unit tests
from electrum.plugins.coldcard.coldcard import xfp_int_from_xfp_bytes

# unit tests

# -----------------------
# Basic Test Cases
# -----------------------

def test_basic_zero():
    # Test with all zero bytes
    codeflash_output = xfp_int_from_xfp_bytes(b'\x00\x00\x00\x00') # 925ns -> 666ns (38.9% faster)

def test_basic_one():
    # Test with value 1 (little endian)
    codeflash_output = xfp_int_from_xfp_bytes(b'\x01\x00\x00\x00') # 826ns -> 617ns (33.9% faster)

def test_basic_max_single_byte():
    # Test with value 255 (0xff)
    codeflash_output = xfp_int_from_xfp_bytes(b'\xff\x00\x00\x00') # 790ns -> 577ns (36.9% faster)

def test_basic_middle_value():
    # Test with a middle value
    codeflash_output = xfp_int_from_xfp_bytes(b'\x01\x02\x03\x04') # 811ns -> 702ns (15.5% faster)

def test_basic_highest_byte():
    # Test with highest byte set
    codeflash_output = xfp_int_from_xfp_bytes(b'\x00\x00\x00\xff') # 790ns -> 661ns (19.5% faster)

def test_basic_all_bytes_set():
    # Test with all bytes set to 0xff
    codeflash_output = xfp_int_from_xfp_bytes(b'\xff\xff\xff\xff') # 708ns -> 615ns (15.1% faster)

# -----------------------
# Edge Test Cases
# -----------------------

def test_edge_empty_bytes():
    # Test with empty bytes (should be 0)
    codeflash_output = xfp_int_from_xfp_bytes(b'') # 740ns -> 613ns (20.7% faster)

def test_edge_less_than_4_bytes():
    # Test with 1, 2, 3 bytes (should pad as little-endian)
    codeflash_output = xfp_int_from_xfp_bytes(b'\x01') # 712ns -> 582ns (22.3% faster)
    codeflash_output = xfp_int_from_xfp_bytes(b'\x01\x02') # 378ns -> 351ns (7.69% faster)
    codeflash_output = xfp_int_from_xfp_bytes(b'\x01\x02\x03') # 248ns -> 223ns (11.2% faster)

def test_edge_more_than_4_bytes():
    # Test with more than 4 bytes (should use all bytes)
    codeflash_output = xfp_int_from_xfp_bytes(b'\x01\x02\x03\x04\x05') # 666ns -> 598ns (11.4% faster)

def test_edge_maximum_possible_bytes():
    # Test with 8 bytes (max for unsigned 64-bit)
    codeflash_output = xfp_int_from_xfp_bytes(b'\xff\xff\xff\xff\xff\xff\xff\xff') # 739ns -> 620ns (19.2% faster)

def test_edge_signed_bytes():
    # Test with the high bit of the first (least significant) byte set; the result is 128 and must stay unsigned
    codeflash_output = xfp_int_from_xfp_bytes(b'\x80\x00\x00\x00') # 622ns -> 578ns (7.61% faster)

def test_edge_non_bytes_input():
    # Should raise TypeError if input is not bytes
    with pytest.raises(TypeError):
        xfp_int_from_xfp_bytes('abcd')  # str instead of bytes

    with pytest.raises(TypeError):
        xfp_int_from_xfp_bytes([0, 1, 2, 3])  # list instead of bytes

def test_edge_bytearray_input():
    # Should accept bytearray as bytes-like object
    codeflash_output = xfp_int_from_xfp_bytes(bytearray(b'\x01\x00\x00\x00')) # 1.65μs -> 1.37μs (20.8% faster)

def test_edge_large_single_byte():
    # Test with a single byte at the end
    codeflash_output = xfp_int_from_xfp_bytes(b'\x00\x00\x00\x80') # 915ns -> 724ns (26.4% faster)

def test_edge_highest_possible_4_bytes():
    # Test with highest possible 4-byte value
    codeflash_output = xfp_int_from_xfp_bytes(b'\xff\xff\xff\xff') # 812ns -> 597ns (36.0% faster)

# -----------------------
# Large Scale Test Cases
# -----------------------

def test_large_scale_100_bytes():
    # Test with 100 bytes, all set to 0xff
    inp = b'\xff' * 100
    expected = int.from_bytes(inp, byteorder="little", signed=False)
    codeflash_output = xfp_int_from_xfp_bytes(inp) # 735ns -> 768ns (4.30% slower)

def test_large_scale_increasing_bytes():
    # Test with 256 bytes, values increasing from 0x00 to 0xff
    inp = bytes(range(256))
    expected = int.from_bytes(inp, byteorder="little", signed=False)
    codeflash_output = xfp_int_from_xfp_bytes(inp) # 711ns -> 725ns (1.93% slower)

def test_large_scale_decreasing_bytes():
    # Test with 256 bytes, values decreasing from 0xff to 0x00
    inp = bytes(reversed(range(256)))
    expected = int.from_bytes(inp, byteorder="little", signed=False)
    codeflash_output = xfp_int_from_xfp_bytes(inp) # 689ns -> 812ns (15.1% slower)

def test_large_scale_random_bytes():
    # Test with 512 bytes, random values
    import random
    random.seed(42)  # deterministic
    inp = bytes(random.getrandbits(8) for _ in range(512))
    expected = int.from_bytes(inp, byteorder="little", signed=False)
    codeflash_output = xfp_int_from_xfp_bytes(inp) # 914ns -> 932ns (1.93% slower)

def test_large_scale_performance():
    # Performance test: converting 999 bytes
    inp = b'\x01' * 999
    expected = int.from_bytes(inp, byteorder="little", signed=False)
    codeflash_output = xfp_int_from_xfp_bytes(inp) # 1.29μs -> 1.31μs (2.13% slower)

# -----------------------
# Additional Edge Cases
# -----------------------

def test_edge_all_zero_bytes_various_lengths():
    # Test all zero bytes with various lengths
    for n in [0, 1, 2, 3, 4, 10, 100, 256, 999]:
        codeflash_output = xfp_int_from_xfp_bytes(b'\x00' * n) # 3.33μs -> 2.98μs (11.5% faster)

def test_edge_single_high_byte_various_positions():
    # Test a single high byte in different positions
    for i in range(4):
        b = bytearray(4)
        b[i] = 0xff
        expected = int.from_bytes(b, byteorder="little", signed=False)
        codeflash_output = xfp_int_from_xfp_bytes(bytes(b)) # 1.46μs -> 1.38μs (5.21% faster)

def test_edge_mutation_resistance():
    # Mutation: changing byteorder to "big" should fail these tests
    inp = b'\x01\x02\x03\x04'
    expected_little = 0x04030201
    expected_big = 0x01020304
    codeflash_output = xfp_int_from_xfp_bytes(inp) # 638ns -> 562ns (13.5% faster)
    codeflash_output = xfp_int_from_xfp_bytes(inp) # 318ns -> 273ns (16.5% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
#------------------------------------------------
import pytest  # used for our unit tests
from electrum.plugins.coldcard.coldcard import xfp_int_from_xfp_bytes

# unit tests

# -------------------------
# 1. Basic Test Cases
# -------------------------

def test_basic_single_byte():
    # Test conversion of a single byte value
    codeflash_output = xfp_int_from_xfp_bytes(b'\x01') # 654ns -> 536ns (22.0% faster)
    codeflash_output = xfp_int_from_xfp_bytes(b'\x7f') # 311ns -> 258ns (20.5% faster)
    codeflash_output = xfp_int_from_xfp_bytes(b'\x00') # 268ns -> 233ns (15.0% faster)

def test_basic_two_bytes():
    # Test conversion of two bytes, little endian
    codeflash_output = xfp_int_from_xfp_bytes(b'\x01\x00') # 600ns -> 552ns (8.70% faster)
    codeflash_output = xfp_int_from_xfp_bytes(b'\x00\x01') # 366ns -> 294ns (24.5% faster)
    codeflash_output = xfp_int_from_xfp_bytes(b'\xff\x00') # 246ns -> 244ns (0.820% faster)
    codeflash_output = xfp_int_from_xfp_bytes(b'\xff\xff') # 235ns -> 210ns (11.9% faster)

def test_basic_four_bytes():
    # Test conversion of four bytes, little endian
    codeflash_output = xfp_int_from_xfp_bytes(b'\x01\x00\x00\x00') # 613ns -> 522ns (17.4% faster)
    codeflash_output = xfp_int_from_xfp_bytes(b'\x00\x00\x00\x01') # 405ns -> 345ns (17.4% faster)
    codeflash_output = xfp_int_from_xfp_bytes(b'\x78\x56\x34\x12') # 238ns -> 200ns (19.0% faster)
    codeflash_output = xfp_int_from_xfp_bytes(b'\xff\xff\xff\xff') # 245ns -> 220ns (11.4% faster)

def test_basic_empty_bytes():
    # Test empty bytes input (should be zero)
    codeflash_output = xfp_int_from_xfp_bytes(b'') # 604ns -> 571ns (5.78% faster)

# -------------------------
# 2. Edge Test Cases
# -------------------------

def test_edge_max_byte_values():
    # Test maximum values for various lengths
    codeflash_output = xfp_int_from_xfp_bytes(b'\xff') # 626ns -> 550ns (13.8% faster)
    codeflash_output = xfp_int_from_xfp_bytes(b'\xff\xff') # 328ns -> 312ns (5.13% faster)
    codeflash_output = xfp_int_from_xfp_bytes(b'\xff\xff\xff\xff') # 269ns -> 235ns (14.5% faster)

def test_edge_leading_zeros():
    # In little endian the leading bytes are the low-order ones, so zeros there shift the value (b'\x00\x00\x01' is 0x010000)
    codeflash_output = xfp_int_from_xfp_bytes(b'\x00\x00\x01') # 610ns -> 539ns (13.2% faster)
    codeflash_output = xfp_int_from_xfp_bytes(b'\x00\x00\x00\x01') # 378ns -> 319ns (18.5% faster)
    codeflash_output = xfp_int_from_xfp_bytes(b'\x00\x00\x00\x00') # 345ns -> 265ns (30.2% faster)

def test_edge_all_zeros():
    # All zero bytes of any length should be zero
    codeflash_output = xfp_int_from_xfp_bytes(b'\x00') # 641ns -> 560ns (14.5% faster)
    codeflash_output = xfp_int_from_xfp_bytes(b'\x00\x00') # 345ns -> 275ns (25.5% faster)
    codeflash_output = xfp_int_from_xfp_bytes(b'\x00\x00\x00') # 241ns -> 244ns (1.23% slower)

def test_edge_non_ascii_bytes():
    # Test with bytes outside ASCII range
    codeflash_output = xfp_int_from_xfp_bytes(b'\x80') # 611ns -> 541ns (12.9% faster)
    codeflash_output = xfp_int_from_xfp_bytes(b'\xfe\xff') # 308ns -> 298ns (3.36% faster)

def test_edge_large_endian_vs_little_endian():
    # Ensure little endian is used, not big endian
    # b'\x01\x02\x03\x04' in little endian is 0x04030201
    codeflash_output = xfp_int_from_xfp_bytes(b'\x01\x02\x03\x04') # 655ns -> 581ns (12.7% faster)

def test_edge_signed_false_behavior():
    # Should not interpret as signed, even if MSB is set
    codeflash_output = xfp_int_from_xfp_bytes(b'\xff\xff\xff\xff') # 618ns -> 551ns (12.2% faster)
    codeflash_output = xfp_int_from_xfp_bytes(b'\x80\x00\x00\x00') # 379ns -> 282ns (34.4% faster)

def test_edge_input_type():
    # Should raise TypeError if input is not bytes
    with pytest.raises(TypeError):
        xfp_int_from_xfp_bytes("not bytes")
    with pytest.raises(TypeError):
        xfp_int_from_xfp_bytes([0x01, 0x02])
    with pytest.raises(TypeError):
        xfp_int_from_xfp_bytes(12345)

def test_edge_large_number_of_bytes():
    # Test with 8 bytes (max for 64-bit unsigned int)
    codeflash_output = xfp_int_from_xfp_bytes(b'\xff\xff\xff\xff\xff\xff\xff\xff') # 1.20μs -> 911ns (31.2% faster)
    # Test with 16 bytes (arbitrary large number)
    codeflash_output = xfp_int_from_xfp_bytes(b'\x01' + b'\x00'*15) # 389ns -> 333ns (16.8% faster)

# -------------------------
# 3. Large Scale Test Cases
# -------------------------

def test_large_scale_maximum_length():
    # Test with 1000 bytes (all set to 0x01)
    input_bytes = b'\x01' * 1000
    # In little endian, this is a huge number: sum of 1 * 256^i for i in 0..999
    expected = sum(1 << (8*i) for i in range(1000))
    codeflash_output = xfp_int_from_xfp_bytes(input_bytes) # 2.00μs -> 1.85μs (8.12% faster)

def test_large_scale_alternating_bytes():
    # Test with 512 bytes alternating 0x00 and 0xff
    input_bytes = b''.join([b'\x00\xff'] * 256)
    # The value should be the sum of 255 * 256^(2*i+1) for i in 0..255
    expected = sum(255 << (8*(2*i+1)) for i in range(256))
    codeflash_output = xfp_int_from_xfp_bytes(input_bytes) # 1.31μs -> 1.10μs (18.8% faster)

def test_large_scale_leading_zeros():
    # Test with 999 leading zeros, followed by a single 0x01
    input_bytes = b'\x00' * 999 + b'\x01'
    # In little endian, this is 1 << (8*999)
    expected = 1 << (8*999)
    codeflash_output = xfp_int_from_xfp_bytes(input_bytes) # 1.68μs -> 1.42μs (18.4% faster)

def test_large_scale_all_zeros():
    # Test with 1000 zero bytes
    input_bytes = b'\x00' * 1000
    codeflash_output = xfp_int_from_xfp_bytes(input_bytes) # 1.12μs -> 944ns (18.5% faster)

def test_large_scale_random_bytes():
    # Test with 1000 bytes, each byte is its index mod 256
    input_bytes = bytes([i % 256 for i in range(1000)])
    # Calculate expected value
    expected = sum((i % 256) << (8*i) for i in range(1000))
    codeflash_output = xfp_int_from_xfp_bytes(input_bytes) # 1.67μs -> 1.41μs (18.7% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
#------------------------------------------------
from electrum.plugins.coldcard.coldcard import xfp_int_from_xfp_bytes

def test_xfp_int_from_xfp_bytes():
    xfp_int_from_xfp_bytes(b'')
🔎 Concolic Coverage Tests and Runtime
| Test File::Test Function | Original ⏱️ | Optimized ⏱️ | Speedup |
|---|---|---|---|
| codeflash_concolic_0kz7t2kd/tmp7a053vhi/test_concolic_coverage.py::test_xfp_int_from_xfp_bytes | 752ns | 585ns | 28.5%✅ |

To edit these changes git checkout codeflash/optimize-xfp_int_from_xfp_bytes-mhotjftj and push.

@codeflash-ai codeflash-ai bot requested a review from mashraf-222 November 7, 2025 12:15
@codeflash-ai codeflash-ai bot added ⚡️ codeflash Optimization PR opened by Codeflash AI 🎯 Quality: High Optimization Quality according to Codeflash labels Nov 7, 2025