@codeflash-ai codeflash-ai bot commented Nov 7, 2025

📄 79% (0.79x) speedup for Contacts.by_name in electrum/contacts.py

⏱️ Runtime: 760 microseconds → 425 microseconds (best of 111 runs)
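The headline percentage appears to be measured relative to the optimized runtime rather than the original one; that reading is an inference from the numbers in this report, not something the report states. A minimal sketch of the arithmetic, where `speedup_percent` is a hypothetical helper name:

```python
def speedup_percent(original: float, optimized: float) -> float:
    """Time saved, expressed as a percentage of the optimized runtime."""
    return (original - optimized) / optimized * 100.0


# Overall runtimes reported above, in microseconds.
overall = speedup_percent(760, 425)

# One of the per-call timings from the generated tests (48.1 us -> 26.5 us).
name500 = speedup_percent(48.1, 26.5)

print(f"overall: {overall:.1f}%  Name500 lookup: {name500:.1f}%")
```

Under this convention a 79% speedup means the original takes about 1.79 times as long as the optimized version, not that 79% of the runtime was removed.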

📝 Explanation and details

The optimization delivers a 79% speedup by eliminating redundant dictionary lookups and string operations in the by_name method:

Key optimizations:

  1. Eliminate repeated casefold() calls: The original code calls name.casefold() on every iteration, even though the query string never changes. The optimized version calls it once upfront and stores the result in name_cf, avoiding one redundant string transformation per contact examined.

  2. Replace keys() + __getitem__ with items() + unpacking: The original iterates with for k in self.keys() then accesses values via self[k], requiring dictionary hash lookups for each entry. The optimized version uses for k, (_type, addr) in self.items() which directly unpacks the key-value pairs in a single iteration, eliminating the hash lookups.
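The two changes can be sketched side by side. This is an illustration under stated assumptions, not the actual electrum code: `ContactsSketch` is a hypothetical `dict` subclass that stores `address -> (type, name)` pairs the way the generated tests do, and the shape of the returned dictionary is assumed.

```python
class ContactsSketch(dict):
    """Hypothetical stand-in for Contacts: maps address -> (type, name)."""

    def by_name_original(self, name):
        # Pattern described as the original: iterate over the keys, look each
        # value up again, and casefold the query on every pass.
        for k in self.keys():
            _type, addr = self[k]
            if addr.casefold() == name.casefold():
                # Return shape is an assumption made for this sketch.
                return {"name": addr, "type": _type, "address": k}
        return None

    def by_name_optimized(self, name):
        # Casefold the query once, then unpack key and value from items().
        name_cf = name.casefold()
        for k, (_type, addr) in self.items():
            if addr.casefold() == name_cf:
                return {"name": addr, "type": _type, "address": k}
        return None


contacts = ContactsSketch({
    "1abc": ("address", "Alice"),
    "1def": ("address", "Bob"),
})
match_original = contacts.by_name_original("ALICE")
match_optimized = contacts.by_name_optimized("alice")
```

Both methods visit entries in the same insertion order, so when several contacts share a name they return the same first match.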

Performance analysis from line profiler:

  • Total execution time reduced from 5.8ms to 3.85ms (33% improvement)
  • The loop iteration line (the line with for) grew from 31.4% to 42.6% of total time, but its absolute time fell because the total shrank
  • The comparison line shows reduced per-hit time (350.4ns → 289.3ns per hit)
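The profiler figures above can be checked with a little arithmetic. The sketch below uses only the totals and percentages quoted in this report; `line_time_ms` is a hypothetical helper name.

```python
def line_time_ms(total_ms: float, share_percent: float) -> float:
    """Absolute time spent on one line, given its share of the total."""
    return total_ms * share_percent / 100.0


total_before_ms, total_after_ms = 5.8, 3.85

# Time spent on the loop line before and after the change.
loop_before_ms = line_time_ms(total_before_ms, 31.4)
loop_after_ms = line_time_ms(total_after_ms, 42.6)

# Reduction in total profiled time, relative to the original total.
reduction_percent = (total_before_ms - total_after_ms) / total_before_ms * 100.0

print(f"loop line: {loop_before_ms:.2f} ms -> {loop_after_ms:.2f} ms")
print(f"total reduction: {reduction_percent:.1f}%")
```

The loop line drops from roughly 1.82 ms to 1.64 ms even though its share of the total rises. The 33% figure is a reduction relative to the original profiled time, so it is a different measure from the 79% headline speedup and the two should not be compared directly.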

Test case benefits:

  • Small contact lists (1-10 entries): 15-47% speedup
  • Large contact lists (1000 entries): 75-87% speedup when searching later entries
  • The optimization scales particularly well with larger datasets since it removes one dictionary lookup and one casefold() call per entry scanned; the search itself remains a linear scan
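One way to observe the scaling claim locally is a small `timeit` comparison. This is a sketch under assumptions: the two functions operate on a plain `dict` rather than the real `Contacts` class, they return only the matching key, and absolute timings will differ by machine and Python version.

```python
import timeit


def by_name_keys(contacts, name):
    # keys() + lookup, with the query casefolded on every iteration.
    for k in contacts.keys():
        _type, addr = contacts[k]
        if addr.casefold() == name.casefold():
            return k
    return None


def by_name_items(contacts, name):
    # Query casefolded once; key and value unpacked from items().
    name_cf = name.casefold()
    for k, (_type, addr) in contacts.items():
        if addr.casefold() == name_cf:
            return k
    return None


def time_lookup(func, size, number=200):
    """Best per-call time in seconds for finding the last of `size` entries."""
    contacts = {f"addr{i}": ("address", f"Name{i}") for i in range(size)}
    target = f"Name{size - 1}"  # worst case: the match is the final entry
    runs = timeit.repeat(lambda: func(contacts, target), number=number, repeat=5)
    return min(runs) / number


timings = {
    size: (time_lookup(by_name_keys, size), time_lookup(by_name_items, size))
    for size in (10, 1000)
}
for size, (t_keys, t_items) in timings.items():
    print(f"{size:>5} entries: keys() {t_keys * 1e6:.2f} us, items() {t_items * 1e6:.2f} us")
```

Searching for the last entry is the worst case for a linear scan, which is where the per-entry savings accumulate most.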

The optimization preserves behavior and return values while improving performance, which matters most for applications with frequent contact lookups or large contact lists.

Correctness verification report:

| Test | Status |
|---|---|
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | 85 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 100.0% |
🌀 Generated Regression Tests and Runtime
import pytest
from electrum.contacts import Contacts


# --- Minimal stubs for dependencies ---
class DummyLogger:
    def __init__(self):
        pass

class DummyWalletDB:
    """Minimal WalletDB stub for testing Contacts."""
    def __init__(self, contacts=None):
        self._contacts = contacts if contacts is not None else {}

    def get(self, key, default=None):
        if key == 'contacts':
            return self._contacts.copy()
        return default

# --- Unit tests for Contacts.by_name ---

# ---------- BASIC TEST CASES ----------
def test_by_name_returns_correct_contact():
    """Test that by_name returns the correct contact dictionary for a matching name."""
    contacts_data = {
        '1abc': ('address', 'Alice'),
        '1def': ('address', 'Bob'),
    }
    db = DummyWalletDB(contacts=contacts_data)
    contacts = Contacts(db)
    codeflash_output = contacts.by_name('Alice'); result = codeflash_output # 1.51μs -> 1.16μs (30.5% faster)
    codeflash_output = contacts.by_name('Bob'); result = codeflash_output # 942ns -> 706ns (33.4% faster)

def test_by_name_case_insensitivity():
    """Test that by_name matches names case-insensitively."""
    contacts_data = {
        '1abc': ('address', 'Alice'),
        '1def': ('address', 'Bob'),
    }
    db = DummyWalletDB(contacts=contacts_data)
    contacts = Contacts(db)
    codeflash_output = contacts.by_name('alice') # 1.40μs -> 1.12μs (24.6% faster)
    codeflash_output = contacts.by_name('ALICE') # 535ns -> 453ns (18.1% faster)
    codeflash_output = contacts.by_name('bob') # 771ns -> 665ns (15.9% faster)

def test_by_name_returns_none_for_missing_name():
    """Test that by_name returns None if the name is not in contacts."""
    contacts_data = {
        '1abc': ('address', 'Alice'),
    }
    db = DummyWalletDB(contacts=contacts_data)
    contacts = Contacts(db)
    codeflash_output = contacts.by_name('Charlie') # 1.21μs -> 970ns (24.8% faster)

def test_by_name_with_multiple_contacts_same_name():
    """Test that by_name returns the first matching contact if multiple have the same name."""
    contacts_data = {
        '1abc': ('address', 'Alice'),
        '1def': ('address', 'Alice'),
        '1ghi': ('address', 'Bob'),
    }
    db = DummyWalletDB(contacts=contacts_data)
    contacts = Contacts(db)
    codeflash_output = contacts.by_name('Alice'); result = codeflash_output # 1.37μs -> 1.10μs (24.6% faster)

def test_by_name_with_non_address_type():
    """Test that by_name works for contacts with types other than 'address'."""
    contacts_data = {
        '1abc': ('email', 'Alice'),
        '1def': ('phone', 'Bob'),
    }
    db = DummyWalletDB(contacts=contacts_data)
    contacts = Contacts(db)
    codeflash_output = contacts.by_name('Alice') # 1.41μs -> 1.04μs (34.6% faster)
    codeflash_output = contacts.by_name('Bob') # 953ns -> 706ns (35.0% faster)

# ---------- EDGE TEST CASES ----------
def test_by_name_empty_contacts():
    """Test that by_name returns None when contacts are empty."""
    db = DummyWalletDB(contacts={})
    contacts = Contacts(db)
    codeflash_output = contacts.by_name('Alice') # 758ns -> 688ns (10.2% faster)

def test_by_name_with_empty_string_name():
    """Test that by_name returns None for empty string name."""
    contacts_data = {
        '1abc': ('address', 'Alice'),
    }
    db = DummyWalletDB(contacts=contacts_data)
    contacts = Contacts(db)
    codeflash_output = contacts.by_name('') # 1.30μs -> 1.01μs (28.9% faster)

def test_by_name_with_whitespace_name():
    """Test by_name with whitespace-only names: an exact match is found, a different whitespace string is not."""
    contacts_data = {
        '1abc': ('address', 'Alice'),
        '1def': ('address', ' '),
    }
    db = DummyWalletDB(contacts=contacts_data)
    contacts = Contacts(db)
    codeflash_output = contacts.by_name(' ') # 1.76μs -> 1.31μs (34.4% faster)
    codeflash_output = contacts.by_name('   ') # 814ns -> 569ns (43.1% faster)



def test_by_name_with_special_characters():
    """Test that by_name matches names with special characters."""
    contacts_data = {
        '1abc': ('address', 'Al!ce'),
        '1def': ('address', 'B@b'),
        '1ghi': ('address', 'C#arl'),
    }
    db = DummyWalletDB(contacts=contacts_data)
    contacts = Contacts(db)
    codeflash_output = contacts.by_name('Al!ce') # 1.96μs -> 1.43μs (37.8% faster)
    codeflash_output = contacts.by_name('B@b') # 936ns -> 748ns (25.1% faster)
    codeflash_output = contacts.by_name('C#arl') # 863ns -> 656ns (31.6% faster)

def test_by_name_with_leading_trailing_spaces():
    """Test that by_name does not ignore leading/trailing spaces in names."""
    contacts_data = {
        '1abc': ('address', ' Alice '),
        '1def': ('address', 'Bob'),
    }
    db = DummyWalletDB(contacts=contacts_data)
    contacts = Contacts(db)
    codeflash_output = contacts.by_name(' Alice ') # 1.59μs -> 1.21μs (31.6% faster)
    codeflash_output = contacts.by_name('Alice') # 817ns -> 659ns (24.0% faster)

def test_by_name_with_unicode_characters():
    """Test that by_name matches names with unicode characters."""
    contacts_data = {
        '1abc': ('address', 'Álice'),
        '1def': ('address', 'Боб'),
        '1ghi': ('address', '李雷'),
    }
    db = DummyWalletDB(contacts=contacts_data)
    contacts = Contacts(db)
    codeflash_output = contacts.by_name('Álice') # 2.17μs -> 1.77μs (22.6% faster)
    codeflash_output = contacts.by_name('Боб') # 1.15μs -> 949ns (21.5% faster)
    codeflash_output = contacts.by_name('李雷') # 1.03μs -> 797ns (29.4% faster)

def test_by_name_with_mixed_case_unicode():
    """Test that by_name is case-insensitive for unicode."""
    contacts_data = {
        '1abc': ('address', 'álice'),
    }
    db = DummyWalletDB(contacts=contacts_data)
    contacts = Contacts(db)
    codeflash_output = contacts.by_name('ÁLICE') # 1.71μs -> 1.36μs (25.7% faster)

# ---------- LARGE SCALE TEST CASES ----------
def test_by_name_large_number_of_contacts():
    """Test that by_name works efficiently with a large number of contacts."""
    # Create 1000 contacts with unique names
    contacts_data = {f'addr{i}': ('address', f'Name{i}') for i in range(1000)}
    db = DummyWalletDB(contacts=contacts_data)
    contacts = Contacts(db)
    # Check first, middle, and last
    codeflash_output = contacts.by_name('Name0') # 2.09μs -> 1.43μs (46.7% faster)
    codeflash_output = contacts.by_name('Name500') # 48.1μs -> 26.5μs (81.5% faster)
    codeflash_output = contacts.by_name('Name999') # 97.4μs -> 52.3μs (86.4% faster)
    # Check that a non-existent name returns None
    codeflash_output = contacts.by_name('Name1000') # 94.8μs -> 50.5μs (87.8% faster)

def test_by_name_large_number_of_duplicate_names():
    """Test that by_name returns one of the contacts when many have the same name."""
    # 1000 contacts, all with name 'Alice'
    contacts_data = {f'addr{i}': ('address', 'Alice') for i in range(1000)}
    db = DummyWalletDB(contacts=contacts_data)
    contacts = Contacts(db)
    codeflash_output = contacts.by_name('Alice'); result = codeflash_output # 1.78μs -> 1.26μs (40.8% faster)

def test_by_name_performance_large_contacts(monkeypatch):
    """Performance test: by_name should not take excessive time for 1000 contacts."""
    import time
    contacts_data = {f'addr{i}': ('address', f'Name{i}') for i in range(1000)}
    db = DummyWalletDB(contacts=contacts_data)
    contacts = Contacts(db)
    start = time.time()
    # This should be fast (<0.1s on typical hardware)
    codeflash_output = contacts.by_name('Name999'); result = codeflash_output # 99.6μs -> 53.3μs (86.9% faster)
    elapsed = time.time() - start

def test_by_name_with_varied_types_large_scale():
    """Test by_name with 1000 contacts of varied types."""
    contacts_data = {}
    for i in range(500):
        contacts_data[f'addrA{i}'] = ('address', f'User{i}')
        contacts_data[f'addrE{i}'] = ('email', f'User{i}')
    db = DummyWalletDB(contacts=contacts_data)
    contacts = Contacts(db)
    # Should find the first matching name, regardless of type
    codeflash_output = contacts.by_name('User123'); result = codeflash_output # 24.5μs -> 13.9μs (75.5% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
#------------------------------------------------
import pytest
from electrum.contacts import Contacts

# --- Minimal stubs for dependencies to allow Contacts to work in tests ---

class DummyBitcoin:
    @staticmethod
    def is_address(addr):
        # Accepts addresses that start with '1', '3', or 'bc1', and are at least 26 chars
        return isinstance(addr, str) and (
            addr.startswith("1") or addr.startswith("3") or addr.startswith("bc1")
        ) and len(addr) >= 26

bitcoin = DummyBitcoin()

class Logger:
    def __init__(self):
        pass

class DummyWalletDB:
    def __init__(self, contacts=None):
        self._contacts = contacts if contacts is not None else {}

    def get(self, key, default=None):
        if key == 'contacts':
            return self._contacts.copy()
        return default

# --- Unit Tests ---

# BASIC TEST CASES

def test_by_name_basic_single_entry():
    # Test with a single contact
    contacts_data = {
        "1BitcoinAddress000000000000000000": ("address", "Alice")
    }
    db = DummyWalletDB(contacts=contacts_data)
    contacts = Contacts(db)
    codeflash_output = contacts.by_name("Alice"); result = codeflash_output # 1.74μs -> 1.28μs (35.4% faster)

def test_by_name_basic_multiple_entries():
    # Test with multiple contacts
    contacts_data = {
        "1BitcoinAddress000000000000000001": ("address", "Alice"),
        "1BitcoinAddress000000000000000002": ("address", "Bob"),
        "1BitcoinAddress000000000000000003": ("address", "Charlie"),
    }
    db = DummyWalletDB(contacts=contacts_data)
    contacts = Contacts(db)
    codeflash_output = contacts.by_name("Bob"); result = codeflash_output # 1.86μs -> 1.36μs (36.3% faster)

def test_by_name_basic_case_insensitive():
    # Test case-insensitive matching
    contacts_data = {
        "1BitcoinAddress000000000000000004": ("address", "Daisy"),
    }
    db = DummyWalletDB(contacts=contacts_data)
    contacts = Contacts(db)
    codeflash_output = contacts.by_name("daisy"); result = codeflash_output # 1.58μs -> 1.21μs (30.8% faster)

def test_by_name_basic_not_found():
    # Test name not found
    contacts_data = {
        "1BitcoinAddress000000000000000005": ("address", "Eve"),
    }
    db = DummyWalletDB(contacts=contacts_data)
    contacts = Contacts(db)
    codeflash_output = contacts.by_name("Mallory"); result = codeflash_output # 1.23μs -> 986ns (24.4% faster)

def test_by_name_basic_empty_contacts():
    # Test with no contacts
    db = DummyWalletDB(contacts={})
    contacts = Contacts(db)
    codeflash_output = contacts.by_name("Alice"); result = codeflash_output # 754ns -> 709ns (6.35% faster)

# EDGE TEST CASES

def test_by_name_edge_duplicate_names():
    # Test with duplicate names (should return first found)
    contacts_data = {
        "1BitcoinAddress000000000000000006": ("address", "Frank"),
        "1BitcoinAddress000000000000000007": ("address", "Frank"),
    }
    db = DummyWalletDB(contacts=contacts_data)
    contacts = Contacts(db)
    codeflash_output = contacts.by_name("Frank"); result = codeflash_output # 1.46μs -> 1.16μs (25.8% faster)

def test_by_name_edge_name_is_empty_string():
    # Test with empty string as name
    contacts_data = {
        "1BitcoinAddress000000000000000008": ("address", ""),
    }
    db = DummyWalletDB(contacts=contacts_data)
    contacts = Contacts(db)
    codeflash_output = contacts.by_name(""); result = codeflash_output # 1.44μs -> 1.08μs (33.3% faster)

def test_by_name_edge_name_is_whitespace():
    # Test with whitespace name
    contacts_data = {
        "1BitcoinAddress000000000000000009": ("address", " "),
    }
    db = DummyWalletDB(contacts=contacts_data)
    contacts = Contacts(db)
    codeflash_output = contacts.by_name(" "); result = codeflash_output # 1.56μs -> 1.16μs (34.9% faster)


def test_by_name_edge_special_characters():
    # Test with special characters in name
    contacts_data = {
        "1BitcoinAddress000000000000000011": ("address", "@l!c3"),
    }
    db = DummyWalletDB(contacts=contacts_data)
    contacts = Contacts(db)
    codeflash_output = contacts.by_name("@l!c3"); result = codeflash_output # 1.97μs -> 1.52μs (29.7% faster)

def test_by_name_edge_unicode_characters():
    # Test with unicode characters in name
    contacts_data = {
        "1BitcoinAddress000000000000000012": ("address", "Ålice"),
    }
    db = DummyWalletDB(contacts=contacts_data)
    contacts = Contacts(db)
    codeflash_output = contacts.by_name("ålice"); result = codeflash_output # 2.36μs -> 1.86μs (26.6% faster)


def test_by_name_edge_contact_type_non_address():
    # Test contact with type not 'address'
    contacts_data = {
        "1BitcoinAddress000000000000000014": ("email", "Helen"),
    }
    db = DummyWalletDB(contacts=contacts_data)
    contacts = Contacts(db)
    codeflash_output = contacts.by_name("Helen"); result = codeflash_output # 2.12μs -> 1.52μs (39.4% faster)

def test_by_name_edge_backward_compatibility():
    # Test backward compatibility: name is a valid address, type is 'address'
    contacts_data = {
        "Alice": ("address", "1BitcoinAddress000000000000000015"),
    }
    db = DummyWalletDB(contacts=contacts_data)
    contacts = Contacts(db)
    codeflash_output = contacts.by_name("Alice"); result = codeflash_output # 1.48μs -> 1.05μs (40.3% faster)
    codeflash_output = contacts.by_name("1BitcoinAddress000000000000000015"); result = codeflash_output # 932ns -> 866ns (7.62% faster)

# LARGE SCALE TEST CASES

def test_by_name_large_scale_many_contacts():
    # Test with 1000 contacts
    contacts_data = {}
    for i in range(1000):
        addr = f"1BitcoinAddress{i:021d}"
        name = f"User{i}"
        contacts_data[addr] = ("address", name)
    db = DummyWalletDB(contacts=contacts_data)
    contacts = Contacts(db)
    # Check first, middle, last
    codeflash_output = contacts.by_name("User0") # 2.19μs -> 1.37μs (59.7% faster)
    codeflash_output = contacts.by_name("User500") # 47.8μs -> 26.9μs (78.0% faster)
    codeflash_output = contacts.by_name("User999") # 96.3μs -> 52.4μs (83.8% faster)
    # Check not found
    codeflash_output = contacts.by_name("User1000") # 93.7μs -> 50.5μs (85.7% faster)

def test_by_name_large_scale_performance():
    # Test performance with 1000 contacts (should not timeout)
    contacts_data = {}
    for i in range(1000):
        addr = f"1BitcoinAddress{i:021d}"
        name = f"Person{i}"
        contacts_data[addr] = ("address", name)
    db = DummyWalletDB(contacts=contacts_data)
    contacts = Contacts(db)
    # Query near end
    codeflash_output = contacts.by_name("Person999"); result = codeflash_output # 100μs -> 54.6μs (84.9% faster)

def test_by_name_large_scale_duplicate_names():
    # Test with many contacts sharing the same name
    contacts_data = {}
    for i in range(1000):
        addr = f"1BitcoinAddress{i:021d}"
        contacts_data[addr] = ("address", "SharedName")
    db = DummyWalletDB(contacts=contacts_data)
    contacts = Contacts(db)
    codeflash_output = contacts.by_name("SharedName"); result = codeflash_output # 1.84μs -> 1.33μs (38.6% faster)

def test_by_name_large_scale_long_names():
    # Test with very long names (up to 256 chars)
    long_name = "A" * 256
    contacts_data = {
        "1BitcoinAddress000000000000001000": ("address", long_name)
    }
    db = DummyWalletDB(contacts=contacts_data)
    contacts = Contacts(db)
    codeflash_output = contacts.by_name(long_name); result = codeflash_output # 1.93μs -> 1.48μs (29.7% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

To edit these changes, run `git checkout codeflash/optimize-Contacts.by_name-mhoyhvee` and push.

@codeflash-ai codeflash-ai bot requested a review from mashraf-222 November 7, 2025 14:34
@codeflash-ai codeflash-ai bot added ⚡️ codeflash Optimization PR opened by Codeflash AI 🎯 Quality: High Optimization Quality according to Codeflash labels Nov 7, 2025