@codeflash-ai codeflash-ai bot commented Nov 5, 2025

📄 63% (0.63x) speedup for RedisDBConfig.validate_extra_fields in mem0/configs/vector_stores/redis.py

⏱️ Runtime: 42.4 microseconds → 26.1 microseconds (best of 61 runs)

📝 Explanation and details

The optimization applies field caching to eliminate redundant computation in the validation method.

Key changes:

  • Caches allowed fields: Instead of calling set(cls.model_fields.keys()) on every validation, the code now caches the result as cls._allowed_fields. The cache is a frozenset, which keeps the same average O(1) membership testing as set while making the shared value immutable.
  • One-time computation: The allowed fields are computed only once per class and reused across all subsequent validations.
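
The caching pattern described above can be sketched without Pydantic. This is a minimal illustration under stated assumptions, not the code from this PR: `FakeConfig`, its hand-written `model_fields` dict, the `_get_allowed_fields` helper, and the exact error wording are hypothetical. Only the names `validate_extra_fields`, `model_fields`, and `_allowed_fields` come from the description.

```python
from typing import Any, Dict, FrozenSet


class FakeConfig:
    """Hypothetical stand-in for a Pydantic model; only `model_fields` is mimicked."""

    model_fields: Dict[str, Any] = {
        "redis_url": str,
        "collection_name": str,
        "embedding_model_dims": int,
    }

    @classmethod
    def _get_allowed_fields(cls) -> FrozenSet[str]:
        # Look in cls.__dict__ (not getattr) so a subclass never reuses
        # a cache that was built for its parent class.
        cached = cls.__dict__.get("_allowed_fields")
        if cached is None:
            cached = frozenset(cls.model_fields)
            cls._allowed_fields = cached  # computed once per class
        return cached

    @classmethod
    def validate_extra_fields(cls, values: Dict[str, Any]) -> Dict[str, Any]:
        allowed_fields = cls._get_allowed_fields()
        extra_fields = set(values) - allowed_fields
        if extra_fields:
            raise ValueError(
                f"Extra fields not allowed: {', '.join(sorted(extra_fields))}. "
                f"Allowed fields: {', '.join(sorted(allowed_fields))}"
            )
        return values
```

One trade-off of this design is that the cache is never invalidated: if a class's fields changed after the first validation, the cached set would be stale. Whether that can happen for `RedisDBConfig` is not stated in this PR.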

Why this is faster:

  • Eliminates repeated work: Each validation previously required rebuilding the allowed fields set from cls.model_fields.keys(), which involves dictionary key extraction and set construction.
  • Immutable cache: frozenset has the same average O(1) membership cost as set, so it does not by itself speed up the set difference (input_fields - allowed_fields). Its benefit is that the shared, class-level cache cannot be mutated by accident.
  • Amortized performance: The first call pays the caching cost, but all subsequent calls benefit from the cached result.
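
The difference between rebuilding the allowed set on every call and reusing a cached one can be measured with a small `timeit` script. This is a rough sketch, assuming a three-field model like the one in the tests below; `FIELDS`, `INPUT`, and both function names are made up for the illustration, and absolute timings will vary by machine and Python version.

```python
import timeit

# Hypothetical stand-ins for cls.model_fields and a validator input.
FIELDS = {"redis_url": None, "collection_name": None, "embedding_model_dims": None}
INPUT = {"redis_url": "redis://localhost:6379", "collection_name": "demo"}

# Built once, as the optimized validator would do per class.
CACHED = frozenset(FIELDS)


def rebuild_each_call() -> set:
    # Original pattern: construct the allowed set on every validation.
    return set(INPUT) - set(FIELDS.keys())


def use_cached() -> set:
    # Optimized pattern: reuse the set that was built once.
    return set(INPUT) - CACHED


if __name__ == "__main__":
    n = 200_000
    print("rebuild:", timeit.timeit(rebuild_each_call, number=n))
    print("cached: ", timeit.timeit(use_cached, number=n))
```

Both functions return the same result; only the amount of work per call differs.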

Performance gains by test type:

  • Valid input cases (62-117% faster): These benefit most since they only need the cached lookup without error string construction.
  • Invalid input cases (39-50% faster): Still faster due to cached field lookup, though error message generation limits the speedup.
  • Repeated validation scenarios: Would see even greater benefits as the cache is reused across multiple validations of the same class.

The overall speedup of roughly 62% (42.4 μs down to 26.1 μs) is a meaningful gain for a common validation pattern, and it matters most if this validator runs frequently during model instantiation.
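
The percentages in this report follow from the before and after timings. The helper below is a small sketch of that arithmetic; `percent_faster` is a name chosen for this example, and the inputs are the rounded runtimes quoted above, so results can differ slightly from figures computed on unrounded timings (which may explain the 63% in the title).

```python
def percent_faster(before_us: float, after_us: float) -> float:
    """Return how much faster the new runtime is than the old one, in percent."""
    return (before_us / after_us - 1.0) * 100.0


# Overall runtime reported for this PR.
overall = percent_faster(42.4, 26.1)
print(f"overall: {overall:.1f}% faster")

# One per-test annotation from the generated tests (3.15μs -> 1.56μs).
single = percent_faster(3.15, 1.56)
print(f"test_valid_fields_only: {single:.0f}% faster")
```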

Correctness verification report:

| Test | Status |
|------|--------|
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | 425 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 100.0% |
🌀 Generated Regression Tests and Runtime
from typing import Any, Dict

# imports
import pytest  # used for our unit tests
from mem0.configs.vector_stores.redis import RedisDBConfig
# function to test
from pydantic import BaseModel, ConfigDict, Field, model_validator

# unit tests

# ---------- BASIC TEST CASES ----------

def test_valid_fields_only():
    # All fields present, no extras
    values = {
        "redis_url": "redis://localhost:6379",
        "collection_name": "test_collection",
        "embedding_model_dims": 768
    }
    # Should not raise
    codeflash_output = RedisDBConfig.validate_extra_fields(values) # 3.15μs -> 1.56μs (102% faster)

def test_missing_optional_field():
    # Only required field present, optional fields omitted
    values = {
        "redis_url": "redis://localhost:6379"
    }
    # Should not raise
    codeflash_output = RedisDBConfig.validate_extra_fields(values) # 2.51μs -> 1.34μs (87.7% faster)

def test_missing_all_fields():
    # No fields provided
    values = {}
    # Should not raise, as missing required fields is not checked here
    codeflash_output = RedisDBConfig.validate_extra_fields(values) # 2.57μs -> 1.18μs (117% faster)

def test_valid_fields_with_different_types():
    # Valid fields with different types for optional fields
    values = {
        "redis_url": "redis://localhost:6379",
        "collection_name": 123,
        "embedding_model_dims": "1536"
    }
    # Should not raise (type validation is not done here)
    codeflash_output = RedisDBConfig.validate_extra_fields(values) # 2.44μs -> 1.34μs (81.9% faster)

# ---------- EDGE TEST CASES ----------

def test_single_extra_field():
    # One extra field present
    values = {
        "redis_url": "redis://localhost:6379",
        "collection_name": "test_collection",
        "embedding_model_dims": 768,
        "extra_field": "should fail"
    }
    with pytest.raises(ValueError) as excinfo:
        RedisDBConfig.validate_extra_fields(values) # 4.17μs -> 3.00μs (38.9% faster)

def test_multiple_extra_fields():
    # Multiple extra fields present
    values = {
        "redis_url": "redis://localhost:6379",
        "collection_name": "test_collection",
        "embedding_model_dims": 768,
        "foo": "bar",
        "baz": 123
    }
    with pytest.raises(ValueError) as excinfo:
        RedisDBConfig.validate_extra_fields(values) # 4.45μs -> 3.13μs (42.3% faster)

def test_extra_fields_with_similar_names():
    # Extra field with similar name to allowed
    values = {
        "redis_url": "redis://localhost:6379",
        "collection_name": "test_collection",
        "embedding_model_dims": 768,
        "redis_url_": "typo"
    }
    with pytest.raises(ValueError) as excinfo:
        RedisDBConfig.validate_extra_fields(values) # 4.02μs -> 2.78μs (44.5% faster)

def test_extra_fields_with_empty_string_key():
    # Extra field with empty string as key
    values = {
        "redis_url": "redis://localhost:6379",
        "": "empty"
    }
    with pytest.raises(ValueError) as excinfo:
        RedisDBConfig.validate_extra_fields(values) # 3.88μs -> 2.63μs (47.8% faster)


def test_allowed_fields_with_none_values():
    # Allowed fields with None values
    values = {
        "redis_url": None,
        "collection_name": None,
        "embedding_model_dims": None
    }
    # Should not raise
    codeflash_output = RedisDBConfig.validate_extra_fields(values) # 3.96μs -> 1.94μs (104% faster)

def test_allowed_fields_with_unusual_types():
    # Allowed fields with unusual types
    values = {
        "redis_url": ["redis://localhost:6379"],
        "collection_name": {"name": "test"},
        "embedding_model_dims": (1536,)
    }
    # Should not raise
    codeflash_output = RedisDBConfig.validate_extra_fields(values) # 2.71μs -> 1.42μs (91.5% faster)

def test_extra_field_with_none_value():
    # Extra field with None value
    values = {
        "redis_url": "redis://localhost:6379",
        "extra_field": None
    }
    with pytest.raises(ValueError) as excinfo:
        RedisDBConfig.validate_extra_fields(values) # 4.39μs -> 2.98μs (47.5% faster)

def test_extra_field_with_empty_value():
    # Extra field with empty value
    values = {
        "redis_url": "redis://localhost:6379",
        "": ""
    }
    with pytest.raises(ValueError) as excinfo:
        RedisDBConfig.validate_extra_fields(values) # 4.18μs -> 2.79μs (49.6% faster)

# ---------- LARGE SCALE TEST CASES ----------





#------------------------------------------------
from typing import Any, Dict

# imports
import pytest
from mem0.configs.vector_stores.redis import RedisDBConfig
# function to test
from pydantic import BaseModel, ConfigDict, Field, model_validator

# unit tests

# 1. Basic Test Cases

def test_basic_only_required():
    """Test with only the required field."""
    config = RedisDBConfig(redis_url="redis://localhost:6379")

def test_basic_all_fields():
    """Test with all fields provided."""
    config = RedisDBConfig(
        redis_url="redis://localhost:6379",
        collection_name="test_collection",
        embedding_model_dims=2048
    )


def test_basic_extra_field():
    """Test with an extra field should raise ValueError."""
    with pytest.raises(ValueError) as excinfo:
        RedisDBConfig(
            redis_url="redis://localhost:6379",
            collection_name="foo",
            embedding_model_dims=1536,
            not_a_field="extra"
        )

def test_basic_typo_in_field():
    """Test with a typo in a field name."""
    with pytest.raises(ValueError) as excinfo:
        RedisDBConfig(
            redis_url="redis://localhost:6379",
            collection_nam="typo"  # typo here
        )

# 2. Edge Test Cases


def test_edge_all_fields_default():
    """Test with only required field, others default."""
    config = RedisDBConfig(redis_url="redis://localhost:6379")

def test_edge_field_case_sensitivity():
    """Test with field name in wrong case (should be treated as extra)."""
    with pytest.raises(ValueError) as excinfo:
        RedisDBConfig(
            redis_url="redis://localhost:6379",
            Collection_Name="wrongcase"
        )

def test_edge_multiple_extra_fields():
    """Test with multiple extra fields."""
    with pytest.raises(ValueError) as excinfo:
        RedisDBConfig(
            redis_url="redis://localhost:6379",
            foo="bar",
            baz="qux"
        )
    # Should list all allowed fields
    for field in ["redis_url", "collection_name", "embedding_model_dims"]:
        assert field in str(excinfo.value)

def test_edge_empty_string_field_name():
    """Test with an empty string as a field name."""
    # This is not possible via keyword argument, but possible via dict unpacking
    values = {
        "redis_url": "redis://localhost:6379",
        "": "empty"
    }
    with pytest.raises(ValueError) as excinfo:
        RedisDBConfig(**values)


def test_edge_none_as_field_name():
    """Test with None as a key in the input dict."""
    values = {
        "redis_url": "redis://localhost:6379",
        None: "should_fail"
    }
    with pytest.raises(TypeError):
        RedisDBConfig(**values)  # ** unpacking requires string keys, so None is rejected

def test_edge_numeric_field_name():
    """Test with a numeric field name."""
    values = {
        "redis_url": "redis://localhost:6379",
        123: "number"
    }
    with pytest.raises(TypeError):
        RedisDBConfig(**values)  # ** unpacking requires string keys, so 123 is rejected

def test_edge_field_name_with_space():
    """Test with a field name containing a space."""
    values = {
        "redis_url": "redis://localhost:6379",
        "collection name": "withspace"
    }
    with pytest.raises(ValueError) as excinfo:
        RedisDBConfig(**values)

def test_edge_field_name_with_special_chars():
    """Test with a field name containing special characters."""
    values = {
        "redis_url": "redis://localhost:6379",
        "collection$name": "special"
    }
    with pytest.raises(ValueError) as excinfo:
        RedisDBConfig(**values)

# 3. Large Scale Test Cases

def test_large_scale_many_valid_configs():
    """Test creating many valid configs in a loop."""
    for i in range(100):
        config = RedisDBConfig(
            redis_url=f"redis://localhost:{6379+i}",
            collection_name=f"collection_{i}",
            embedding_model_dims=1536 + i
        )

def test_large_scale_many_invalid_configs():
    """Test creating many configs with extra fields."""
    for i in range(100):
        with pytest.raises(ValueError) as excinfo:
            RedisDBConfig(
                redis_url=f"redis://localhost:{6379+i}",
                collection_name=f"collection_{i}",
                embedding_model_dims=1536 + i,
                extra_field=f"extra_{i}"
            )

def test_large_scale_long_field_names():
    """Test with a very long field name as extra (should fail)."""
    long_field = "x" * 500
    values = {
        "redis_url": "redis://localhost:6379",
        long_field: "value"
    }
    with pytest.raises(ValueError) as excinfo:
        RedisDBConfig(**values)

def test_large_scale_bulk_dict_input():
    """Test bulk creation using dict unpacking with valid fields."""
    for i in range(100):
        values = {
            "redis_url": f"redis://localhost:{6379+i}",
            "collection_name": f"coll_{i}",
            "embedding_model_dims": 1000 + i
        }
        config = RedisDBConfig(**values)

def test_large_scale_bulk_dict_with_extra():
    """Test bulk creation using dict unpacking with an extra field."""
    for i in range(100):
        values = {
            "redis_url": f"redis://localhost:{6379+i}",
            "collection_name": f"coll_{i}",
            "embedding_model_dims": 1000 + i,
            f"extra_{i}": "bad"
        }
        with pytest.raises(ValueError) as excinfo:
            RedisDBConfig(**values)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

To edit these changes, run git checkout codeflash/optimize-RedisDBConfig.validate_extra_fields-mhlco6j1 and push.

@codeflash-ai codeflash-ai bot requested a review from mashraf-222 November 5, 2025 01:59
@codeflash-ai codeflash-ai bot added the labels ⚡️ codeflash (Optimization PR opened by Codeflash AI) and 🎯 Quality: High (Optimization Quality according to Codeflash) Nov 5, 2025