@codeflash-ai codeflash-ai bot commented Nov 5, 2025

📄 24% (0.24x) speedup for ElasticsearchConfig.validate_auth in mem0/configs/vector_stores/elasticsearch.py

⏱️ Runtime: 10.0 microseconds → 8.07 microseconds (best of 244 runs)

📝 Explanation and details

The optimization replaces the `any([values.get("api_key"), (values.get("user") and values.get("password"))])` call with a direct boolean expression, so the check becomes `not (values.get("api_key") or (values.get("user") and values.get("password")))`.

Key optimization:

  • Eliminates list creation: The original code builds a temporary list `[values.get("api_key"), (values.get("user") and values.get("password"))]` and passes it to `any()`. Both elements are evaluated before `any()` runs, and the list itself must be allocated and then iterated.
  • Direct boolean evaluation: The optimized version uses short-circuit evaluation with `or`, which stops at the first truthy operand. When `api_key` is set, the `user`/`password` lookups are skipped.
  • Reduced function call overhead: The `any()` call is removed, and the boolean logic is evaluated inline.
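
The two forms of the check can be compared side by side. This is a minimal sketch, not the mem0 source: `has_auth_original` and `has_auth_optimized` are hypothetical helper names, and `values` stands in for the dict the validator receives.

```python
from itertools import product


def has_auth_original(values: dict) -> bool:
    # Builds a two-element list first, then lets any() scan it.
    return any([values.get("api_key"), (values.get("user") and values.get("password"))])


def has_auth_optimized(values: dict) -> bool:
    # Same conditions evaluated inline; `or` stops at the first truthy operand.
    return bool(values.get("api_key") or (values.get("user") and values.get("password")))


# Compare both forms over every mix of missing, empty, and non-empty values.
candidates = [None, "", "x"]
for api_key, user, password in product(candidates, repeat=3):
    values = {"api_key": api_key, "user": user, "password": password}
    assert has_auth_original(values) == has_auth_optimized(values)
```

The loop covers all 27 combinations, which is the behavioral equivalence the regression tests below exercise through `ElasticsearchConfig` itself.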

Performance impact:
The 24% speedup comes from eliminating the temporary list allocation and the `any()` function call overhead. In the measured run, the runtime of `validate_auth` dropped from 10.0 to 8.07 microseconds.
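
A rough way to reproduce the comparison locally is a `timeit` micro-benchmark of the two expressions in isolation. This is a sketch under stated assumptions: `check_with_any` and `check_with_or` are hypothetical names, the sample `values` dict is invented, and the absolute timings depend on the machine and Python version, so they will not match the figures reported here.

```python
import timeit

# Illustrative input: api_key present, so the `or` form can short-circuit.
values = {"api_key": "key123", "user": None, "password": None}


def check_with_any(v: dict) -> bool:
    # Original form: allocate a list, then call any() on it.
    return not any([v.get("api_key"), (v.get("user") and v.get("password"))])


def check_with_or(v: dict) -> bool:
    # Optimized form: inline boolean expression.
    return not (v.get("api_key") or (v.get("user") and v.get("password")))


n = 100_000
t_any = timeit.timeit(lambda: check_with_any(values), number=n)
t_or = timeit.timeit(lambda: check_with_or(values), number=n)
print(f"any([...]): {t_any / n * 1e9:.0f} ns per call")
print(f"or:         {t_or / n * 1e9:.0f} ns per call")
```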

Test case benefits:
Based on the annotated tests, this optimization is particularly effective for:

  • Basic validation cases where the first condition (`api_key` present) is true: short-circuiting avoids evaluating the second condition entirely
  • Large scale tests creating hundreds of configs: the per-call saving adds up across many validation calls
  • Edge cases with missing authentication: both conditions are still evaluated, but the failure path skips the list allocation and the `any()` call
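
To see where short-circuiting helps and where it does not, the number of `.get()` lookups can be counted for each form. The sketch below is illustrative only: `CountingDict`, `lookups_with_any`, and `lookups_with_or` are hypothetical names that do not exist in mem0.

```python
class CountingDict(dict):
    """Dict that records how many times .get() is called (illustration only)."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.lookups = 0

    def get(self, key, default=None):
        self.lookups += 1
        return super().get(key, default)


def lookups_with_any(data: dict) -> int:
    values = CountingDict(data)
    any([values.get("api_key"), (values.get("user") and values.get("password"))])
    return values.lookups


def lookups_with_or(data: dict) -> int:
    values = CountingDict(data)
    bool(values.get("api_key") or (values.get("user") and values.get("password")))
    return values.lookups


api_key_only = {"api_key": "key123"}
no_auth = {"host": "localhost"}

print(lookups_with_any(api_key_only), lookups_with_or(api_key_only))  # 2 1
print(lookups_with_any(no_auth), lookups_with_or(no_auth))            # 2 2
```

With an `api_key` present, the `or` form does one lookup instead of two. With no authentication at all, both forms do the same two lookups, so any saving on that path comes only from skipping the list and the `any()` call.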

This is a config validation class that likely gets instantiated frequently during application startup or configuration changes, making this micro-optimization worthwhile despite the small absolute time savings.

Correctness verification report:

| Test | Status |
|------|--------|
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | 2363 Passed |
| ⏪ Replay Tests | 18 Passed |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 100.0% |

🌀 Generated Regression Tests and Runtime
from collections.abc import Callable
from typing import Any, Dict, List, Optional

# imports
import pytest  # used for our unit tests
from mem0.configs.vector_stores.elasticsearch import ElasticsearchConfig
from pydantic import BaseModel, Field, model_validator

# unit tests

# ---- BASIC TEST CASES ----

def test_valid_host_user_password():
    # Basic: host + user/password
    cfg = ElasticsearchConfig(host="localhost", user="alice", password="secret")

def test_valid_cloud_id_api_key():
    # Basic: cloud_id + api_key
    cfg = ElasticsearchConfig(cloud_id="cloud123", api_key="key123")

def test_valid_host_api_key():
    # Basic: host + api_key
    cfg = ElasticsearchConfig(host="es.example.com", api_key="key456")

def test_valid_cloud_id_user_password():
    # Basic: cloud_id + user/password
    cfg = ElasticsearchConfig(cloud_id="cloud456", user="bob", password="pass456")

def test_valid_host_and_cloud_id_with_api_key():
    # Basic: both host and cloud_id present, with api_key
    cfg = ElasticsearchConfig(host="localhost", cloud_id="cloud789", api_key="key789")

# ---- EDGE TEST CASES ----

def test_missing_auth_raises():
    # Edge: host present, but no api_key or user/password
    with pytest.raises(ValueError, match="Either api_key or user/password must be provided"):
        ElasticsearchConfig(host="localhost")

def test_missing_host_and_cloud_id_raises():
    # Edge: neither host nor cloud_id present, with api_key
    with pytest.raises(ValueError, match="Either cloud_id or host must be provided"):
        ElasticsearchConfig(api_key="key123")

def test_missing_host_and_cloud_id_with_user_password_raises():
    # Edge: neither host nor cloud_id present, with user/password
    with pytest.raises(ValueError, match="Either cloud_id or host must be provided"):
        ElasticsearchConfig(user="alice", password="secret")

def test_user_without_password_raises():
    # Edge: host present, user present, password missing
    with pytest.raises(ValueError, match="Either api_key or user/password must be provided"):
        ElasticsearchConfig(host="localhost", user="alice")

def test_password_without_user_raises():
    # Edge: host present, password present, user missing
    with pytest.raises(ValueError, match="Either api_key or user/password must be provided"):
        ElasticsearchConfig(host="localhost", password="secret")

def test_empty_user_and_password_raises():
    # Edge: host present, user and password are empty strings
    with pytest.raises(ValueError, match="Either api_key or user/password must be provided"):
        ElasticsearchConfig(host="localhost", user="", password="")

def test_api_key_empty_string_raises():
    # Edge: host present, api_key is empty string
    with pytest.raises(ValueError, match="Either api_key or user/password must be provided"):
        ElasticsearchConfig(host="localhost", api_key="")

def test_cloud_id_empty_string_with_valid_auth():
    # Edge: cloud_id is empty string, host present, valid auth
    cfg = ElasticsearchConfig(host="localhost", cloud_id="", api_key="key123")

def test_host_empty_string_with_valid_auth():
    # Edge: host is empty string, cloud_id present, valid auth
    cfg = ElasticsearchConfig(host="", cloud_id="cloud123", user="alice", password="secret")

def test_both_host_and_cloud_id_missing_and_auth_missing():
    # Edge: both host and cloud_id missing, and no auth
    with pytest.raises(ValueError, match="Either cloud_id or host must be provided"):
        ElasticsearchConfig()


def test_custom_search_query_and_headers_are_ignored_for_auth():
    # Edge: custom_search_query and headers should not affect auth validation
    def dummy_search(q, l, f): return {}
    cfg = ElasticsearchConfig(host="localhost", api_key="key123", custom_search_query=dummy_search, headers={"x": "y"})

# ---- LARGE SCALE TEST CASES ----

def test_large_headers_dict():
    # Large: headers dict with 1000 elements
    headers = {f"header{i}": f"value{i}" for i in range(1000)}
    cfg = ElasticsearchConfig(host="localhost", api_key="key123", headers=headers)

def test_large_custom_search_query_list():
    # Large: custom_search_query returns a large dict, but should not affect auth
    def large_search(q, l, f):
        return {str(i): i for i in range(1000)}
    cfg = ElasticsearchConfig(host="localhost", api_key="key123", custom_search_query=large_search)
    result = cfg.custom_search_query([], 10, None)

def test_many_instances_valid():
    # Large: create 1000 valid configs with different host/api_key
    for i in range(1000):
        cfg = ElasticsearchConfig(host=f"host{i}", api_key=f"key{i}")

def test_many_instances_invalid_auth():
    # Large: create 10 configs with missing auth, all should raise
    for i in range(10):  # Limit to 10 for speed
        with pytest.raises(ValueError, match="Either api_key or user/password must be provided"):
            ElasticsearchConfig(host=f"host{i}")

def test_many_instances_invalid_host_cloud_id():
    # Large: create 10 configs with missing host/cloud_id, all should raise
    for i in range(10):  # Limit to 10 for speed
        with pytest.raises(ValueError, match="Either cloud_id or host must be provided"):
            ElasticsearchConfig(api_key=f"key{i}")

def test_large_string_values():
    # Large: very long string values for host, user, password, api_key
    long_str = "x" * 1000
    cfg = ElasticsearchConfig(host=long_str, user=long_str, password=long_str)

def test_large_collection_name():
    # Large: very long collection_name, should not affect auth
    long_name = "collection_" + "a" * 950
    cfg = ElasticsearchConfig(host="localhost", api_key="key123", collection_name=long_name)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
#------------------------------------------------
from collections.abc import Callable
from typing import Any, Dict, List, Optional

# imports
import pytest  # used for our unit tests
from mem0.configs.vector_stores.elasticsearch import ElasticsearchConfig
from pydantic import BaseModel, Field, model_validator

# unit tests

# ---------------- Basic Test Cases ----------------

def test_valid_host_and_user_password():
    # Basic: host and user/password provided, should pass
    config = ElasticsearchConfig(host="localhost", user="admin", password="secret")

def test_valid_cloud_id_and_api_key():
    # Basic: cloud_id and api_key provided, should pass
    config = ElasticsearchConfig(cloud_id="cloud123", api_key="key123")

def test_valid_host_and_api_key():
    # Basic: host and api_key provided, should pass
    config = ElasticsearchConfig(host="127.0.0.1", api_key="apikey")

def test_valid_cloud_id_and_user_password():
    # Basic: cloud_id and user/password provided, should pass
    config = ElasticsearchConfig(cloud_id="cloud456", user="user", password="pass")

# ---------------- Edge Test Cases ----------------

def test_missing_host_and_cloud_id():
    # Edge: missing both host and cloud_id, should raise ValueError
    with pytest.raises(ValueError, match="Either cloud_id or host must be provided"):
        ElasticsearchConfig(user="admin", password="secret")

def test_missing_authentication():
    # Edge: host provided, but no api_key or user/password, should raise ValueError
    with pytest.raises(ValueError, match="Either api_key or user/password must be provided"):
        ElasticsearchConfig(host="localhost")

def test_host_with_only_user():
    # Edge: host and user provided, but no password, should raise ValueError
    with pytest.raises(ValueError, match="Either api_key or user/password must be provided"):
        ElasticsearchConfig(host="localhost", user="admin")

def test_host_with_only_password():
    # Edge: host and password provided, but no user, should raise ValueError
    with pytest.raises(ValueError, match="Either api_key or user/password must be provided"):
        ElasticsearchConfig(host="localhost", password="secret")

def test_cloud_id_with_only_user():
    # Edge: cloud_id and user provided, but no password, should raise ValueError
    with pytest.raises(ValueError, match="Either api_key or user/password must be provided"):
        ElasticsearchConfig(cloud_id="cloudid", user="admin")

def test_cloud_id_with_only_password():
    # Edge: cloud_id and password provided, but no user, should raise ValueError
    with pytest.raises(ValueError, match="Either api_key or user/password must be provided"):
        ElasticsearchConfig(cloud_id="cloudid", password="secret")

def test_host_and_cloud_id_with_api_key():
    # Edge: both host and cloud_id provided with api_key, should pass
    config = ElasticsearchConfig(host="localhost", cloud_id="cloudid", api_key="key")

def test_host_and_cloud_id_with_user_password():
    # Edge: both host and cloud_id provided with user/password, should pass
    config = ElasticsearchConfig(host="localhost", cloud_id="cloudid", user="user", password="pass")

def test_empty_strings_for_auth_fields():
    # Edge: host provided, user/password are empty strings, should raise ValueError
    with pytest.raises(ValueError, match="Either api_key or user/password must be provided"):
        ElasticsearchConfig(host="localhost", user="", password="")

def test_none_values_for_auth_fields():
    # Edge: host provided, user/password are None, should raise ValueError
    with pytest.raises(ValueError, match="Either api_key or user/password must be provided"):
        ElasticsearchConfig(host="localhost", user=None, password=None)

def test_api_key_empty_string():
    # Edge: host provided, api_key is empty string, should raise ValueError
    with pytest.raises(ValueError, match="Either api_key or user/password must be provided"):
        ElasticsearchConfig(host="localhost", api_key="")

def test_api_key_none():
    # Edge: host provided, api_key is None, should raise ValueError
    with pytest.raises(ValueError, match="Either api_key or user/password must be provided"):
        ElasticsearchConfig(host="localhost", api_key=None)

def test_user_and_password_empty_string_and_none():
    # Edge: host provided, user is empty string, password is None, should raise ValueError
    with pytest.raises(ValueError, match="Either api_key or user/password must be provided"):
        ElasticsearchConfig(host="localhost", user="", password=None)

def test_user_and_password_none_and_empty_string():
    # Edge: host provided, user is None, password is empty string, should raise ValueError
    with pytest.raises(ValueError, match="Either api_key or user/password must be provided"):
        ElasticsearchConfig(host="localhost", user=None, password="")

def test_host_and_api_key_and_user_password():
    # Edge: host provided, both api_key and user/password provided, should pass
    config = ElasticsearchConfig(host="localhost", api_key="key", user="admin", password="secret")

def test_cloud_id_and_api_key_and_user_password():
    # Edge: cloud_id provided, both api_key and user/password provided, should pass
    config = ElasticsearchConfig(cloud_id="cloudid", api_key="key", user="admin", password="secret")

def test_host_and_cloud_id_missing_auth():
    # Edge: host and cloud_id provided, but no authentication, should raise ValueError
    with pytest.raises(ValueError, match="Either api_key or user/password must be provided"):
        ElasticsearchConfig(host="localhost", cloud_id="cloudid")

def test_host_missing_but_cloud_id_and_auth_present():
    # Edge: host missing, but cloud_id and api_key present, should pass
    config = ElasticsearchConfig(cloud_id="cloudid", api_key="key")

def test_host_missing_but_cloud_id_and_user_password_present():
    # Edge: host missing, but cloud_id and user/password present, should pass
    config = ElasticsearchConfig(cloud_id="cloudid", user="admin", password="secret")

# ---------------- Large Scale Test Cases ----------------

def test_large_number_of_configs_with_valid_auth():
    # Large scale: create 500 configs with valid host and user/password, all should pass
    for i in range(500):
        config = ElasticsearchConfig(host=f"host{i}", user=f"user{i}", password=f"pass{i}")

def test_large_number_of_configs_with_valid_cloud_id_and_api_key():
    # Large scale: create 500 configs with valid cloud_id and api_key, all should pass
    for i in range(500):
        config = ElasticsearchConfig(cloud_id=f"cloud{i}", api_key=f"key{i}")

def test_large_number_of_configs_missing_authentication():
    # Large scale: create 100 configs missing authentication, all should raise ValueError
    for i in range(100):
        with pytest.raises(ValueError, match="Either api_key or user/password must be provided"):
            ElasticsearchConfig(host=f"host{i}")

def test_large_number_of_configs_missing_host_and_cloud_id():
    # Large scale: create 100 configs missing host and cloud_id, all should raise ValueError
    for i in range(100):
        with pytest.raises(ValueError, match="Either cloud_id or host must be provided"):
            ElasticsearchConfig(user=f"user{i}", password=f"pass{i}")

def test_large_number_of_configs_with_mixed_auth():
    # Large scale: alternate between valid and invalid configs
    for i in range(100):
        if i % 2 == 0:
            # Valid: host and api_key
            config = ElasticsearchConfig(host=f"host{i}", api_key=f"key{i}")
        else:
            # Invalid: host only
            with pytest.raises(ValueError, match="Either api_key or user/password must be provided"):
                ElasticsearchConfig(host=f"host{i}")
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
⏪ Replay Tests and Runtime
| Test File::Test Function | Original ⏱️ | Optimized ⏱️ | Speedup |
|--------------------------|-------------|--------------|---------|
| `test_pytest_testsvector_storestest_opensearch_py_testsvector_storestest_upstash_vector_py_testsllmstest_l__replay_test_0.py::test_mem0_configs_vector_stores_elasticsearch_ElasticsearchConfig_validate_auth` | 10.0μs | 8.07μs | 24.2%✅ |
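
The speedup column can be sanity-checked from the two rounded runtimes. The formula below (original divided by optimized, minus one) is an assumption about how the percentage is derived, chosen because it is consistent with the "24% (0.24x)" headline. The small gap from the reported 24.2% is presumably due to the timings being rounded for display.

```python
original_us = 10.0   # runtime reported for the original code
optimized_us = 8.07  # runtime reported for the optimized code

# Assumed definition: relative gain of the optimized code over the original.
speedup = original_us / optimized_us - 1
print(f"{speedup:.1%}")  # 23.9%
```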

To edit these changes git checkout codeflash/optimize-ElasticsearchConfig.validate_auth-mhlmpyj3 and push.

@codeflash-ai codeflash-ai bot requested a review from mashraf-222 November 5, 2025 06:41
@codeflash-ai codeflash-ai bot added ⚡️ codeflash Optimization PR opened by Codeflash AI 🎯 Quality: Medium Optimization Quality according to Codeflash labels Nov 5, 2025