⚡️ Speed up function extract_missing_module_from_cause_chain by 40%
#591
📄 40% (0.40x) speedup for extract_missing_module_from_cause_chain in marimo/_runtime/packages/import_error_extractors.py
⏱️ Runtime: 152 microseconds → 108 microseconds (best of 5 runs)
📝 Explanation and details
The optimized code achieves a 40% speedup by replacing per-iteration introspection calls with cheaper type checks when traversing exception cause chains.
Key Optimizations:
type() identity check instead of isinstance(): Replaces isinstance(current, ModuleNotFoundError) with type(current) is ModuleNotFoundErrorType. The identity check (is) compares two type objects directly, while isinstance() is a function call that must also account for subclasses. The difference is paid once per link, so it adds up on deep cause chains.
Eliminates hasattr() call: Removes the hasattr(current, "name") check, since ModuleNotFoundError always has a name attribute by design. This saves an extra attribute lookup.
Direct attribute access: Accesses current.name directly and stores it in a local variable, avoiding repeated attribute lookups within the conditional logic.
Performance Impact by Test Case:
Per the generated-test timings below, single-exception inputs improve by roughly 5% to 26%, while chains of 100 to 500 exceptions improve by roughly 40% to 49%, which the report attributes to reduced isinstance() and hasattr() overhead.
Why This Works:
The original code used defensive programming with isinstance() and hasattr(), but ModuleNotFoundError is a built-in exception type that's rarely subclassed. The type() is check returns the same result whenever the chain contains no ModuleNotFoundError subclasses (an instance of a subclass is no longer matched), and it is measurably faster in the timings below, especially when traversing long exception chains during import resolution, where this function likely operates in hot paths.
✅ Correctness verification report:
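To make the explanation above concrete, here is a minimal sketch of the two loop shapes it describes. Both functions are reconstructed from the prose, not copied from marimo's source: the names extract_with_isinstance, extract_with_type_identity and VendoredModuleNotFoundError are hypothetical, and the sketch assumes the function walks __cause__ and returns the first truthy name. The final lines illustrate the one behavioral difference a reviewer may want to weigh: an exact-type check skips subclasses of ModuleNotFoundError.

```python
from typing import Optional


def extract_with_isinstance(error: BaseException) -> Optional[str]:
    """Hypothetical reconstruction of the pre-optimization loop."""
    current: Optional[BaseException] = error
    while current is not None:
        if (
            isinstance(current, ModuleNotFoundError)
            and hasattr(current, "name")
            and current.name
        ):
            return current.name
        current = current.__cause__
    return None


def extract_with_type_identity(error: BaseException) -> Optional[str]:
    """Hypothetical reconstruction of the optimized loop."""
    current: Optional[BaseException] = error
    while current is not None:
        if type(current) is ModuleNotFoundError:
            name = current.name  # single attribute lookup, kept in a local
            if name:
                return name
        current = current.__cause__
    return None


class VendoredModuleNotFoundError(ModuleNotFoundError):
    """Hypothetical subclass, used only to show the difference."""


# Plain case: ImportError wrapping a built-in ModuleNotFoundError.
inner = ModuleNotFoundError("No module named 'foo'", name="foo")
outer = ImportError("wrapped")
outer.__cause__ = inner

# Subclass case: matched by isinstance(), skipped by the identity check.
sub = VendoredModuleNotFoundError("No module named 'bar'", name="bar")

plain_results = (
    extract_with_isinstance(outer),
    extract_with_type_identity(outer),
)
subclass_results = (
    extract_with_isinstance(sub),
    extract_with_type_identity(sub),
)
```

Under these assumptions the two versions agree on the plain case and diverge only when a subclass instance appears in the chain, so whether the change is safe depends on whether any caller raises such subclasses.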
⚙️ Existing Unit Tests and Runtime
_runtime/packages/test_import_error_extractors.py::test_extract_missing_module_from_cause_chain_direct
_runtime/packages/test_import_error_extractors.py::test_extract_missing_module_from_cause_chain_nested
_runtime/packages/test_import_error_extractors.py::test_extract_missing_module_from_cause_chain_no_module
_runtime/packages/test_import_error_extractors.py::test_extract_missing_module_from_cause_chain_with_cause
🌀 Generated Regression Tests and Runtime
import pytest
from marimo._runtime.packages.import_error_extractors import extract_missing_module_from_cause_chain

# ----------------- UNIT TESTS -----------------

# 1. Basic Test Cases

def test_direct_modulenotfounderror_with_name():
    # Direct ModuleNotFoundError with a name
    err = ModuleNotFoundError("No module named 'foo'")
    err.name = "foo"
    codeflash_output = extract_missing_module_from_cause_chain(err); result = codeflash_output # 886ns -> 709ns (25.0% faster)

def test_importerror_wrapping_modulenotfounderror():
    # ImportError wrapping a ModuleNotFoundError with a name
    mnfe = ModuleNotFoundError("No module named 'bar'")
    mnfe.name = "bar"
    err = ImportError("wrapped error")
    err.__cause__ = mnfe
    codeflash_output = extract_missing_module_from_cause_chain(err); result = codeflash_output # 1.16μs -> 888ns (30.1% faster)

def test_importerror_with_no_cause_returns_none():
    # ImportError with no cause should return None
    err = ImportError("no cause")
    codeflash_output = extract_missing_module_from_cause_chain(err); result = codeflash_output # 784ns -> 750ns (4.53% faster)

def test_importerror_wrapping_importerror_no_modulenotfound():
    # ImportError wrapping another ImportError, no ModuleNotFoundError in chain
    inner = ImportError("inner")
    outer = ImportError("outer")
    outer.__cause__ = inner
    codeflash_output = extract_missing_module_from_cause_chain(outer); result = codeflash_output # 935ns -> 869ns (7.59% faster)
# 2. Edge Test Cases

def test_modulenotfounderror_with_empty_name():
    # ModuleNotFoundError with empty name should return None
    mnfe = ModuleNotFoundError("No module named ''")
    mnfe.name = ""
    codeflash_output = extract_missing_module_from_cause_chain(mnfe); result = codeflash_output # 962ns -> 813ns (18.3% faster)

def test_modulenotfounderror_with_none_name():
    # ModuleNotFoundError with None as name should return None
    mnfe = ModuleNotFoundError("No module named None")
    mnfe.name = None
    codeflash_output = extract_missing_module_from_cause_chain(mnfe); result = codeflash_output # 920ns -> 820ns (12.2% faster)

def test_importerror_with_long_cause_chain_finds_module():
    # ImportError -> ImportError -> ModuleNotFoundError (with name)
    mnfe = ModuleNotFoundError("No module named 'baz'")
    mnfe.name = "baz"
    err2 = ImportError("level 2")
    err2.__cause__ = mnfe
    err1 = ImportError("level 1")
    err1.__cause__ = err2
    codeflash_output = extract_missing_module_from_cause_chain(err1); result = codeflash_output # 1.18μs -> 986ns (19.8% faster)

def test_importerror_with_cause_chain_no_modulenotfounderror():
    # Long chain with no ModuleNotFoundError
    err3 = ImportError("level 3")
    err2 = ImportError("level 2")
    err2.__cause__ = err3
    err1 = ImportError("level 1")
    err1.__cause__ = err2
    codeflash_output = extract_missing_module_from_cause_chain(err1); result = codeflash_output # 982ns -> 804ns (22.1% faster)

def test_modulenotfounderror_missing_name_attribute():
    # ModuleNotFoundError with no 'name' attribute (simulate by deleting)
    mnfe = ModuleNotFoundError("No module named 'qux'")
    if hasattr(mnfe, "name"):
        del mnfe.name
    codeflash_output = extract_missing_module_from_cause_chain(mnfe); result = codeflash_output # 895ns -> 803ns (11.5% faster)

def test_importerror_with_modulenotfounderror_without_name_attribute():
    # ModuleNotFoundError with no name attribute (simulate by deleting)
    mnfe = ModuleNotFoundError("No module named 'ghost'")
    if hasattr(mnfe, "name"):
        del mnfe.name
    err = ImportError("wraps ghost")
    err.__cause__ = mnfe
    codeflash_output = extract_missing_module_from_cause_chain(err); result = codeflash_output # 1.53μs -> 1.23μs (24.2% faster)

def test_importerror_with_modulenotfounderror_name_is_falsey_but_nonempty():
    # ModuleNotFoundError with name set to False (should skip)
    mnfe = ModuleNotFoundError("No module named 'falsey'")
    mnfe.name = False
    err = ImportError("wraps falsey")
    err.__cause__ = mnfe
    codeflash_output = extract_missing_module_from_cause_chain(err); result = codeflash_output # 1.21μs -> 1.05μs (15.3% faster)
# 3. Large Scale Test Cases

def test_long_cause_chain_finds_first_modulenotfounderror():
    # Long chain: ImportError -> ... -> ModuleNotFoundError (with name)
    chain_length = 100
    mnfe = ModuleNotFoundError("No module named 'bigmod'")
    mnfe.name = "bigmod"
    # Build chain: ImportError -> ImportError -> ... -> mnfe
    prev = mnfe
    for i in range(chain_length):
        err = ImportError(f"level {i}")
        err.__cause__ = prev
        prev = err
    # Now prev is the head of the chain
    codeflash_output = extract_missing_module_from_cause_chain(prev); result = codeflash_output # 6.30μs -> 4.34μs (45.1% faster)

def test_long_cause_chain_with_multiple_modulenotfounderrors_returns_first():
    # Chain: err0 -> err2 -> MNFE1 (name='foo') -> err1 -> MNFE2 (name='bar')
    mnfe2 = ModuleNotFoundError("No module named 'bar'")
    mnfe2.name = "bar"
    err1 = ImportError("level 1")
    err1.__cause__ = mnfe2  # originally pointed back at mnfe1, leaving mnfe2 unreachable
    mnfe1 = ModuleNotFoundError("No module named 'foo'")
    mnfe1.name = "foo"
    mnfe1.__cause__ = err1
    err2 = ImportError("level 2")
    err2.__cause__ = mnfe1
    err0 = ImportError("top")
    err0.__cause__ = err2
    # The walk stops at mnfe1, the first ModuleNotFoundError with a name
    codeflash_output = extract_missing_module_from_cause_chain(err0); result = codeflash_output # 1.27μs -> 922ns (38.2% faster)

def test_large_chain_with_no_modulenotfounderror():
    # Long chain of ImportErrors, no ModuleNotFoundError
    chain_length = 500
    prev = ImportError("tail")
    for i in range(chain_length):
        err = ImportError(f"level {i}")
        err.__cause__ = prev
        prev = err
    codeflash_output = extract_missing_module_from_cause_chain(prev); result = codeflash_output # 26.4μs -> 17.7μs (48.5% faster)

def test_large_chain_with_modulenotfounderror_at_end():
    # Long chain, ModuleNotFoundError at the very end
    chain_length = 250
    mnfe = ModuleNotFoundError("No module named 'deepmod'")
    mnfe.name = "deepmod"
    prev = mnfe
    for i in range(chain_length):
        err = ImportError(f"level {i}")
        err.__cause__ = prev
        prev = err
    codeflash_output = extract_missing_module_from_cause_chain(prev); result = codeflash_output # 14.0μs -> 9.46μs (47.8% faster)
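
The per-test timings above could be approximated locally with a timeit micro-benchmark along the following lines. This is a sketch under assumptions, not Codeflash's measurement harness: walk_isinstance and walk_type_identity are hypothetical stand-ins reconstructed from the PR description, build_chain is an illustrative helper, and absolute numbers will vary by machine and Python version.

```python
import timeit
from typing import Optional


def walk_isinstance(error: BaseException) -> Optional[str]:
    # Assumed shape of the original loop: isinstance() plus hasattr().
    current: Optional[BaseException] = error
    while current is not None:
        if (
            isinstance(current, ModuleNotFoundError)
            and hasattr(current, "name")
            and current.name
        ):
            return current.name
        current = current.__cause__
    return None


def walk_type_identity(error: BaseException) -> Optional[str]:
    # Assumed shape of the optimized loop: exact-type check, one name lookup.
    current: Optional[BaseException] = error
    while current is not None:
        if type(current) is ModuleNotFoundError:
            name = current.name
            if name:
                return name
        current = current.__cause__
    return None


def build_chain(length: int) -> BaseException:
    # `length` ImportError wrappers around one ModuleNotFoundError at the tail.
    prev: BaseException = ModuleNotFoundError(
        "No module named 'deepmod'", name="deepmod"
    )
    for i in range(length):
        err = ImportError(f"level {i}")
        err.__cause__ = prev
        prev = err
    return prev


chain = build_chain(250)
calls = 200
t_isinstance = timeit.timeit(lambda: walk_isinstance(chain), number=calls)
t_identity = timeit.timeit(lambda: walk_type_identity(chain), number=calls)
print(f"isinstance(): {t_isinstance / calls * 1e6:.2f} us per call")
print(f"type() is:    {t_identity / calls * 1e6:.2f} us per call")
```

Long chains are the useful case to time, because the per-link check is the only work that scales with chain length.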
#------------------------------------------------
from __future__ import annotations

# imports
import pytest
from marimo._runtime.packages.import_error_extractors import extract_missing_module_from_cause_chain

# unit tests

# ----------- Basic Test Cases -----------

def test_direct_modulenotfounderror():
    # Direct ModuleNotFoundError with name set
    err = ModuleNotFoundError("No module named 'foo'")
    err.name = "foo"
    codeflash_output = extract_missing_module_from_cause_chain(err) # 891ns -> 705ns (26.4% faster)

def test_importerror_wrapping_modulenotfounderror():
    # ImportError wrapping ModuleNotFoundError with name set
    cause = ModuleNotFoundError("No module named 'bar'")
    cause.name = "bar"
    err = ImportError("Import failed") # wrap cause
    err.__cause__ = cause
    codeflash_output = extract_missing_module_from_cause_chain(err) # 1.25μs -> 920ns (35.5% faster)

def test_importerror_no_cause():
    # ImportError with no cause
    err = ImportError("No module named 'baz'")
    codeflash_output = extract_missing_module_from_cause_chain(err) # 800ns -> 755ns (5.96% faster)

def test_importerror_with_non_modulenotfounderror_cause():
    # ImportError with a ValueError as cause
    cause = ValueError("Some other error")
    err = ImportError("Import failed")
    err.__cause__ = cause
    codeflash_output = extract_missing_module_from_cause_chain(err) # 1.36μs -> 1.03μs (32.8% faster)

def test_importerror_with_modulenotfounderror_no_name():
    # ModuleNotFoundError with name attribute missing or None
    cause = ModuleNotFoundError("No module named 'qux'")
    cause.name = None
    err = ImportError("Import failed")
    err.__cause__ = cause
    codeflash_output = extract_missing_module_from_cause_chain(err) # 1.18μs -> 1.01μs (17.1% faster)
# ----------- Edge Test Cases -----------

def test_deep_cause_chain():
    # ImportError -> ValueError -> ModuleNotFoundError with name set
    mnfe = ModuleNotFoundError("No module named 'deepmod'")
    mnfe.name = "deepmod"
    ve = ValueError("Intermediate error")
    ve.__cause__ = mnfe
    err = ImportError("Import failed")
    err.__cause__ = ve
    codeflash_output = extract_missing_module_from_cause_chain(err) # 1.21μs -> 959ns (26.4% faster)

def test_cause_chain_with_multiple_modulenotfounderrors():
    # ImportError -> MNFE1 (name=None) -> MNFE2 (name set)
    mnfe2 = ModuleNotFoundError("No module named 'realmod'")
    mnfe2.name = "realmod"
    mnfe1 = ModuleNotFoundError("No module named 'fakemod'")
    mnfe1.name = None
    mnfe1.__cause__ = mnfe2
    err = ImportError("Import failed")
    err.__cause__ = mnfe1
    codeflash_output = extract_missing_module_from_cause_chain(err) # 1.31μs -> 1.04μs (26.3% faster)

def test_cause_chain_with_cycle():
    # Create a cycle in the cause chain: err -> mnfe -> err
    mnfe = ModuleNotFoundError("No module named 'cyclemod'")
    mnfe.name = "cyclemod"
    err = ImportError("Import failed")
    err.__cause__ = mnfe
    mnfe.__cause__ = err # cycle
    # Should not infinite loop, but will return on first MNFE
    codeflash_output = extract_missing_module_from_cause_chain(err) # 1.11μs -> 759ns (46.2% faster)

def test_modulenotfounderror_without_name_attribute():
    # Remove 'name' attribute from ModuleNotFoundError
    mnfe = ModuleNotFoundError("No module named 'noname'")
    if hasattr(mnfe, "name"):
        delattr(mnfe, "name")
    err = ImportError("Import failed")
    err.__cause__ = mnfe
    codeflash_output = extract_missing_module_from_cause_chain(err) # 1.15μs -> 1.00μs (14.4% faster)

def test_importerror_with_none_cause():
    # ImportError with cause explicitly set to None
    err = ImportError("Import failed")
    err.__cause__ = None
    codeflash_output = extract_missing_module_from_cause_chain(err) # 744ns -> 667ns (11.5% faster)

def test_modulenotfounderror_with_empty_name():
    # ModuleNotFoundError with name set to empty string
    mnfe = ModuleNotFoundError("No module named ''")
    mnfe.name = ""
    err = ImportError("Import failed")
    err.__cause__ = mnfe
    codeflash_output = extract_missing_module_from_cause_chain(err) # 1.18μs -> 1.01μs (16.9% faster)
# ----------- Large Scale Test Cases -----------

def test_large_cause_chain_performance():
    # Create a chain of 500 ValueErrors, ending with ModuleNotFoundError
    chain_length = 500
    mnfe = ModuleNotFoundError("No module named 'largemod'")
    mnfe.name = "largemod"
    prev = mnfe
    for i in range(chain_length):
        ve = ValueError(f"Error {i}")
        ve.__cause__ = prev
        prev = ve
    err = ImportError("Import failed")
    err.__cause__ = prev
    codeflash_output = extract_missing_module_from_cause_chain(err) # 26.8μs -> 18.3μs (46.5% faster)

def test_large_cause_chain_no_modulenotfounderror():
    # Chain of 500 ValueErrors, no ModuleNotFoundError at all
    chain_length = 500
    prev = None
    for i in range(chain_length):
        ve = ValueError(f"Error {i}")
        ve.__cause__ = prev
        prev = ve
    err = ImportError("Import failed")
    err.__cause__ = prev
    codeflash_output = extract_missing_module_from_cause_chain(err) # 26.5μs -> 18.3μs (44.4% faster)

def test_large_cause_chain_multiple_modulenotfounderrors():
    # Chain with two MNFEs, both with a name set; the first one reached should win
    chain_length = 250
    mnfe1 = ModuleNotFoundError("No module named 'firstmod'")
    mnfe1.name = "firstmod"
    mnfe2 = ModuleNotFoundError("No module named 'secondmod'")
    mnfe2.name = "secondmod"
    mnfe1.__cause__ = mnfe2
    prev = mnfe1
    for i in range(chain_length):
        ve = ValueError(f"Error {i}")
        ve.__cause__ = prev
        prev = ve
    err = ImportError("Import failed")
    err.__cause__ = prev
    # Should return the first MNFE with a name, i.e., 'firstmod'
    codeflash_output = extract_missing_module_from_cause_chain(err) # 13.8μs -> 9.31μs (48.1% faster)

def test_large_cause_chain_modulenotfounderror_with_empty_name():
    # Chain of 100 ValueErrors, MNFE with empty name at end
    chain_length = 100
    mnfe = ModuleNotFoundError("No module named ''")
    mnfe.name = ""
    prev = mnfe
    for i in range(chain_length):
        ve = ValueError(f"Error {i}")
        ve.__cause__ = prev
        prev = ve
    err = ImportError("Import failed")
    err.__cause__ = prev
    codeflash_output = extract_missing_module_from_cause_chain(err) # 6.26μs -> 4.45μs (40.6% faster)

# ----------- Miscellaneous -----------

def test_non_importerror_input():
    # Function expects ImportError, but what if passed ValueError?
    ve = ValueError("Not an import error")
    codeflash_output = extract_missing_module_from_cause_chain(ve) # 744ns -> 702ns (5.98% faster)

def test_modulenotfounderror_with_non_string_name():
    # MNFE with name set to an integer
    mnfe = ModuleNotFoundError("No module named '123'")
    mnfe.name = 123
    err = ImportError("Import failed")
    err.__cause__ = mnfe
    # Should return the integer value, since it only checks for truthiness
    codeflash_output = extract_missing_module_from_cause_chain(err) # 1.16μs -> 888ns (30.6% faster)

# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
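
The kind of comparison that codeflash_output supports can be sketched as a small harness that runs two implementations over the same inputs and records any disagreement. This is an illustration only and not Codeflash's actual mechanism: reference, candidate, outcome and wrap are hypothetical names, and the two extractors are reconstructed from the PR description. The deleted-name case mirrors the edge tests above.

```python
from typing import Callable, Dict, List, Optional, Tuple

Extractor = Callable[[BaseException], Optional[str]]


def reference(error: BaseException) -> Optional[str]:
    # Assumed original behavior: isinstance() + hasattr() + truthy name.
    current: Optional[BaseException] = error
    while current is not None:
        if (
            isinstance(current, ModuleNotFoundError)
            and hasattr(current, "name")
            and current.name
        ):
            return current.name
        current = current.__cause__
    return None


def candidate(error: BaseException) -> Optional[str]:
    # Assumed optimized behavior: exact-type check, direct name access.
    current: Optional[BaseException] = error
    while current is not None:
        if type(current) is ModuleNotFoundError:
            name = current.name
            if name:
                return name
        current = current.__cause__
    return None


def outcome(func: Extractor, error: BaseException) -> Tuple[str, object]:
    # Record either the return value or the type of exception raised.
    try:
        return ("returned", func(error))
    except Exception as exc:
        return ("raised", type(exc).__name__)


def wrap(cause: BaseException) -> ImportError:
    # Put `cause` behind a generic ImportError, as the tests above do.
    err = ImportError("Import failed")
    err.__cause__ = cause
    return err


deleted = ModuleNotFoundError("No module named 'ghost'")
if hasattr(deleted, "name"):
    del deleted.name

cases: Dict[str, BaseException] = {
    "named": wrap(ModuleNotFoundError("No module named 'foo'", name="foo")),
    "name is None": wrap(ModuleNotFoundError("No module named 'qux'")),
    "empty name": wrap(ModuleNotFoundError("No module named ''", name="")),
    "deleted name": wrap(deleted),
    "no cause": ImportError("no cause"),
    "non-import cause": wrap(ValueError("Some other error")),
}

# Labels of inputs on which the two implementations disagree.
mismatches: List[str] = [
    label
    for label, err in cases.items()
    if outcome(reference, err) != outcome(candidate, err)
]
print("mismatches:", mismatches)
```

Capturing exceptions as outcomes, rather than letting them propagate, means a candidate that starts raising on some input is reported as a mismatch instead of aborting the comparison.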
#------------------------------------------------
from marimo._runtime.packages.import_error_extractors import extract_missing_module_from_cause_chain

def test_extract_missing_module_from_cause_chain():
    extract_missing_module_from_cause_chain(ImportError())
🔎 Concolic Coverage Tests and Runtime
codeflash_concolic_bps3n5s8/tmpqc3okoyk/test_concolic_coverage.py::test_extract_missing_module_from_cause_chain
To edit these changes, git checkout codeflash/optimize-extract_missing_module_from_cause_chain-mhv3txqa and push.