Skip to content

Conversation

@mashraf-222
Copy link
Contributor

@mashraf-222 mashraf-222 commented Nov 28, 2025

PR Type

Bug fix, Enhancement


Description

  • Normalize paths for Windows compatibility

  • Use worktree paths in LSP initialization

  • Robust relative path computation with fallbacks

  • Case-insensitive filtering of files


Diagram Walkthrough

flowchart LR
  A["Discovery: filter_functions"] -- "normcase comparisons" --> B["Accurate test/ignore/submodule filtering"]
  C["LSP: initialize_function_optimization"] -- "use worktree paths" --> D["Correct files_inside_context on Windows"]
  E["Optimizer: mirror_path"] -- "resolve + normalized fallback" --> F["Robust relative path mapping"]
Loading

File Walkthrough

Relevant files
Bug fix
functions_to_optimize.py
Case-insensitive path filtering for Windows                           

codeflash/discovery/functions_to_optimize.py

  • Normalize roots and file paths with normcase.
  • Case-insensitive startswith checks for tests, ignore, submodules.
  • Ensure module root filtering uses normalized paths.
+8/-4     
beta.py
Use resolved worktree paths in LSP init                                   

codeflash/lsp/beta.py

  • Use worktree file path instead of document.path.
  • Resolve helper paths before returning file list.
  • Prevent malformed Windows paths (missing drive letter).
+5/-2     
Enhancement
optimizer.py
Robust mirror_path with Windows-safe normalization             

codeflash/optimization/optimizer.py

  • Resolve paths and normalize case before relative_to.
  • Add robust fallback when relative_to fails.
  • Preserve original formatting when extracting relative path.
+42/-1   

@CLAassistant
Copy link

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

@github-actions
Copy link

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

⏱️ Estimated effort to review: 2 🔵🔵⚪⚪⚪
🧪 No relevant tests
🔒 No security concerns identified
⚡ Recommended focus areas for review

Logic Change

The path checks now require a trailing separator for startswith comparisons. Verify behavior for files exactly equal to tests_root, ignore_path, submodule_path, and module_root to ensure they are still classified correctly.

if file_path_normalized.startswith(tests_root_normalized + os.sep):
    test_functions_removed_count += len(_functions)
    continue
if file_path in ignore_paths or any(
    file_path_normalized.startswith(os.path.normcase(str(ignore_path)) + os.sep) for ignore_path in ignore_paths
):
    ignore_paths_removed_count += 1
    continue
if file_path in submodule_paths or any(
    file_path_normalized.startswith(os.path.normcase(str(submodule_path)) + os.sep) for submodule_path in submodule_paths
):
    submodule_ignored_paths_count += 1
    continue
if path_belongs_to_site_packages(Path(file_path)):
    site_packages_removed_count += len(_functions)
    continue
if not file_path_normalized.startswith(module_root_normalized + os.sep):
    non_modules_removed_count += len(_functions)
    continue
Path Source

Switching from document.path to server.optimizer.args.file.resolve() changes which file is reported in files_inside_context. Confirm this aligns with client expectations and multi-file buffers, and that helpers are resolved to existing paths on all platforms.

# Use the worktree file path (which is normalized) instead of document.path
# document.path might be malformed on Windows (missing drive letter)
worktree_file_path = str(server.optimizer.args.file.resolve())
files = [worktree_file_path]

_, _, original_helpers = server.current_optimization_init_result
files.extend([str(helper_path.resolve()) for helper_path in original_helpers])

return {"functionName": params.functionName, "status": "success", "files_inside_context": files}
Error Handling

mirror_path now raises ValueError when paths are not under src_root. Ensure callers handle this exception; consider logging context or returning the original path when mirroring is impossible if that is acceptable.

# Resolve both paths to absolute paths to handle Windows path normalization issues
# This ensures paths with/without drive letters are handled correctly
try:
    path_resolved = path.resolve()
except (OSError, RuntimeError):
    # If resolve fails (e.g., path doesn't exist or is malformed), try absolute()
    path_resolved = path.absolute() if not path.is_absolute() else path

try:
    src_root_resolved = src_root.resolve()
except (OSError, RuntimeError):
    src_root_resolved = src_root.absolute() if not src_root.is_absolute() else src_root

# Normalize paths using os.path.normpath and normcase for cross-platform compatibility
path_str = os.path.normpath(str(path_resolved))
src_root_str = os.path.normpath(str(src_root_resolved))

# Convert to lowercase for case-insensitive comparison on Windows
path_normalized = os.path.normcase(path_str)
src_root_normalized = os.path.normcase(src_root_str)

# Try using Path.relative_to with resolved paths first
try:
    relative_path = path_resolved.relative_to(src_root_resolved)
except ValueError:
    # If relative_to fails, manually extract the relative path using normalized strings
    if path_normalized.startswith(src_root_normalized):
        # Extract relative path manually
        # Use the original path_str to preserve proper path format
        if path_str.startswith(src_root_str):
            relative_str = path_str[len(src_root_str) :].lstrip(os.sep)
            relative_path = Path(relative_str)
        else:
            # Fallback: use normalized paths
            relative_str = path_normalized[len(src_root_normalized) :].lstrip(os.sep)
            relative_path = Path(relative_str)
    else:
        raise ValueError(
            f"Path {path_resolved} (normalized: {path_normalized}) is not relative to "
            f"{src_root_resolved} (normalized: {src_root_normalized})"
        )

return dest_root / relative_path

@github-actions
Copy link

PR Code Suggestions ✨

Explore these optional code suggestions:

CategorySuggestion                                                                                                                                    Impact
Possible issue
Prevent prefix false positives

Guard against missing or extra separators when slicing; without ensuring a trailing
separator on the root, parents like "C:\proj" will match "C:\projectX". Append
os.sep to the normalized roots before startswith/slicing.

codeflash/optimization/optimizer.py [514-528]

-if path_normalized.startswith(src_root_normalized):
-    # Extract relative path manually
-    # Use the original path_str to preserve proper path format
-    if path_str.startswith(src_root_str):
-        relative_str = path_str[len(src_root_str) :].lstrip(os.sep)
+root_norm_with_sep = src_root_normalized.rstrip(os.sep) + os.sep
+path_norm_with_sep = path_normalized
+if path_norm_with_sep.startswith(root_norm_with_sep):
+    if path_str.startswith(src_root_str.rstrip(os.sep) + os.sep):
+        relative_str = path_str[len(src_root_str.rstrip(os.sep) + os.sep) :]
         relative_path = Path(relative_str)
     else:
-        # Fallback: use normalized paths
-        relative_str = path_normalized[len(src_root_normalized) :].lstrip(os.sep)
+        relative_str = path_norm_with_sep[len(root_norm_with_sep) :]
         relative_path = Path(relative_str)
Suggestion importance[1-10]: 8

__

Why: Ensuring a trailing separator when performing prefix checks avoids false positives like 'C:\proj' vs 'C:\projectX'; this is a precise fix in the new manual relative-path logic and meaningfully hardens path handling.

Medium
Normalize path separators too

Normalize separators as well to avoid false negatives when comparing paths that may
contain mixed slashes on Windows. Use os.path.normpath before os.path.normcase for
every path string involved in startswith checks.

codeflash/discovery/functions_to_optimize.py [675-684]

-tests_root_normalized = os.path.normcase(tests_root_str)
-module_root_normalized = os.path.normcase(module_root_str)
+tests_root_normalized = os.path.normcase(os.path.normpath(tests_root_str))
+module_root_normalized = os.path.normcase(os.path.normpath(module_root_str))
 ...
-file_path_normalized = os.path.normcase(file_path)
+file_path_normalized = os.path.normcase(os.path.normpath(file_path))
 if file_path_normalized.startswith(tests_root_normalized + os.sep):
Suggestion importance[1-10]: 7

__

Why: Using os.path.normpath before normcase is a correct, context-aware improvement that prevents mixed-separator false negatives on Windows; it directly applies to the new normalization and startswith checks around lines 675-684.

Medium
Fully normalize ignore path prefixes

Ensure every candidate prefix uses both normpath and normcase; otherwise mixed
separators in ignore_paths can break the prefix check. Apply the same normalization
used for file_path_normalized.

codeflash/discovery/functions_to_optimize.py [686-688]

 if file_path in ignore_paths or any(
-    file_path_normalized.startswith(os.path.normcase(str(ignore_path)) + os.sep) for ignore_path in ignore_paths
+    file_path_normalized.startswith(os.path.normcase(os.path.normpath(str(ignore_path))) + os.sep)
+    for ignore_path in ignore_paths
 ):
Suggestion importance[1-10]: 7

__

Why: Extending normalization to the ignore_paths prefixes aligns with the newly added file_path normalization and reduces Windows path mismatch risks, improving correctness with minimal change.

Medium

@mashraf-222 mashraf-222 requested a review from KRRT7 December 1, 2025 18:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants