Commit 574e1b5

Merge branch 'main' of github.com:codeflash-ai/codeflash into init/install-vscode-extension
2 parents dc22b4a + 48e88b7 commit 574e1b5

33 files changed, +2822 -1209 lines changed

.gitignore

Lines changed: 1 addition & 0 deletions
@@ -254,3 +254,4 @@ fabric.properties
 
 # Mac
 .DS_Store
+WARP.MD

AGENTS.md

Lines changed: 318 additions & 0 deletions
@@ -0,0 +1,318 @@
# CodeFlash AI Agent Instructions

This file provides comprehensive guidance to any coding agent (Warp, GitHub Copilot, Claude, Gemini, etc.) when working with the CodeFlash repository.

## Project Overview

CodeFlash is an AI-powered Python code optimizer that automatically improves code performance while maintaining correctness. It uses LLMs to analyze code, generate optimization ideas, validate correctness through comprehensive testing, benchmark performance improvements, and create merge-ready pull requests.

**Key Capabilities:**
- Optimize entire codebases with `codeflash --all`
- Optimize specific files or functions with targeted commands
- End-to-end workflow optimization with `codeflash optimize script.py`
- Automated GitHub Actions integration for CI/CD pipelines
- Comprehensive benchmarking and performance analysis
- Git worktree isolation for safe optimization

## Core Architecture

### Data Flow Pipeline
Discovery → Context → Optimization → Verification → Benchmarking → PR

1. **Discovery** (`codeflash/discovery/`) - Find optimizable functions via static analysis or execution tracing
2. **Context Extraction** (`codeflash/context/`) - Extract dependencies, imports, and related code
3. **Optimization** (`codeflash/optimization/`) - Generate optimized code via AI service calls
4. **Verification** (`codeflash/verification/`) - Run deterministic tests with custom pytest plugin
5. **Benchmarking** (`codeflash/benchmarking/`) - Performance measurement and comparison
6. **GitHub Integration** (`codeflash/github/`) - Automated PR creation with detailed analysis

### Key Components

**Main Entry Points:**
- `codeflash/main.py` - CLI entry point and main orchestration
- `codeflash/cli_cmds/cli.py` - Command-line argument parsing and validation

**Core Optimization Pipeline:**
- `codeflash/optimization/optimizer.py` - Main optimization orchestrator
- `codeflash/optimization/function_optimizer.py` - Individual function optimization
- `codeflash/tracing/` - Function call tracing and profiling

**Code Analysis & Manipulation:**
- `codeflash/code_utils/` - Code parsing, AST manipulation, static analysis
- `codeflash/context/` - Code context extraction and analysis
- `codeflash/verification/` - Code correctness verification through testing

**External Integrations:**
- `codeflash/api/aiservice.py` - LLM communication with rate limiting and retries
- `codeflash/github/` - GitHub integration for PR creation
- `codeflash/benchmarking/` - Performance benchmarking and measurement

**Supporting Systems:**
- `codeflash/models/models.py` - Pydantic models and type definitions
- `codeflash/telemetry/` - Usage analytics (PostHog) and error reporting (Sentry)
- `codeflash/ui/` - User interface components (Rich console output)
- `codeflash/lsp/` - Language Server Protocol support for IDE integration

### Key Optimization Workflows

**1. Full Codebase Optimization (`--all`)**
- Discovers all optimizable functions in the project
- Runs benchmarks if configured
- Optimizes functions in parallel
- Creates PRs for successful optimizations

**2. Targeted Optimization (`--file`, `--function`)**
- Focuses on specific files or functions
- Performs detailed analysis and context extraction
- Applies targeted optimizations

**3. Workflow Tracing (`optimize`)**
- Traces Python script execution
- Identifies performance bottlenecks
- Generates optimizations for traced functions
- Uses checkpoint system to resume interrupted runs

## Critical Development Patterns

### Package Management with uv (NOT pip)
```bash
# Always use uv, never pip
uv sync              # Install dependencies
uv sync --group dev  # Install dev dependencies
uv run pytest        # Run commands
uv add package       # Add new packages
uv build             # Build package
```

### Code Manipulation with LibCST (NOT ast)
Always use `libcst` for code parsing/modification to preserve formatting:
```python
from libcst import parse_module  # lossless parser: comments and whitespace survive
# Never use the ast module for code transformations
```

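To make the motivation concrete, the following minimal sketch round-trips a one-line snippet through the standard-library `ast` module and shows that the comment does not survive. The `libcst` half runs only if the package is installed. The sample source string is invented for illustration.

```python
import ast

SOURCE = "x = 1  # keep this comment\n"

# ast discards comments and original spacing when it regenerates code.
ast_round_trip = ast.unparse(ast.parse(SOURCE))
print(repr(ast_round_trip))

try:
    import libcst as cst
except ImportError:  # libcst is optional for this sketch
    cst = None

if cst is not None:
    # LibCST keeps every token, so the regenerated code matches the input.
    cst_round_trip = cst.parse_module(SOURCE).code
    print(cst_round_trip == SOURCE)
```

Any transformation built on `ast` would therefore rewrite a user's file without its comments, which is why a concrete syntax tree is the safer basis for automated edits.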
### Testing with Deterministic Execution
Custom pytest plugin (`codeflash/verification/pytest_plugin.py`) ensures reproducible tests:
- Patches time, random, uuid for deterministic behavior
- Environment variables: `CODEFLASH_TEST_MODULE`, `CODEFLASH_TEST_CLASS`, `CODEFLASH_TEST_FUNCTION`
- Always use `uv run pytest`, never `python -m pytest`

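As a rough illustration of what pinning these sources of nondeterminism can look like, here is a minimal sketch that uses only the standard library. It is not the code in `pytest_plugin.py`; `run_deterministically`, `FIXED_TIME`, and `FIXED_UUID` are names made up for this example.

```python
import random
import time
import uuid
from unittest import mock

FIXED_TIME = 1_700_000_000.0  # arbitrary timestamp chosen for this sketch
FIXED_UUID = uuid.UUID("12345678-1234-5678-1234-567812345678")


def run_deterministically(func):
    """Call func with time, random, and uuid pinned to fixed values."""
    random.seed(0)  # same pseudo-random sequence on every call
    with mock.patch("time.time", return_value=FIXED_TIME):
        with mock.patch("uuid.uuid4", return_value=FIXED_UUID):
            return func()


def sample():
    # Each of these values would normally differ between runs.
    return time.time(), random.random(), str(uuid.uuid4())


first = run_deterministically(sample)
second = run_deterministically(sample)
print(first == second)
```

With every nondeterministic input fixed, two runs of the same function can be compared value for value, which is the property a before/after correctness check relies on.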
### Git Worktree Isolation
Optimizations run in isolated git worktrees to avoid affecting main repo:
```python
from codeflash.code_utils.git_utils import create_detached_worktree, remove_worktree
# Pattern: create_detached_worktree() → optimize → create_diff_patch_from_worktree()
```

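For readers unfamiliar with worktrees, the underlying git commands can be sketched as below. This is an assumption-laden illustration built on the plain `git worktree` CLI, not the implementation in `git_utils.py`; the helper names `worktree_add_cmd`, `worktree_remove_cmd`, and `with_worktree` are hypothetical.

```python
import subprocess
from pathlib import Path


def worktree_add_cmd(path: Path, ref: str = "HEAD") -> list[str]:
    """Build the git command that creates a detached worktree at path."""
    return ["git", "worktree", "add", "--detach", str(path), ref]


def worktree_remove_cmd(path: Path) -> list[str]:
    """Build the git command that deletes the worktree at path."""
    return ["git", "worktree", "remove", "--force", str(path)]


def with_worktree(repo: Path, path: Path, work) -> None:
    """Run work(path) inside a temporary detached worktree of repo."""
    subprocess.run(worktree_add_cmd(path), cwd=repo, check=True)
    try:
        work(path)
    finally:
        # Always clean up, even if the work step raised.
        subprocess.run(worktree_remove_cmd(path), cwd=repo, check=True)


print(worktree_add_cmd(Path("scratch-worktree")))
```

Because the worktree is a separate checkout, edits made inside `work` never touch the files in the main working directory.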
### Error Handling with Either Pattern
Use functional error handling instead of exceptions:
```python
from codeflash.either import is_successful, Either
result = aiservice_client.call_llm(...)
if is_successful(result):
    optimized_code = result.value
else:
    error = result.error
```

118+
## Configuration
119+
120+
All configuration in `pyproject.toml` under `[tool.codeflash]`:
121+
```toml
122+
[tool.codeflash]
123+
module-root = "codeflash" # Source code location
124+
tests-root = "tests" # Test directory
125+
benchmarks-root = "tests/benchmarks" # Benchmark tests
126+
test-framework = "pytest" # Always pytest
127+
formatter-cmds = [ # Auto-formatting commands
128+
"uvx ruff check --exit-zero --fix $file",
129+
"uvx ruff format $file",
130+
]
131+
```
132+
133+
## Development Commands
134+
135+
### Environment Setup
136+
```bash
137+
# Install dependencies (always use uv)
138+
uv sync
139+
140+
# Install development dependencies
141+
uv sync --group dev
142+
143+
# Install pre-commit hooks
144+
uv run pre-commit install
145+
```
146+
147+
### Code Quality & Linting
148+
```bash
149+
# Run linting and formatting with ruff (primary tool)
150+
uv run ruff check --fix .
151+
uv run ruff format .
152+
153+
# Type checking with mypy (strict mode)
154+
uv run mypy .
155+
156+
# Clean Python cache files
157+
uvx pyclean .
158+
```
159+
160+
### Testing
161+
```bash
162+
# Run all tests
163+
uv run pytest
164+
165+
# Run tests with coverage
166+
uv run coverage run -m pytest tests/
167+
168+
# Run specific test file
169+
uv run pytest tests/test_code_utils.py
170+
171+
# Run tests with verbose output
172+
uv run pytest -v
173+
174+
# Run benchmarks
175+
uv run pytest tests/benchmarks/
176+
177+
# Run end-to-end tests
178+
uv run pytest tests/scripts/
179+
180+
# Run with specific markers
181+
uv run pytest -m "not ci_skip"
182+
```
183+
184+
### Running CodeFlash
185+
```bash
186+
# Initialize CodeFlash in a project
187+
uv run codeflash init
188+
189+
# Optimize entire codebase
190+
uv run codeflash --all
191+
192+
# Optimize specific file
193+
uv run codeflash --file path/to/file.py
194+
195+
# Optimize specific function
196+
uv run codeflash --file path/to/file.py --function function_name
197+
198+
# Trace and optimize a workflow
199+
uv run codeflash optimize script.py
200+
201+
# Verify setup with test optimization
202+
uv run codeflash --verify-setup
203+
204+
# Run with verbose logging
205+
uv run codeflash --verbose --all
206+
207+
# Run with benchmarking enabled
208+
uv run codeflash --benchmark --file target_file.py
209+
210+
# Use replay tests for debugging
211+
uv run codeflash --replay-test tests/specific_test.py
212+
```
213+
214+
## Development Guidelines
215+
216+
### Code Style
217+
- Uses Ruff for linting and formatting (configured in pyproject.toml)
218+
- Strict mypy type checking enabled
219+
- Pre-commit hooks enforce code quality
220+
- Line length: 120 characters
221+
- Python 3.10+ syntax
222+
223+
### Testing Strategy
224+
- Primary test framework: pytest
225+
- Tests located in `tests/` directory
226+
- End-to-end tests in `tests/scripts/`
227+
- Benchmarks in `tests/benchmarks/`
228+
- Extensive use of `@pytest.mark.parametrize`
229+
- Shared fixtures in conftest.py
230+
- Test isolation via custom pytest plugin
231+
232+
### Key Dependencies
233+
- **Core**: `libcst`, `jedi`, `gitpython`, `pydantic`
234+
- **Testing**: `pytest`, `coverage`, `crosshair-tool`
235+
- **Performance**: `line_profiler`, `timeout-decorator`
236+
- **UI**: `rich`, `inquirer`, `click`
237+
- **AI**: Custom API client for LLM interactions
238+
239+
### Data Models & Types
240+
- `codeflash/models/models.py` - Pydantic models for all data structures
241+
- Extensive use of `@dataclass(frozen=True)` for immutable data
242+
- Core types: `FunctionToOptimize`, `ValidCode`, `BenchmarkKey`
243+
244+
## AI Service Integration
245+
246+
### Rate Limiting & Retries
247+
- Built-in rate limiting and exponential backoff
248+
- Handle `Either` return types for error handling
249+
- AI service endpoint: `codeflash/api/aiservice.py`
250+
251+
### Telemetry & Monitoring
252+
- **Sentry**: Error tracking with `codeflash.telemetry.sentry`
253+
- **PostHog**: Usage analytics with `codeflash.telemetry.posthog_cf`
254+
- **Environment Variables**: `CODEFLASH_EXPERIMENT_ID` for testing modes
255+
256+
## Performance & Benchmarking
257+
258+
### Line Profiler Integration
259+
- Uses `line_profiler` for detailed performance analysis
260+
- Instruments functions with `@profile` decorator
261+
- Generates before/after profiling reports
262+
- Calculates precise speedup measurements
263+
264+
### Benchmark Test Framework
265+
- Custom benchmarking in `tests/benchmarks/`
266+
- Generates replay tests from execution traces
267+
- Validates performance improvements statistically
268+
269+
## Debugging & Development
270+
271+
### Verbose Logging
272+
```bash
273+
uv run codeflash --verbose --file target_file.py
274+
```
275+
276+
### Important Environment Variables
277+
- `CODEFLASH_TEST_MODULE` - Current test module during verification
278+
- `CODEFLASH_TEST_CLASS` - Current test class during verification
279+
- `CODEFLASH_TEST_FUNCTION` - Current test function during verification
280+
- `CODEFLASH_LOOP_INDEX` - Current iteration in pytest loops
281+
- `CODEFLASH_EXPERIMENT_ID` - Enables local AI service for testing
282+
283+
### LSP Integration
284+
Language Server Protocol support in `codeflash/lsp/` enables IDE integration during optimization.
285+
286+
### Common Debugging Patterns
287+
1. Use verbose logging to trace optimization flow
288+
2. Check git worktree operations for isolation issues
289+
3. Verify deterministic test execution with environment variables
290+
4. Use replay tests to debug specific optimization scenarios
291+
5. Monitor AI service calls with rate limiting logs
292+
293+
## Best Practices
294+
295+
### Path Handling
296+
- Always use absolute paths
297+
- Handle encoding explicitly (UTF-8)
298+
- Extensive path validation and cleanup utilities in `codeflash/code_utils/`
299+
300+
### Git Operations
301+
- All optimizations run in isolated worktrees
302+
- Never modify the main repository directly
303+
- Use git utilities in `codeflash/code_utils/git_utils.py`
304+
305+
### Code Transformations
306+
- Always use libcst, never ast module
307+
- Preserve code formatting and comments
308+
- Validate transformations with deterministic tests
309+
310+
### Error Handling
311+
- Use Either pattern for functional error handling
312+
- Log errors to Sentry for monitoring
313+
- Provide clear user feedback via Rich console
314+
315+
### Performance Optimization
316+
- Profile before and after changes
317+
- Use benchmarks to validate improvements
318+
- Generate detailed performance reports
