# Performance Profiling for libCacheSim

This document explains how to use the profiling utilities in libCacheSim to analyze performance bottlenecks in cache simulations, particularly plugin caches and trace processing operations.

## Overview

The `libcachesim.profiler` module provides comprehensive profiling capabilities using Python's built-in `cProfile` along with memory and CPU monitoring. This is particularly useful for:

- Analyzing plugin cache performance bottlenecks
- Comparing different cache implementations
- Understanding where time is spent in trace processing
- Identifying memory usage patterns
- Getting detailed function-level performance data

## Quick Start

### Basic Function Profiling

```python
from libcachesim.profiler import profile_cache_operations, profile_trace_processing

# Profile cache operations (individual requests)
result = profile_cache_operations(cache, reader, num_requests=1000)
print(f"Time: {result.execution_time:.4f}s, Memory: {result.memory_peak:.2f}MB")

# Profile complete trace processing
result = profile_trace_processing(cache, reader)
print(f"Miss ratio: {result.custom_metrics['obj_miss_ratio']:.4f}")
```

### Advanced Profiling with CacheSimProfiler

```python
from libcachesim.profiler import CacheSimProfiler

# Create profiler instance
profiler = CacheSimProfiler("profiling_results")

# Profile specific operations
result = profiler.profile_plugin_cache_operations(
    cache, reader, num_requests=1000, method_name="my_plugin_cache"
)

# Generate comprehensive reports
report_file = profiler.generate_performance_report()
json_file = profiler.export_results_json()
csv_file = profiler.export_results_csv()
```

### Context Manager Profiling

```python
from libcachesim.profiler import CacheSimProfiler

profiler = CacheSimProfiler()

request_count = 0
with profiler.profile_context("custom_operation") as profile_result:
    # Your code to profile here
    for request in reader:
        cache.get(request)
        request_count += 1
        if request_count >= 1000:  # stop once enough requests have been profiled
            break

    # Add custom metrics
    profile_result.custom_metrics['requests_processed'] = request_count
```

## Understanding the Results

### ProfileResult Object

Each profiling operation returns a `ProfileResult` object with:

- `execution_time`: Total execution time in seconds
- `memory_peak`: Peak memory usage during execution (MB)
- `memory_current`: Memory change from start to end (MB)
- `cpu_percent`: CPU usage percentage
- `custom_metrics`: Dictionary of custom metrics
- `profile_stats`: Detailed cProfile statistics
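
For example, after a run you can read these fields directly. This is a minimal sketch that only touches the fields listed above; the actual values depend on your cache, trace, and machine:

```python
result = profile_trace_processing(cache, reader)

# Read the documented ProfileResult fields
print(f"wall time : {result.execution_time:.4f} s")
print(f"peak mem  : {result.memory_peak:.2f} MB")
print(f"mem delta : {result.memory_current:.2f} MB")
print(f"cpu usage : {result.cpu_percent:.1f} %")

# Custom metrics are a plain dictionary
for name, value in result.custom_metrics.items():
    print(f"{name}: {value}")
```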

### Generated Reports

The profiler generates several types of output:

1. **Text Reports** (`profile_*.txt`): Detailed cProfile output showing:
   - Top functions by cumulative time
   - Top functions by total time
   - Function call counts and timing

2. **Performance Reports** (`performance_report_*.txt`): Summary comparing all profiled methods

3. **JSON Export** (`profile_results_*.json`): Machine-readable results for further analysis

4. **CSV Export** (`profile_summary_*.csv`): Spreadsheet-compatible summary data
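
If you want to post-process the exports, the standard library is enough. The sketch below only assumes that `json_file` and `csv_file` are the paths returned by the export methods shown earlier; the exact schema of the files depends on the profiler version:

```python
import csv
import json

# json_file / csv_file are the paths returned by
# export_results_json() / export_results_csv()
with open(json_file) as f:
    results = json.load(f)
print(json.dumps(results, indent=2)[:500])  # peek at the start of the export

with open(csv_file, newline="") as f:
    for row in csv.DictReader(f):  # one dictionary per summary row
        print(row)
```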

### Example cProfile Output

```
Top Functions by Cumulative Time:
----------------------------------------
         11007 function calls in 0.285 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.002    0.002    0.285    0.285 profiler.py:194(run_cache_operations)
     1000    0.003    0.000    0.279    0.000 plugin_cache.py:118(get)
     2000    0.273    0.000    0.273    0.000 {built-in method time.sleep}
     1000    0.002    0.000    0.164    0.000 lru_cache.py:253(cache_hit)
```

This shows that `time.sleep` calls are taking 0.273 out of 0.285 total seconds (96% of execution time).
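
If you want to produce or re-sort this kind of view yourself, the standard library's `cProfile` and `pstats` modules are sufficient. In this minimal sketch, `run_cache_operations` is a hypothetical stand-in for whatever workload you care about, not a libCacheSim API:

```python
import cProfile
import io
import pstats

profile = cProfile.Profile()
profile.enable()
run_cache_operations()  # hypothetical workload; replace with your own code
profile.disable()

# Print the ten most expensive functions by cumulative time,
# the same ordering used in the report above
buffer = io.StringIO()
pstats.Stats(profile, stream=buffer).sort_stats("cumulative").print_stats(10)
print(buffer.getvalue())
```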

## Plugin Cache Performance Analysis

Plugin caches can have performance issues due to Python callback overhead. The profiler helps identify:

### Common Bottlenecks

1. **Python Callback Overhead**: Each cache operation calls Python functions
2. **Slow Hook Functions**: User-defined cache hooks that are inefficient
3. **Memory Allocation**: Excessive object creation in Python
4. **Data Structure Operations**: Inefficient cache data structure operations
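
The first two bottlenecks can be sized with a quick micro-benchmark before any cache is involved. The sketch below uses only the standard library; `lean_hook` and `wasteful_hook` are hypothetical stand-ins for hook bodies, not the actual `PluginCache` hook signatures:

```python
import timeit

def lean_hook(obj_id):
    # does almost nothing; close to pure callback overhead
    return obj_id

def wasteful_hook(obj_id):
    # allocates a throwaway dict and list on every call
    meta = {"id": obj_id, "history": [obj_id] * 8}
    return sum(meta["history"])

calls = 100_000
for fn in (lean_hook, wasteful_hook):
    secs = timeit.timeit(lambda: fn(42), number=calls)
    print(f"{fn.__name__}: {secs / calls * 1e6:.2f} us per call")
```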

### Example Analysis

```python
# Compare plugin cache vs native implementation
profiler = CacheSimProfiler()

# Profile native LRU
native_cache = LRU(cache_size=1024)
result1 = profiler.profile_trace_processing(native_cache, reader, "native_lru")

# Profile plugin LRU
plugin_cache = PluginCache(
    cache_size=1024,
    cache_init_hook=init_hook,
    cache_hit_hook=hit_hook,
    cache_miss_hook=miss_hook,
    # ... other hooks
)
result2 = profiler.profile_trace_processing(plugin_cache, reader, "plugin_lru")

# Compare results
comparison = profiler.compare_results()
print(f"Plugin overhead: {comparison['performance_ratio']:.2f}x slower")
```

## Best Practices

### For Plugin Cache Development

1. **Profile Early**: Use profiling during plugin development to catch performance issues
2. **Minimize Hook Complexity**: Keep cache hook functions as simple as possible
3. **Avoid Python Callbacks in Hot Paths**: Consider a C++ implementation for critical operations
4. **Use Efficient Data Structures**: Profile different Python data structures for cache storage

### For Performance Analysis

1. **Use Representative Workloads**: Profile with realistic trace data and cache sizes
2. **Run Multiple Iterations**: Performance can vary between runs (see the sketch after this list)
3. **Focus on Hot Functions**: Look at cumulative time to find the biggest bottlenecks
4. **Compare Implementations**: Use profiling to validate optimization efforts
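
For the "Run Multiple Iterations" tip, a simple pattern is to repeat the same profiling call and look at the spread. This sketch reuses the documented `profile_trace_processing` helper and the `execution_time` field; `make_cache()` and `make_reader()` are hypothetical factories you would supply, since a cache and a reader generally should not be reused across runs:

```python
from statistics import mean, stdev

from libcachesim.profiler import profile_trace_processing

times = []
for _ in range(5):
    cache = make_cache()    # hypothetical factory: build a fresh cache per run
    reader = make_reader()  # hypothetical factory: reopen the trace per run
    result = profile_trace_processing(cache, reader)
    times.append(result.execution_time)

print(f"mean {mean(times):.4f} s, stdev {stdev(times):.4f} s over {len(times)} runs")
```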

### Memory Profiling

```python
# Track memory usage patterns
profiler = CacheSimProfiler()

with profiler.profile_context("memory_analysis") as result:
    # Code that might have memory issues
    large_cache = PluginCache(cache_size=100000, ...)  # ... = remaining hook arguments
    result.custom_metrics['cache_size'] = large_cache.cache_size

print(f"Memory peak: {result.memory_peak:.2f}MB")
```

## Example: Finding Plugin Cache Bottlenecks

The provided `examples/profiling_analysis.py` demonstrates:

1. Creating mock plugin caches with artificial delays
2. Comparing native vs plugin implementations
3. Identifying specific bottlenecks in cProfile output
4. Generating comprehensive performance reports

Run it with:
```bash
cd libCacheSim-python
python examples/profiling_analysis.py
```

This shows a 397x performance difference between fast and slow plugin implementations, with detailed function-level analysis showing exactly where time is spent.

## Integration with libCacheSim

When the full libCacheSim package is built, the profiler integrates seamlessly:

```python
import libcachesim as lcs
from libcachesim.profiler import CacheSimProfiler

# Create actual cache and reader
cache = lcs.S3FIFO(cache_size=1024*1024)
reader = lcs.TraceReader(trace="path/to/trace", trace_type=lcs.TraceType.ORACLE_GENERAL_TRACE)

# Profile real operations
profiler = CacheSimProfiler()
result = profiler.profile_trace_processing(cache, reader)
```

## Troubleshooting

### Common Issues

1. **ModuleNotFoundError for psutil**: Install with `pip install psutil`
2. **Empty cProfile output**: Ensure the profiled code actually runs and takes a measurable amount of time
3. **High memory usage**: Large traces or cache sizes can use significant memory during profiling

### Performance Tips

- For large traces, use the `num_requests` parameter to profile a subset of requests (see the sketch after this list)
- Clear profiler results with `profiler.clear_results()` between different test runs
- Use the context manager for fine-grained profiling of specific code sections
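
A short sketch that combines these tips, reusing `profile_plugin_cache_operations` and `clear_results()` from earlier in this document (the request count here is an arbitrary example):

```python
from libcachesim.profiler import CacheSimProfiler

profiler = CacheSimProfiler()

# Profile only the first 10,000 requests of a large trace
result = profiler.profile_plugin_cache_operations(
    cache, reader, num_requests=10_000, method_name="subset_run"
)
print(f"subset run took {result.execution_time:.4f} s")

# Start the next experiment from a clean slate
profiler.clear_results()
```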

This profiling system provides the cProfile results requested in the issue and enables comprehensive performance analysis of plugin caches and trace processing operations.