feat: implement LRU cache and sophisticated token counting (issues #4 and #5) #127
base: master
Conversation
…and #5)

- add LruCache class with TTL support and statistics tracking
- add TokenCounter class with Google AI API integration
- implement content-type aware token estimation (code/json/markdown/text)
- integrate LRU caching for token counts (200 entries, 30min TTL)
- add automatic eviction and periodic cleanup for cache
- initialize global TokenCounter singleton with API key from environment

Implements issue #4: sophisticated token counting beyond character/4
Implements issue #5: LRU cache for expensive operations

🤖 Generated with Claude Code
Co-authored-by: Claude <noreply@anthropic.com>
Note: CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.
Walkthrough: Adds LRU caching and token-counting infrastructure to the PowerShell orchestrator: new classes LruCacheEntry, LruCache, and TokenCounter; deterministic cache-key generation; TTL and eviction handling; API-based token counting with estimation fallback; and a singleton TokenCounter initialized from the environment.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Caller
    participant TokenCounter
    participant Cache as LruCache
    participant API as Google API
    participant Estimator
    Caller->>TokenCounter: CountTokens(text, contentType)
    TokenCounter->>TokenCounter: Generate deterministic key
    TokenCounter->>Cache: Get(key)
    alt Cache hit
        Cache-->>TokenCounter: cached count
        TokenCounter->>TokenCounter: Increment CacheHitCount
        TokenCounter-->>Caller: Return count
    else Cache miss
        TokenCounter->>TokenCounter: Increment MissCount
        alt API key present
            TokenCounter->>API: CountTokensViaAPI(text)
            alt API success
                API-->>TokenCounter: token count
                TokenCounter->>TokenCounter: Increment ApiCallCount
            else API failure
                API-->>TokenCounter: Error
                TokenCounter->>Estimator: EstimateTokens(text, contentType)
            end
        else No API key
            TokenCounter->>Estimator: EstimateTokens(text, contentType)
            TokenCounter->>TokenCounter: Increment EstimationCount
        end
        TokenCounter->>Cache: Set(key, count)
        TokenCounter-->>Caller: Return count
    end
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
Pre-merge checks: ❌ Failed checks (2 warnings), ✅ Passed checks (2 passed)
Performance Benchmark Results
Pull Request Overview
This PR implements an LRU (Least Recently Used) cache and token counting functionality to optimize token usage in the token-optimizer-orchestrator. The changes address Issues #4 and #5 by adding intelligent caching with TTL (Time To Live) support and API-backed token counting with fallback to estimation.
- Added LruCache and LruCacheEntry classes with eviction, TTL expiration, and statistics tracking
- Implemented TokenCounter class with Google AI API integration, LRU caching, and content-type-aware estimation
- Initialized a global singleton TokenCounter instance for use throughout the script
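The review does not show the PR's actual estimation multipliers, but content-type-aware estimation typically means applying a different characters-per-token ratio per type. A minimal Python sketch under that assumption (the ratio values here are illustrative, not taken from the PR):

```python
# Hypothetical characters-per-token ratios; the PR's actual multipliers
# are not shown in this review, so these values are assumptions.
CHARS_PER_TOKEN = {
    "code": 3.0,       # code tends to tokenize more densely than prose
    "json": 2.5,       # punctuation-heavy structures
    "markdown": 3.5,
    "text": 4.0,       # the classic chars/4 heuristic
}

def estimate_tokens(text: str, content_type: str) -> int:
    """Content-type-aware fallback used when the API is unavailable."""
    ratio = CHARS_PER_TOKEN.get(content_type, 4.0)
    return max(1, round(len(text) / ratio))
```

This is strictly a fallback: when the countTokens API succeeds, its exact count is preferred over any estimate.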
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
hooks/handlers/token-optimizer-orchestrator.ps1 (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: Integration Tests
- GitHub Check: Test (Node 20)
- GitHub Check: Test (Node 18)
- add type guards to prevent class re-definition errors (CRITICAL)
- fix SHA256 resource disposal with proper try/finally
- replace write-log with write-host to fix ordering issue
- fix double-counting in GetStats totalCalls calculation
- make model name configurable via GOOGLE_AI_MODEL env var
- improve API error handling for timeout/network errors
- fix DetectContentType regex for exact matching

Addresses feedback from GitHub Copilot and CodeRabbit reviews.
Commit Message Format Issue: Your commit messages don't follow the Conventional Commits specification. Please amend your commit messages to follow the required format. Learn more: Conventional Commits
Performance Benchmark Results
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (6)
hooks/handlers/token-optimizer-orchestrator.ps1 (6)
382-387: Dispose SHA256 hasher to prevent resource leak. The SHA256 hasher created at line 383 is never disposed. This can lead to resource leaks over time.
Apply this diff to ensure proper disposal:

```diff
 # Hash large content instead of embedding (prevents unique keys for every variation)
 if ($value -is [string] -and $value.Length -gt 1000) {
     $hasher = [System.Security.Cryptography.SHA256]::Create()
-    $bytes = [System.Text.Encoding]::UTF8.GetBytes($value)
-    $hashBytes = $hasher.ComputeHash($bytes)
-    $value = $script:HASH_PREFIX + [Convert]::ToBase64String($hashBytes).Substring(0, $script:HASH_LENGTH)
+    try {
+        $bytes = [System.Text.Encoding]::UTF8.GetBytes($value)
+        $hashBytes = $hasher.ComputeHash($bytes)
+        $value = $script:HASH_PREFIX + [Convert]::ToBase64String($hashBytes).Substring(0, $script:HASH_LENGTH)
+    } finally {
+        $hasher.Dispose()
+    }
 }
```
395-399: Dispose SHA256 hasher to prevent resource leak. The SHA256 hasher created at line 396 is never disposed. This can lead to resource leaks over time.
Apply this diff to ensure proper disposal:

```diff
 # Hash the entire key for fixed length (prevents extremely long keys)
 $hasher = [System.Security.Cryptography.SHA256]::Create()
-$keyBytes = [System.Text.Encoding]::UTF8.GetBytes($json)
-$hashBytes = $hasher.ComputeHash($keyBytes)
-return [Convert]::ToBase64String($hashBytes).Substring(0, $script:HASH_LENGTH)
+try {
+    $keyBytes = [System.Text.Encoding]::UTF8.GetBytes($json)
+    $hashBytes = $hasher.ComputeHash($keyBytes)
+    return [Convert]::ToBase64String($hashBytes).Substring(0, $script:HASH_LENGTH)
+} finally {
+    $hasher.Dispose()
+}
 }
```
980-982: Dispose SHA256 hasher to prevent resource leak. The SHA256 hasher created at line 980 is never disposed.
Apply this diff:

```diff
 # PHASE 2 FIX: Use content hash instead of timestamp for cache key
 $hasher = [System.Security.Cryptography.SHA256]::Create()
-$hashBytes = $hasher.ComputeHash([System.Text.Encoding]::UTF8.GetBytes($userPrompt))
-$contentHash = [Convert]::ToBase64String($hashBytes).Substring(0, 16)
+try {
+    $hashBytes = $hasher.ComputeHash([System.Text.Encoding]::UTF8.GetBytes($userPrompt))
+    $contentHash = [Convert]::ToBase64String($hashBytes).Substring(0, 16)
+} finally {
+    $hasher.Dispose()
+}
```
1827-1829: Dispose SHA256 hasher to prevent resource leak. The SHA256 hasher created at line 1827 is never disposed.
Apply this diff:

```diff
 # PHASE 2 FIX: Use content hash instead of timestamp for cache key
 $hasher = [System.Security.Cryptography.SHA256]::Create()
-$hashBytes = $hasher.ComputeHash([System.Text.Encoding]::UTF8.GetBytes($argsJson))
-$contentHash = [Convert]::ToBase64String($hashBytes).Substring(0, 16)
+try {
+    $hashBytes = $hasher.ComputeHash([System.Text.Encoding]::UTF8.GetBytes($argsJson))
+    $contentHash = [Convert]::ToBase64String($hashBytes).Substring(0, 16)
+} finally {
+    $hasher.Dispose()
+}
```
1964-1966: Dispose SHA256 hasher to prevent resource leak. The SHA256 hasher created at line 1964 is never disposed.
Apply this diff:

```diff
 # PHASE 2 FIX: Use content hash instead of timestamp for cache key
 $hasher = [System.Security.Cryptography.SHA256]::Create()
-$hashBytes = $hasher.ComputeHash([System.Text.Encoding]::UTF8.GetBytes($outputText))
-$contentHash = [Convert]::ToBase64String($hashBytes).Substring(0, 16)
+try {
+    $hashBytes = $hasher.ComputeHash([System.Text.Encoding]::UTF8.GetBytes($outputText))
+    $contentHash = [Convert]::ToBase64String($hashBytes).Substring(0, 16)
+} finally {
+    $hasher.Dispose()
+}
```
2101-2103: Dispose SHA256 hasher to prevent resource leak. The SHA256 hasher created at line 2101 is never disposed.
Apply this diff:

```diff
 # PHASE 2 FIX: Use content hash instead of timestamp for cache key
 $hasher = [System.Security.Cryptography.SHA256]::Create()
-$hashBytes = $hasher.ComputeHash([System.Text.Encoding]::UTF8.GetBytes($contextText))
-$contentHash = [Convert]::ToBase64String($hashBytes).Substring(0, 16)
+try {
+    $hashBytes = $hasher.ComputeHash([System.Text.Encoding]::UTF8.GetBytes($contextText))
+    $contentHash = [Convert]::ToBase64String($hashBytes).Substring(0, 16)
+} finally {
+    $hasher.Dispose()
+}
```
♻️ Duplicate comments (1)
hooks/handlers/token-optimizer-orchestrator.ps1 (1)
329-340: totalCalls calculation appears correct. Past review comments flagged that `$totalCalls` double-counts cache hits, but the current implementation at line 331 correctly calculates `$totalCalls = $this.ApiCallCount + $this.EstimationCount`, excluding `CacheHitCount`. Cache hits are tracked separately and should not be counted as API calls or estimations. The logic is correct.
🧹 Nitpick comments (1)
hooks/handlers/token-optimizer-orchestrator.ps1 (1)
169-190: Consider distinguishing TTL expiration from capacity eviction in statistics. The `CleanupExpired` method increments `EvictionCount` for TTL-based removals (line 188). This conflates two different cache behaviors: capacity-based eviction (line 133) and time-based expiration. Consider tracking these separately for clearer analytics, or document that `EvictionCount` includes both types.
If you want to track them separately, add a property `[int]$ExpirationCount` and update the logic:

```diff
+ [int]$ExpirationCount = 0
  ...
  [hashtable] GetStats() {
      $totalRequests = $this.HitCount + $this.MissCount
      return @{
          Size = $this.Cache.Count
          MaxSize = $this.MaxSize
          HitCount = $this.HitCount
          MissCount = $this.MissCount
          EvictionCount = $this.EvictionCount
+         ExpirationCount = $this.ExpirationCount
          HitRate = if ($totalRequests -gt 0) { [Math]::Round(($this.HitCount / $totalRequests) * 100, 2) } else { 0 }
      }
  }
  ...
  [int] CleanupExpired() {
      ...
      foreach ($key in $keysToRemove) {
          $this.Cache.Remove($key)
          $removed++
      }
-     $this.EvictionCount += $removed
+     $this.ExpirationCount += $removed
      return $removed
  }
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
hooks/handlers/token-optimizer-orchestrator.ps1 (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: Test (Node 18)
- GitHub Check: Test (Node 20)
🔇 Additional comments (3)
hooks/handlers/token-optimizer-orchestrator.ps1 (3)
61-75: LGTM! Class guard and LruCacheEntry implementation look correct. The type-checking guard using `-as [type]` properly prevents class re-definition errors on subsequent script loads. The LruCacheEntry class is a simple, correct wrapper for cache entries with timestamps.
344-352: Singleton initialization looks good. The singleton pattern with environment variable support is correctly implemented. The model name is now configurable via the `GOOGLE_AI_MODEL` environment variable (addressing a past comment), and a warning is logged when the API key is missing.
214-252: Cache key design is correct; no changes needed. The cache key format (`"${contentType}:${textHash}"`) is both deterministic (SHA-256 produces consistent hashes) and intentionally designed. Including the content type is necessary because different content types require different token estimation multipliers (as applied in `EstimateTokens`), so caching the same text separately per content type prevents returning incorrect estimates.
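A minimal Python sketch of this key scheme, for intuition (the 16-character truncation mirrors the Base64 hash truncation shown elsewhere in this review, but is an assumption here):

```python
import base64
import hashlib

def make_cache_key(content_type: str, text: str) -> str:
    # SHA-256 is deterministic: identical inputs always map to the same key,
    # and prefixing the content type keeps per-type estimates separate.
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    text_hash = base64.b64encode(digest).decode("ascii")[:16]
    return f"{content_type}:{text_hash}"
```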
```powershell
$value = $entry.Value
$this.Cache.Remove($key)
$this.Cache[$key] = [LruCacheEntry]::new($value)
```
Preserve original timestamp when moving entry to end.
When promoting an entry to the most-recently-used position, creating a new LruCacheEntry resets the timestamp to the current time. This incorrectly extends the TTL on every access, meaning cached entries will never expire if accessed frequently enough.
Apply this diff to preserve the original timestamp:
```diff
 # Move to end (most recently used) by removing and re-adding
 $value = $entry.Value
+$originalTimestamp = $entry.Timestamp
 $this.Cache.Remove($key)
-$this.Cache[$key] = [LruCacheEntry]::new($value)
+$newEntry = [LruCacheEntry]::new($value)
+$newEntry.Timestamp = $originalTimestamp
+$this.Cache[$key] = $newEntry
```
🤖 Prompt for AI Agents
In hooks/handlers/token-optimizer-orchestrator.ps1 around lines 113 to 115, the
code creates a new LruCacheEntry when moving an item to the MRU position which
resets its timestamp and unintentionally extends TTL on access; instead preserve
the original timestamp by either moving the existing entry object without
reinitializing it or, if a new LruCacheEntry must be created, copy the original
entry's timestamp into the new object (e.g., capture $entry.Timestamp and pass
or set it on the new instance) so accesses do not refresh expiry.
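The fix being requested can be sketched in Python using an ordered map, where promotion reorders the entry without touching its stored timestamp (entries here are `(value, timestamp)` tuples; this is an illustrative analogue, not the PR's PowerShell code):

```python
import time
from collections import OrderedDict

def lru_get(cache: OrderedDict, key, ttl_seconds):
    """Promote an entry to most-recently-used without refreshing its TTL."""
    entry = cache.get(key)
    if entry is None:
        return None
    value, ts = entry
    if ttl_seconds > 0 and time.monotonic() - ts > ttl_seconds:
        del cache[key]          # expired: drop the entry and report a miss
        return None
    cache.move_to_end(key)      # reorder only; the stored timestamp is untouched
    return value
```

Because the timestamp survives promotion, a hot entry still expires once its original TTL elapses, which is the behavior the review comment asks for.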
```powershell
[bool] ContainsKey([string]$key) {
    return $null -ne $this.Get($key)
}
```
ContainsKey has unexpected side effects.
Calling Get() from within ContainsKey() increments hit/miss counters and repositions the entry in the LRU order. This is unexpected behavior for a "check if key exists" method, which should be read-only.
Apply this diff to implement a side-effect-free check:
```diff
 # Check if key exists and is not expired
 [bool] ContainsKey([string]$key) {
-    return $null -ne $this.Get($key)
+    if (-not $this.Cache.Contains($key)) {
+        return $false
+    }
+    # Check TTL without updating counters or position
+    $entry = $this.Cache[$key]
+    if ($this.TtlSeconds -gt 0) {
+        $age = ((Get-Date) - $entry.Timestamp).TotalSeconds
+        if ($age -gt $this.TtlSeconds) {
+            return $false
+        }
+    }
+    return $true
 }
```
🤖 Prompt for AI Agents
In hooks/handlers/token-optimizer-orchestrator.ps1 around lines 141 to 143,
ContainsKey currently calls Get which increments hit/miss counters and mutates
LRU state; replace this by inspecting the cache's internal storage directly
(e.g. the backing hashtable/dictionary or entries collection) to determine
presence without calling Get, avoid updating hit/miss counters or touching LRU
position, and ensure you still respect entry expiry by checking the stored
entry's expiry timestamp without performing any state mutations.
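The same read-only semantics can be sketched in Python: the check inspects the stored entry directly, respects expiry, and mutates nothing (entries are `(value, timestamp)` tuples, matching no particular implementation in the PR):

```python
import time

def contains_key(cache: dict, key, ttl_seconds) -> bool:
    """Read-only existence check: no hit/miss counters, no LRU reordering."""
    entry = cache.get(key)
    if entry is None:
        return False
    _, ts = entry
    if ttl_seconds > 0 and time.monotonic() - ts > ttl_seconds:
        return False    # expired entries report absent, but nothing is mutated
    return True
```

Note the deliberate asymmetry with `Get`: a stale entry makes `contains_key` return false without deleting it, leaving cleanup to the eviction/cleanup paths so that a pure query never changes cache state.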
```powershell
[int] CountTokensViaAPI([string]$text) {
    $requestBody = @{
        contents = @(
            @{
                parts = @(
                    @{
                        text = $text
                    }
                )
            }
        )
    } | ConvertTo-Json -Depth 10 -Compress

    $uri = "https://generativelanguage.googleapis.com/v1beta/models/$($this.Model):countTokens?key=$($this.ApiKey)"

    try {
        $response = Invoke-RestMethod -Uri $uri -Method POST -ContentType "application/json" -Body $requestBody -TimeoutSec 5
    } catch {
        $ex = $_.Exception
        if ($ex -is [System.Net.WebException]) {
            if ($ex.Status -eq [System.Net.WebExceptionStatus]::Timeout) {
                throw "Token counting API timeout after 5 seconds"
            } elseif ($ex.Status -eq [System.Net.WebExceptionStatus]::ConnectFailure) {
                throw "Token counting API network error (connect failure)"
            } else {
                throw "Token counting API network error: $($ex.Status)"
            }
        } else {
            throw
        }
    }

    return $response.totalTokens
}
```
Add null check for API response structure.
Line 287 returns $response.totalTokens without verifying that the response contains this property. If the API returns an unexpected format or error response, this will fail with a property access error.
Apply this diff to add defensive null checking:
```diff
+    if (-not $response -or -not $response.PSObject.Properties['totalTokens']) {
+        throw "API response missing 'totalTokens' property"
+    }
     return $response.totalTokens
```
🤖 Prompt for AI Agents
In hooks/handlers/token-optimizer-orchestrator.ps1 around lines 255 to 288, the
function returns $response.totalTokens without validating the API response; add
defensive null checks after the Invoke-RestMethod call to ensure $response is
not $null and that $response.totalTokens exists and is an integer (or numeric);
if the check fails, throw a clear, descriptive exception (include a short
serialization of $response for debugging) so callers get a useful error instead
of a property-access failure.
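The validation being requested is straightforward to express in any language; a Python analogue of the suggested check (function name and error wording are illustrative, not the PR's code):

```python
def extract_total_tokens(response) -> int:
    """Validate a countTokens-style response before reading totalTokens,
    raising a clear error instead of a property-access failure."""
    if not isinstance(response, dict) or "totalTokens" not in response:
        raise ValueError(f"API response missing 'totalTokens': {response!r}")
    total = response["totalTokens"]
    if not isinstance(total, int) or isinstance(total, bool):
        raise ValueError(f"'totalTokens' is not an integer: {total!r}")
    return total
```

Including a short serialization of the unexpected response in the error message, as the prompt suggests, makes API-format regressions much easier to diagnose from logs.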
Summary
Implements two critical token optimization features:
Issue #4 - Sophisticated Token Counting:
- generativelanguage.googleapis.com/v1beta/models/{model}:countTokens
- $script:TokenCounter

Issue #5 - LRU Cache for Expensive Operations:
Changes
- LruCacheEntry class (lines 61-70)
- LruCache class with full LRU implementation (lines 72-187)
- TokenCounter class with Google AI API integration (lines 189-315)
- $script:TokenCounter singleton
- hooks/handlers/token-optimizer-orchestrator.ps1

Test Plan
- GOOGLE_AI_API_KEY

Expected Impact
Closes #4
Closes #5
🤖 Generated with Claude Code