293 changes: 293 additions & 0 deletions hooks/handlers/token-optimizer-orchestrator.ps1
@@ -58,6 +58,299 @@ $OPTIMIZATION_QUALITY = 11 # Maximum compression quality
$HASH_PREFIX = "hash:"
$HASH_LENGTH = 32

# =============================================================================
# LRU CACHE CLASSES (Issue #5)
# =============================================================================
# Guard against class re-definition on subsequent script loads
if (-not ('LruCacheEntry' -as [type])) {
class LruCacheEntry {
[object]$Value
[datetime]$Timestamp

LruCacheEntry([object]$value) {
$this.Value = $value
$this.Timestamp = Get-Date
}
}
}

if (-not ('LruCache' -as [type])) {
class LruCache {
[System.Collections.Specialized.OrderedDictionary]$Cache
[int]$MaxSize
[int]$TtlSeconds
[int]$HitCount = 0
[int]$MissCount = 0
[int]$EvictionCount = 0

LruCache([int]$maxSize, [int]$ttlSeconds) {
$this.Cache = [System.Collections.Specialized.OrderedDictionary]::new()
$this.MaxSize = $maxSize
$this.TtlSeconds = $ttlSeconds
}

# Get value from cache (returns $null if not found or expired)
[object] Get([string]$key) {
if (-not $this.Cache.Contains($key)) {
$this.MissCount++
return $null
}

$entry = $this.Cache[$key]

# Check TTL expiration
if ($this.TtlSeconds -gt 0) {
$age = ((Get-Date) - $entry.Timestamp).TotalSeconds
if ($age -gt $this.TtlSeconds) {
$this.Cache.Remove($key)
$this.MissCount++
$this.EvictionCount++
return $null
}
}

# Move to end (most recently used) by removing and re-adding
$value = $entry.Value
$this.Cache.Remove($key)
$this.Cache[$key] = [LruCacheEntry]::new($value)
Comment on lines +113 to +115

⚠️ Potential issue | 🟠 Major

Preserve original timestamp when moving entry to end.

When promoting an entry to the most-recently-used position, creating a new LruCacheEntry resets the timestamp to the current time. This incorrectly extends the TTL on every access, meaning cached entries will never expire if accessed frequently enough.

Apply this diff to preserve the original timestamp:

         # Move to end (most recently used) by removing and re-adding
         $value = $entry.Value
+        $originalTimestamp = $entry.Timestamp
         $this.Cache.Remove($key)
-        $this.Cache[$key] = [LruCacheEntry]::new($value)
+        $newEntry = [LruCacheEntry]::new($value)
+        $newEntry.Timestamp = $originalTimestamp
+        $this.Cache[$key] = $newEntry
🤖 Prompt for AI Agents
In hooks/handlers/token-optimizer-orchestrator.ps1 around lines 113 to 115, the
code creates a new LruCacheEntry when moving an item to the MRU position which
resets its timestamp and unintentionally extends TTL on access; instead preserve
the original timestamp by either moving the existing entry object without
reinitializing it or, if a new LruCacheEntry must be created, copy the original
entry's timestamp into the new object (e.g., capture $entry.Timestamp and pass
or set it on the new instance) so accesses do not refresh expiry.
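
The first option mentioned above, re-inserting the existing entry object instead of constructing a new one, can be sketched in isolation. This is a minimal illustration, not the PR's code: `Move-ToMostRecent` and the sample entries are hypothetical, and plain `pscustomobject` values stand in for `LruCacheEntry`.

```powershell
# Hypothetical standalone sketch; the function and sample data are illustrative only.
$cache = [System.Collections.Specialized.OrderedDictionary]::new()
$cache['a'] = [pscustomobject]@{ Value = 1; Timestamp = (Get-Date).AddSeconds(-100) }
$cache['b'] = [pscustomobject]@{ Value = 2; Timestamp = Get-Date }

function Move-ToMostRecent {
    param(
        [System.Collections.Specialized.OrderedDictionary]$Cache,
        [string]$Key
    )
    $entry = $Cache[$Key]   # keep the same object, so its Timestamp is untouched
    $Cache.Remove($Key)
    $Cache[$Key] = $entry   # assigning a missing key appends it at the end (MRU slot)
}

$before = $cache['a'].Timestamp
Move-ToMostRecent -Cache $cache -Key 'a'
$orderedKeys = @($cache.Keys)   # 'a' is now last; its Timestamp still equals $before
```

Because the entry object is reused, the TTL keeps counting from the original insertion time while the LRU order still reflects the access.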


$this.HitCount++
return $value
}

# Set value in cache
[void] Set([string]$key, [object]$value) {
# Remove if already exists (to re-insert at end)
if ($this.Cache.Contains($key)) {
$this.Cache.Remove($key)
}

# Evict least recently used if at capacity
if ($this.Cache.Count -ge $this.MaxSize) {
# First key is least recently used (OrderedDictionary maintains insertion order)
$firstKey = @($this.Cache.Keys)[0]
$this.Cache.Remove($firstKey)
$this.EvictionCount++
}

# Insert at end (most recently used)
$this.Cache[$key] = [LruCacheEntry]::new($value)
}

# Check if key exists and is not expired
[bool] ContainsKey([string]$key) {
return $null -ne $this.Get($key)
}
Comment on lines +141 to +143

⚠️ Potential issue | 🟡 Minor

ContainsKey has unexpected side effects.

Calling Get() from within ContainsKey() increments hit/miss counters and repositions the entry in the LRU order. This is unexpected behavior for a "check if key exists" method, which should be read-only.

Apply this diff to implement a side-effect-free check:

     # Check if key exists and is not expired
     [bool] ContainsKey([string]$key) {
-        return $null -ne $this.Get($key)
+        if (-not $this.Cache.Contains($key)) {
+            return $false
+        }
+        # Check TTL without updating counters or position
+        $entry = $this.Cache[$key]
+        if ($this.TtlSeconds -gt 0) {
+            $age = ((Get-Date) - $entry.Timestamp).TotalSeconds
+            if ($age -gt $this.TtlSeconds) {
+                return $false
+            }
+        }
+        return $true
     }
🤖 Prompt for AI Agents
In hooks/handlers/token-optimizer-orchestrator.ps1 around lines 141 to 143,
ContainsKey currently calls Get which increments hit/miss counters and mutates
LRU state; replace this by inspecting the cache's internal storage directly
(e.g. the backing hashtable/dictionary or entries collection) to determine
presence without calling Get, avoid updating hit/miss counters or touching LRU
position, and ensure you still respect entry expiry by checking the stored
entry's expiry timestamp without performing any state mutations.
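
One way to keep the expiry rule identical across `Get`, `ContainsKey`, and `CleanupExpired` is to isolate it in a pure function that reads a timestamp and mutates nothing. The sketch below assumes a hypothetical helper named `Test-EntryFresh`; the `-Now` parameter is an addition for testability and does not exist in the PR.

```powershell
# Hypothetical helper, not part of the PR: a pure freshness test with no side effects.
function Test-EntryFresh {
    param(
        [datetime]$Timestamp,
        [int]$TtlSeconds,
        [datetime]$Now = (Get-Date)
    )
    # Mirrors the PR's rule: a TTL of 0 or less disables expiry entirely.
    if ($TtlSeconds -le 0) { return $true }
    # An entry is expired only when its age is strictly greater than the TTL.
    return (($Now - $Timestamp).TotalSeconds -le $TtlSeconds)
}

$now = Get-Date
$freshResult   = Test-EntryFresh -Timestamp $now.AddSeconds(-10)   -TtlSeconds 1800 -Now $now
$expiredResult = Test-EntryFresh -Timestamp $now.AddSeconds(-1801) -TtlSeconds 1800 -Now $now
$noTtlResult   = Test-EntryFresh -Timestamp $now.AddDays(-1)       -TtlSeconds 0    -Now $now
```

A check built this way touches neither the hit/miss counters nor the dictionary order, which is the property the review comment asks for.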


# Clear all entries
[void] Clear() {
$this.Cache.Clear()
$this.HitCount = 0
$this.MissCount = 0
$this.EvictionCount = 0
}

# Get cache statistics
[hashtable] GetStats() {
$totalRequests = $this.HitCount + $this.MissCount
return @{
Size = $this.Cache.Count
MaxSize = $this.MaxSize
HitCount = $this.HitCount
MissCount = $this.MissCount
EvictionCount = $this.EvictionCount
HitRate = if ($totalRequests -gt 0) {
[Math]::Round(($this.HitCount / $totalRequests) * 100, 2)
} else { 0 }
}
}

# Cleanup expired entries (call periodically)
[int] CleanupExpired() {
if ($this.TtlSeconds -le 0) { return 0 }

$removed = 0
$keysToRemove = @()

foreach ($key in $this.Cache.Keys) {
$entry = $this.Cache[$key]
$age = ((Get-Date) - $entry.Timestamp).TotalSeconds
if ($age -gt $this.TtlSeconds) {
$keysToRemove += $key
}
}

foreach ($key in $keysToRemove) {
$this.Cache.Remove($key)
$removed++
}

$this.EvictionCount += $removed
return $removed
}
}
}

# =============================================================================
# TOKEN COUNTER CLASS (Issue #4)
# =============================================================================
if (-not ('TokenCounter' -as [type])) {
class TokenCounter {
[string]$ApiKey
[string]$Model
[LruCache]$Cache
[int]$ApiCallCount = 0
[int]$CacheHitCount = 0
[int]$EstimationCount = 0

TokenCounter([string]$apiKey, [string]$model) {
$this.ApiKey = $apiKey
$this.Model = $model
# Use LRU cache: Max 200 entries, TTL 30 minutes (1800 seconds)
$this.Cache = [LruCache]::new(200, 1800)
}

# Primary method: try API first, fall back to estimation
[int] CountTokens([string]$text, [string]$contentType) {
# Check cache first (using content hash as key with proper disposal)
$sha256 = [System.Security.Cryptography.SHA256]::Create()
try {
$textHash = [System.BitConverter]::ToString(
$sha256.ComputeHash(
[System.Text.Encoding]::UTF8.GetBytes($text)
)
).Replace("-", "")
} finally {
$sha256.Dispose()
}
$cacheKey = "${contentType}:${textHash}"

$cached = $this.Cache.Get($cacheKey)
if ($null -ne $cached) {
$this.CacheHitCount++
return $cached
}

# Try API call if key is available
if ($this.ApiKey) {
try {
$tokenCount = $this.CountTokensViaAPI($text)
$this.ApiCallCount++
$this.Cache.Set($cacheKey, $tokenCount)
return $tokenCount
} catch {
# API failed, fall back to estimation (use Write-Host since Write-Log defined later)
Write-Host "WARN: Token counting API failed: $($_.Exception.Message), falling back to estimation" -ForegroundColor Yellow
}
}

# Fallback to improved estimation
$estimated = $this.EstimateTokens($text, $contentType)
$this.EstimationCount++
$this.Cache.Set($cacheKey, $estimated)
return $estimated
}

# Google AI API integration
[int] CountTokensViaAPI([string]$text) {
$requestBody = @{
contents = @(
@{
parts = @(
@{
text = $text
}
)
}
)
} | ConvertTo-Json -Depth 10 -Compress

$uri = "https://generativelanguage.googleapis.com/v1beta/models/$($this.Model):countTokens?key=$($this.ApiKey)"

try {
$response = Invoke-RestMethod -Uri $uri -Method POST -ContentType "application/json" -Body $requestBody -TimeoutSec 5
} catch {
$ex = $_.Exception
if ($ex -is [System.Net.WebException]) {
if ($ex.Status -eq [System.Net.WebExceptionStatus]::Timeout) {
throw "Token counting API timeout after 5 seconds"
} elseif ($ex.Status -eq [System.Net.WebExceptionStatus]::ConnectFailure) {
throw "Token counting API network error (connect failure)"
} else {
throw "Token counting API network error: $($ex.Status)"
}
} else {
throw
}
}

return $response.totalTokens
}
Comment on lines +255 to +288

⚠️ Potential issue | 🟡 Minor

Add null check for API response structure.

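Line 287 returns $response.totalTokens without verifying that the response contains this property. If the API returns an unexpected format, the outcome depends on the session: under Set-StrictMode the access throws a property-not-found error, and otherwise it evaluates to $null, which the [int] return type converts to 0 and which CountTokens then caches as if it were a valid count.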

Apply this diff to add defensive null checking:

+        if (-not $response -or -not $response.PSObject.Properties['totalTokens']) {
+            throw "API response missing 'totalTokens' property"
+        }
         return $response.totalTokens
🤖 Prompt for AI Agents
In hooks/handlers/token-optimizer-orchestrator.ps1 around lines 255 to 288, the
function returns $response.totalTokens without validating the API response; add
defensive null checks after the Invoke-RestMethod call to ensure $response is
not $null and that $response.totalTokens exists and is an integer (or numeric);
if the check fails, throw a clear, descriptive exception (include a short
serialization of $response for debugging) so callers get a useful error instead
of a property-access failure.
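
The validation described above can be sketched as a small standalone function. `Get-TotalTokens` is a hypothetical name, and the sample payload is invented for illustration rather than captured from the API. The integer check accepts both `[int]` and `[long]` because `ConvertFrom-Json` may produce either, depending on the PowerShell version.

```powershell
# Hypothetical helper, not part of the PR: validate the response before using it.
function Get-TotalTokens {
    param([object]$Response)

    if ($null -eq $Response) {
        throw "Token counting API returned an empty response"
    }

    # Look the property up by name; this yields $null when it is absent.
    $prop = $Response.PSObject.Properties['totalTokens']
    if ($null -eq $prop -or ($prop.Value -isnot [int] -and $prop.Value -isnot [long])) {
        # Include a short serialization of the response to aid debugging.
        $snippet = $Response | ConvertTo-Json -Compress -Depth 3
        throw "Token counting API response has no integer 'totalTokens': $snippet"
    }

    return [int]$prop.Value
}

# Invented sample payload, parsed the same way Invoke-RestMethod would parse JSON.
$good  = '{"totalTokens": 42}' | ConvertFrom-Json
$count = Get-TotalTokens -Response $good
```

Throwing here lets the existing `catch` in `CountTokens` fall back to estimation instead of caching a bogus value.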


# Improved estimation with content-type awareness
[int] EstimateTokens([string]$text, [string]$contentType) {
$baseRatio = [Math]::Ceiling($text.Length / 4.0)

switch ($contentType) {
"code" {
# Code has more tokens per character due to symbols/keywords
return [Math]::Ceiling($baseRatio * 1.2)
}
"json" {
# JSON structures add token overhead for delimiters
return [Math]::Ceiling($baseRatio * 1.15)
}
"markdown" {
# Markdown formatting adds token overhead
return [Math]::Ceiling($baseRatio * 1.1)
}
"text" {
# Plain text is slightly less than base ratio
return [Math]::Ceiling($baseRatio * 0.95)
}
default {
return $baseRatio
}
}
}

# Content type detection based on file extension or tool name
[string] DetectContentType([string]$identifier) {
switch -Regex ($identifier) {
'\.(cs|ps1|ts|js|py|java|cpp|c|h|go|rs|rb|php)$' { return "code" }
'\.(json|jsonc)$' { return "json" }
'\.(md|markdown)$' { return "markdown" }
'^(Read|Grep|Bash)$' { return "code" }
default { return "text" }
}
}

# Get cache statistics
[hashtable] GetStats() {
$cacheStats = $this.Cache.GetStats()
$totalCalls = $this.ApiCallCount + $this.EstimationCount
return @{
ApiCalls = $this.ApiCallCount
CacheHits = $this.CacheHitCount
EstimationCount = $this.EstimationCount
CacheSize = $cacheStats.Size
CacheHitRate = $cacheStats.HitRate
TotalCalls = $totalCalls
}
}
}
}

# Initialize global TokenCounter (singleton pattern)
if (-not $script:TokenCounter) {
$apiKey = $env:GOOGLE_AI_API_KEY
if (-not $apiKey) {
Write-Host "WARN: GOOGLE_AI_API_KEY not set, falling back to estimation only" -ForegroundColor Yellow
}
$modelName = if ($env:GOOGLE_AI_MODEL) { $env:GOOGLE_AI_MODEL } else { "gemini-2.0-flash-exp" }
$script:TokenCounter = [TokenCounter]::new($apiKey, $modelName)
}

# PHASE 2 FIX: Deterministic cache key generation
# Fixes 0% cache hit rate by ensuring identical operations produce identical keys
function Get-DeterministicCacheKey {