High-performance LLM query cache with semantic search. Reduces API costs by 80% and latency from 8.5s to 1ms using Redis + a Qdrant vector DB. Multi-provider support (OpenAI, Anthropic).
redis embeddings openai cost-optimization rag fastapi vector-database qdrant semantic-cache llm-caching
Updated Dec 2, 2025 - Python
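
To make the architecture in the description concrete, here is a minimal sketch of a semantic cache built on Redis and Qdrant: embed the incoming query, look for a near-duplicate in the vector index, serve the cached answer from Redis on a hit, and otherwise call the provider and store both sides. This is an illustration under stated assumptions, not the repo's actual code: the `llm_cache` collection name, the 0.90 similarity cutoff, the embedding/chat model choices, and the 24h TTL are all hypothetical.

```python
import uuid

import redis
from openai import OpenAI
from qdrant_client import QdrantClient
from qdrant_client.models import Distance, PointStruct, VectorParams

SIMILARITY_THRESHOLD = 0.90  # assumed cutoff; tune per workload
COLLECTION = "llm_cache"     # hypothetical collection name

oai = OpenAI()                           # reads OPENAI_API_KEY from the env
kv = redis.Redis(decode_responses=True)  # localhost:6379 by default
vecs = QdrantClient(":memory:")          # swap for a real Qdrant URL in prod

# text-embedding-3-small produces 1536-dim vectors; cosine similarity
# makes the threshold comparison below meaningful.
vecs.create_collection(
    collection_name=COLLECTION,
    vectors_config=VectorParams(size=1536, distance=Distance.COSINE),
)


def embed(text: str) -> list[float]:
    return oai.embeddings.create(
        model="text-embedding-3-small", input=text
    ).data[0].embedding


def cached_completion(query: str) -> str:
    vector = embed(query)

    # Nearest cached query; a hit above the threshold short-circuits the LLM.
    hits = vecs.search(collection_name=COLLECTION, query_vector=vector, limit=1)
    if hits and hits[0].score >= SIMILARITY_THRESHOLD:
        answer = kv.get(hits[0].payload["redis_key"])
        if answer is not None:  # semantic cache hit
            return answer

    # Cache miss: call the provider, then index the query for future lookups.
    answer = oai.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": query}],
    ).choices[0].message.content

    # Deterministic ID so an identical query overwrites its own entry.
    point_id = str(uuid.uuid5(uuid.NAMESPACE_URL, query))
    kv.set(point_id, answer, ex=86400)  # 24h TTL, an assumption
    vecs.upsert(
        collection_name=COLLECTION,
        points=[PointStruct(id=point_id, vector=vector,
                            payload={"redis_key": point_id})],
    )
    return answer
```

The similarity threshold is the key tuning knob: raising it trades hit rate for answer fidelity, and it is what makes the cache "semantic" (rephrased queries can hit) rather than an exact-match key lookup.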