An enhanced Retrieval-Augmented Generation (RAG) system that combines temporal tracing, graph relationships, and agentic retrieval to provide intelligent, context-aware knowledge management.
- Temporal Tracing: Track the evolution of knowledge over time with full history
- Graph Relationships: Connect related concepts and states
- Intelligent Relationship Updates: Automatically updates memory connections when states evolve using semantic similarity and LLM analysis
- Agentic Retrieval: Intelligent, multi-step retrieval strategies
- Memory Promotion: Synthesize new knowledge states from historical data with conflict resolution and quality checks
- Time-Travel Queries: Query knowledge as it existed at any point in time
- Edge-Based Relevance: Contextual importance via edge weights, NOT filtering
- All states always accessible - nothing forgotten
- Edge strength represents contextual relevance (0.0-1.0)
- Low-strength edges: connections exist but ranked lower
- High-strength edges: prioritized during graph traversal
- Graph structure ensures even "distant" memories are discoverable
- Importance Learning: System learns what's important from access patterns
- Multi-Layer Caching: Redis-backed caching for embeddings, queries, and frequently accessed states
- Hierarchical Consolidation: Auto-summarizes at daily/weekly/monthly levels (like human sleep)
- Latest State Tracking: Instant O(1) lookup for "what's the current status?" queries (materialized view)
- Graph-Based Relevance: Even old/low-strength memories found via edges to latest states
- Storage Tier Support: Infrastructure for hot/warm/cold storage (working/active/archived)
- Instant Latest: <10ms for current state queries (PostgreSQL materialized views)
- Cached Queries: <10ms for frequently accessed data (Redis caching)
- Full Search: <100ms with vector search optimization and caching
- Scalable Storage: Supports millions of states with PostgreSQL + TimescaleDB partitioning
- Space Efficient: Diff-based versioning support for reduced storage overhead
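The edge-based relevance model described above can be sketched in a few lines. This is an illustrative example only (function and field names are hypothetical, not TracingRAG's actual API): neighbors are ranked by edge strength rather than filtered, so weak connections stay discoverable.

```python
# Sketch of edge-strength ranking (hypothetical names, not the actual API).
# Edges carry a strength in [0.0, 1.0]; every neighbor is returned,
# ordered by strength, so nothing is ever filtered out.

def rank_neighbors(edges: list[dict]) -> list[str]:
    """Return neighbor topics ordered by edge strength, strongest first."""
    return [e["target"] for e in sorted(edges, key=lambda e: e["strength"], reverse=True)]

edges = [
    {"target": "bug_report_42", "strength": 0.2},
    {"target": "project_alpha_v3", "strength": 0.9},
    {"target": "design_notes", "strength": 0.5},
]
print(rank_neighbors(edges))  # strongest edge first, weak edges still present
```

The key design point is that the weak `bug_report_42` edge appears last but is never dropped, matching the "nothing forgotten" principle.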
TracingRAG uses a multi-layer architecture:
- Storage Layer: Qdrant (vectors), Neo4j (graphs), PostgreSQL + TimescaleDB (documents)
- Core Services: Memory management, graph operations, embeddings, caching (Redis)
- Agentic Layer: LLM-based query planning, memory promotion, retrieval orchestration
- API Layer: FastAPI REST endpoints with async support
See docs/ for detailed documentation on architecture, usage, and deployment.
Required:
- Python 3.11+ with FastAPI and asyncio
- Qdrant for vector storage and semantic search
- Neo4j for graph database and relationship tracking
- PostgreSQL + TimescaleDB for document storage and temporal queries
- OpenRouter API for LLM access (structured output, query analysis, synthesis)
Embeddings (choose one):
- SentenceTransformers (default, free, runs locally)
  - `all-mpnet-base-v2` - English, 768 dim (default)
  - `paraphrase-multilingual-mpnet-base-v2` - 50+ languages, 768 dim
- OpenAI Embeddings (optional, best multilingual support)
  - `text-embedding-3-small` - 100+ languages, 1536 dim
  - `text-embedding-3-large` - 100+ languages, 3072 dim
- Automatic fallback if the local model fails
Optional:
- Redis for caching (embeddings, queries, working memory)
  - If unavailable, falls back to an in-memory LRU cache (1000 items max)
  - Recommended for production
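The in-memory fallback described above can be approximated with an ordered dict. This is a minimal sketch, not the actual fallback implementation; only the 1000-item capacity comes from the documentation, everything else is assumed:

```python
from collections import OrderedDict

class LRUCache:
    """Minimal LRU cache sketch for the no-Redis fallback (illustrative)."""

    def __init__(self, max_items: int = 1000):
        self.max_items = max_items
        self._data: OrderedDict[str, object] = OrderedDict()

    def get(self, key: str):
        if key not in self._data:
            return None
        self._data.move_to_end(key)  # mark as recently used
        return self._data[key]

    def set(self, key: str, value: object) -> None:
        self._data[key] = value
        self._data.move_to_end(key)
        if len(self._data) > self.max_items:
            self._data.popitem(last=False)  # evict least recently used

cache = LRUCache(max_items=2)
cache.set("a", 1)
cache.set("b", 2)
cache.get("a")       # touch "a" so "b" becomes least recently used
cache.set("c", 3)    # evicts "b"
print(cache.get("b"))  # -> None
```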
Monitoring:
- Prometheus for metrics
- Structlog for structured JSON logging
- Python 3.11+ (including Python 3.14)
- Recommended: Use Python 3.11-3.13 for the smoothest installation (pre-built wheels available)
- Note for Python 3.14+: Some dependencies (like `greenlet`) need to compile from source since pre-built wheels aren't available yet. You'll need:
  - macOS: `xcode-select --install`
  - Ubuntu/Debian: `sudo apt install build-essential python3-dev`
  - Other systems: a C compiler and Python development headers
- Docker and Docker Compose
- Poetry (Python package manager)
- Clone the repository:

  ```bash
  git clone <repository-url>
  cd TracingRAG
  ```

- Install Poetry (if not already installed):

  ```bash
  curl -sSL https://install.python-poetry.org | python3 -
  ```

- (Optional) If you have multiple Python versions installed:

  ```bash
  # Poetry will automatically use the correct Python version,
  # but you can specify which one explicitly:
  poetry env use python3.11  # Use Python 3.11
  poetry env use python3.12  # Use Python 3.12
  poetry env use python3.13  # Use Python 3.13
  poetry env use python3.14  # Use Python 3.14 (requires build tools, see prerequisites)

  # Check which Python is being used:
  poetry env info
  ```

- Install dependencies:

  ```bash
  poetry install
  ```

- Copy environment variables and configure:

  ```bash
  cp .env.example .env
  # Edit .env with your configuration
  ```

Required configuration:
- `OPENROUTER_API_KEY` - Your OpenRouter API key for LLM access
Embedding configuration (choose one):
Option 1: Local embeddings (default, free)

```bash
# English only (default)
EMBEDDING_MODEL=sentence-transformers/all-mpnet-base-v2

# OR for multilingual support (50+ languages)
EMBEDDING_MODEL=sentence-transformers/paraphrase-multilingual-mpnet-base-v2
```

Option 2: OpenAI embeddings (best multilingual, API costs)

```bash
OPENAI_API_KEY=your_openai_api_key_here
OPENAI_EMBEDDING_MODEL=text-embedding-3-small  # 100+ languages
```

Optional configuration:
- Redis caching (recommended for production): Already configured in `docker-compose.yml`

- Start infrastructure services:

  ```bash
  docker-compose up -d
  ```

- Run database migrations:

  ```bash
  poetry run alembic upgrade head
  ```

- Start the API server:

  ```bash
  poetry run uvicorn tracingrag.api.main:app --reload
  ```

The API will be available at http://localhost:8000
API documentation: http://localhost:8000/docs
TracingRAG provides a comprehensive REST API for all operations:
System:
- `GET /` - API information
- `GET /health` - Health check
- `GET /metrics` - System metrics
Memory Management:
- `POST /api/v1/memories` - Create memory state
- `GET /api/v1/memories/{id}` - Get memory by ID
- `GET /api/v1/memories` - List memories (with pagination and filtering)
- `GET /api/v1/traces/{topic}` - Get version history for a topic
Query/RAG:
- `POST /api/v1/query` - Query the RAG system (supports both standard and agent-based retrieval)
Promotion:
- `POST /api/v1/promote` - Promote a memory state
- `GET /api/v1/promotion-candidates` - Get topics that are candidates for promotion
- Swagger UI: http://localhost:8000/docs - Interactive API explorer
- ReDoc: http://localhost:8000/redoc - API reference documentation
- OpenAPI JSON: http://localhost:8000/openapi.json - Machine-readable API spec
```bash
# Create a memory
curl -X POST http://localhost:8000/api/v1/memories \
  -H "Content-Type: application/json" \
  -d '{
    "topic": "project_alpha",
    "content": "Initial design for API authentication",
    "tags": ["design", "security"],
    "confidence": 0.95
  }'

# Query the RAG system
curl -X POST http://localhost:8000/api/v1/query \
  -H "Content-Type: application/json" \
  -d '{
    "query": "What is the status of project alpha?",
    "use_agent": false
  }'

# Get promotion candidates
curl "http://localhost:8000/api/v1/promotion-candidates?limit=10&min_priority=7"
```

For complete API documentation, see docs/API_GUIDE.md.
```
TracingRAG/
├── tracingrag/              # Main package
│   ├── __init__.py
│   ├── core/                # Core domain models
│   │   └── models/          # Data models (memory, graph, rag, promotion)
│   ├── services/            # Business logic layer
│   │   ├── memory.py        # Memory state management
│   │   ├── rag.py           # RAG pipeline orchestration
│   │   ├── graph.py         # Graph relationship management
│   │   ├── retrieval.py     # Retrieval strategies
│   │   ├── promotion.py     # Memory promotion & synthesis
│   │   ├── embedding.py     # Embedding generation
│   │   ├── cache.py         # Caching layer (Redis)
│   │   └── ...              # Other services
│   ├── storage/             # Storage layer
│   │   ├── qdrant.py        # Qdrant vector database
│   │   ├── neo4j_client.py  # Neo4j graph database
│   │   ├── database.py      # PostgreSQL integration
│   │   ├── redis_client.py  # Redis caching
│   │   └── models.py        # SQLAlchemy models
│   ├── agents/              # Agentic layer
│   │   ├── query_planner.py # Query planning agent
│   │   ├── memory_manager.py# Memory management agent
│   │   ├── service.py       # Agent orchestration
│   │   └── tools.py         # Agent tools
│   ├── api/                 # API layer
│   │   ├── main.py          # FastAPI application
│   │   ├── schemas.py       # Pydantic schemas
│   │   └── security.py      # Authentication & authorization
│   └── utils/               # Utilities
├── tests/                   # Test suite
├── scripts/                 # Utility scripts
├── docs/                    # Additional documentation
├── examples/                # Usage examples
├── alembic/                 # Database migrations
├── k8s/                     # Kubernetes manifests
├── docker-compose.yml       # Local development infrastructure
├── Dockerfile               # Application container
├── pyproject.toml           # Poetry configuration
├── .env.example             # Environment variables template
└── README.md                # This file
```
```python
from tracingrag.client import TracingRAGClient

client = TracingRAGClient("http://localhost:8000")

# Create initial memory state
state = client.create_memory(
    topic="project_alpha",
    content="Starting development of feature X with approach Y",
    tags=["project", "development"]
)

# Query for relevant memories
results = client.query(
    query="What's the status of project alpha?",
    include_history=True,   # Include trace context
    include_related=True,   # Include graph connections
    depth=2                 # Graph traversal depth
)

for result in results:
    print(f"Topic: {result.topic}")
    print(f"Content: {result.content}")
    print(f"Version: {result.version}")
    print(f"Related: {[r.topic for r in result.related_states]}")

# Promote memory to new state with synthesis
new_state = client.promote_memory(
    topic="project_alpha",
    reason="Bug discovered and fixed, feature complete"
)
# The system will:
# 1. Analyze trace history
# 2. Find related states (e.g., bug reports)
# 3. Synthesize a new state with the LLM
# 4. Create appropriate graph edges

from datetime import datetime, timedelta

# What did we know about this topic 2 weeks ago?
past_state = client.query_at_time(
    topic="project_alpha",
    timestamp=datetime.now() - timedelta(weeks=2)
)
```

```bash
# Run tests
poetry run pytest

# Lint and format
poetry run ruff check .
poetry run ruff format .

# Type check
poetry run mypy tracingrag
```

Key environment variables (see .env.example):
- `OPENROUTER_API_KEY`: OpenRouter API key for LLM access
- `QDRANT_URL`: Qdrant server URL
- `QDRANT_API_KEY`: Qdrant API key (if using cloud)
- `NEO4J_URI`: Neo4j connection URI
- `NEO4J_USERNAME`: Neo4j username
- `NEO4J_PASSWORD`: Neo4j password
- `DATABASE_URL`: PostgreSQL connection URL
- `EMBEDDING_MODEL`: Model to use for embeddings (default: all-mpnet-base-v2)
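A minimal sketch of reading these variables in Python (illustrative only; the project presumably has its own settings module, and any defaults shown here other than the documented embedding model are assumptions):

```python
import os

# Illustrative only: read TracingRAG-style settings from the environment.
# Only the EMBEDDING_MODEL default below comes from the documentation;
# treating the API key as required is an assumption.
def load_settings() -> dict:
    return {
        "openrouter_api_key": os.getenv("OPENROUTER_API_KEY"),  # required, no default
        "embedding_model": os.getenv(
            "EMBEDDING_MODEL", "sentence-transformers/all-mpnet-base-v2"
        ),
    }

settings = load_settings()
print(settings["embedding_model"])
```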
- Phase 1: Foundation - Core data models, storage interfaces, and basic infrastructure
- Phase 2: Retrieval Services - Semantic search, graph-enhanced retrieval, temporal queries, hybrid retrieval
- Phase 3: Graph Layer - Edge management, relationship types, temporal validity, graph traversal
- Phase 4: Basic RAG - Query processing, context building, LLM integration, response generation
- Phase 5: Agentic Layer - Intelligent agents for query planning and memory management
- Phase 6: Memory Promotion - Synthesis capabilities and knowledge consolidation
- Phase 7: Advanced Features - Redis caching, hierarchical consolidation, performance optimization
- Phase 8: Production Ready - Security, monitoring, CI/CD, Kubernetes deployment
🎉 TracingRAG is production-ready, with:
- Security: JWT authentication, API key support, rate limiting, input validation
- Monitoring: Prometheus metrics (50+ metrics), structured logging, health checks
- CI/CD: Automated testing, linting, Docker builds, security scanning
- Kubernetes: Complete K8s manifests with autoscaling (HPA), ingress, TLS support
- Performance: Multi-stage Docker builds, caching layers, optimized resource allocation
See docs/DEPLOYMENT_GUIDE.md for complete deployment instructions.
Contributions are welcome! Please see docs/DEVELOPMENT.md for development setup and guidelines.
MIT License - see LICENSE for details.
Inspired by:
- Zep's Graphiti - Temporal knowledge graphs
- Microsoft GraphRAG - Graph-based RAG
- LangChain - LLM application patterns