14 changes: 14 additions & 0 deletions .claude/settings.local.json
@@ -0,0 +1,14 @@
{
"permissions": {
"allow": [
"mcp__playwright__browser_navigate",
"mcp__playwright__browser_snapshot",
"mcp__playwright__browser_take_screenshot",
"Bash(uv run:*)",
"Bash(git add:*)"
],
"deny": [],
"ask": [],
"defaultMode": "acceptEdits"
}
}
Binary file added .playwright-mcp/page-2025-08-17T01-20-03-573Z.png
Binary file added .playwright-mcp/page-2025-08-17T01-21-14-257Z.png
98 changes: 98 additions & 0 deletions CLAUDE.md
@@ -0,0 +1,98 @@
# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Development Commands

### Running the Application
```bash
# Quick start (recommended)
./run.sh

# Manual start
cd backend && uv run uvicorn app:app --reload --port 8000
```

### Environment Setup
```bash
# Install dependencies
uv sync

# Environment variables required in .env:
ANTHROPIC_API_KEY=your_key_here
```

### Development Server
- Web Interface: http://localhost:8000
- API Documentation: http://localhost:8000/docs
- Uses uvicorn with auto-reload for development

## Architecture Overview

This is a RAG (Retrieval-Augmented Generation) system for course materials with a clear separation between frontend, API, and processing layers.

### Core RAG Flow
1. **Document Processing**: Course materials in `docs/` are parsed into structured lessons and chunked for vector storage
2. **Query Processing**: User queries trigger semantic search through ChromaDB, then Claude synthesizes responses
3. **Session Management**: Conversation history is maintained per session for context-aware responses
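
The three stages above can be sketched as a single function (the function and parameter names here are illustrative assumptions, not the repository's actual API):

```python
from typing import Callable, List

def answer_query(
    query: str,
    history: List[str],
    search: Callable[[str], List[str]],   # semantic search over ChromaDB chunks
    generate: Callable[[str], str],       # Claude completion call
) -> str:
    """Minimal sketch of the query lifecycle: retrieve, then synthesize."""
    chunks = search(query)                # 1. semantic retrieval
    context = "\n\n".join(chunks)         # 2. assemble retrieved context
    history_text = "\n".join(history)     # 3. carry session history for context
    prompt = (
        f"Conversation so far:\n{history_text}\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )
    return generate(prompt)               # Claude synthesizes the final answer
```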

### Key Components

**RAG System (`rag_system.py`)**: Main orchestrator that coordinates all components. Handles the complete query lifecycle from user input to response generation.

**Document Processor (`document_processor.py`)**: Parses course documents with expected format:
```
Course Title: [title]
Course Link: [url]
Course Instructor: [instructor]

Lesson 0: Introduction
Lesson Link: [lesson_url]
[content...]
```

**Vector Store (`vector_store.py`)**: ChromaDB integration with sentence transformers for semantic search. Stores both course metadata and content chunks with configurable overlap.

**AI Generator (`ai_generator.py`)**: Anthropic Claude integration with tool calling. Uses a specialized system prompt for educational content and decides when to search vs. use general knowledge.

**Session Manager (`session_manager.py`)**: Maintains conversation history with configurable message limits. Creates unique session IDs for context preservation.
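
A capped history like this can be sketched with a bounded deque (illustrative only; the real `SessionManager` API may differ):

```python
from collections import deque

class SessionHistory:
    """Sketch of per-session history with a configurable message cap."""

    def __init__(self, max_messages: int = 4):
        # deque with maxlen drops the oldest entries automatically
        self._messages = deque(maxlen=max_messages)

    def add(self, role: str, text: str) -> None:
        self._messages.append((role, text))

    def as_context(self) -> str:
        # Flatten the retained messages into a prompt-ready string
        return "\n".join(f"{role}: {text}" for role, text in self._messages)
```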

### Configuration System
All settings centralized in `config.py` with environment variable support:
- Chunk size/overlap for document processing
- Embedding model selection
- Search result limits
- Conversation history depth
- Claude model selection
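
A settings module along these lines could back the list above (field names and defaults are assumptions; see `config.py` for the actual values):

```python
import os
from dataclasses import dataclass

@dataclass
class Config:
    """Sketch of centralized settings with environment-variable overrides."""
    chunk_size: int = int(os.getenv("CHUNK_SIZE", "800"))
    chunk_overlap: int = int(os.getenv("CHUNK_OVERLAP", "100"))
    embedding_model: str = os.getenv("EMBEDDING_MODEL", "all-MiniLM-L6-v2")
    max_results: int = int(os.getenv("MAX_RESULTS", "5"))
    max_history: int = int(os.getenv("MAX_HISTORY", "2"))
    claude_model: str = os.getenv("CLAUDE_MODEL", "claude-3-7-sonnet-20250219")
```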

### Data Models
Pydantic models in `models.py` define the core entities:
- `Course`: Container with lessons and metadata
- `Lesson`: Individual lesson with optional links
- `CourseChunk`: Vector-searchable content pieces with course/lesson context
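
A simplified sketch of these entities (the repository uses Pydantic; plain dataclasses are shown here to keep the example dependency-free, and any field beyond the names listed above is an assumption):

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Lesson:
    number: int
    title: str
    lesson_link: Optional[str] = None   # lesson links are optional

@dataclass
class Course:
    title: str
    instructor: str
    course_link: Optional[str] = None
    lessons: List[Lesson] = field(default_factory=list)

@dataclass
class CourseChunk:
    content: str                        # the vector-searchable text
    course_title: str                   # course context carried with each chunk
    lesson_number: Optional[int] = None
```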

### Tool Integration
The system uses a tool management pattern where Claude can call search tools via the `search_tools.py` module. Tools are registered with the AI generator and can be invoked based on query analysis.

### Frontend Integration
Static files served from `frontend/` with a chat interface that maintains session state and displays responses with source citations. Uses relative API paths for deployment flexibility.

## File Structure Context

- `backend/app.py`: FastAPI application with CORS configuration and static file serving
- `docs/`: Course materials automatically loaded on startup
- `chroma_db/`: Persistent vector database storage
- Frontend files use cache-busting for development
- No test framework currently configured

## Development Notes

- Documents are automatically processed and indexed on server startup
- The system expects course documents to follow the structured format for proper parsing
- Session state is maintained in memory (not persistent across restarts)
- Vector embeddings use sentence-transformers with the all-MiniLM-L6-v2 model
- Claude model configured as claude-3-7-sonnet-20250219 with an education-focused system prompt
- Always use uv to run the server; do not use pip directly
- Use uv for all dependency management
- Use uv to run Python files
- Always think through and present a detailed plan, and ask for permission, before changing or editing files
28 changes: 28 additions & 0 deletions backend-tool-refactor.md
@@ -0,0 +1,28 @@
Refactor @backend/ai_generator.py to support sequential tool calling where Claude can make up to 2 tool calls in separate API rounds.

Current behavior:
- Claude makes 1 tool call → tools are removed from API params → final response
- If Claude wants another tool call after seeing results, it can't (gets empty response)

Desired behavior:
- Each tool call should be a separate API request where Claude can reason about previous results
- Support for complex queries requiring multiple searches for comparisons, multi-part questions, or when information from different courses/lessons is needed

Example flow:
1. User: "Search for a course that discusses the same topic as lesson 4 of course X"
2. Claude: get course outline for course X → gets title of lesson 4
3. Claude: uses the title to search for a course that discusses the same topic → returns course information
4. Claude: provides complete answer

Requirements:
- Maximum 2 sequential rounds per user query
- Terminate when: (a) 2 rounds completed, (b) Claude's response has no tool_use blocks, or (c) tool call fails
- Preserve conversation context between rounds
- Handle tool execution errors gracefully

Notes:
- Update the system prompt in @backend/ai_generator.py
- Update the test @backend/tests/test_ai_generator.py
- Write tests that verify the external behavior (API calls made, tools executed, results returned) rather than internal state details.

Use two parallel subagents to brainstorm possible plans. Do not implement any code.
165 changes: 111 additions & 54 deletions backend/ai_generator.py
@@ -5,21 +5,30 @@ class AIGenerator:
"""Handles interactions with Anthropic's Claude API for generating responses"""

# Static system prompt to avoid rebuilding on each call
SYSTEM_PROMPT = """ You are an AI assistant specialized in course materials and educational content with access to a comprehensive search tool for course information.
SYSTEM_PROMPT = """ You are an AI assistant specialized in course materials and educational content with access to comprehensive tools for course information.

Search Tool Usage:
- Use the search tool **only** for questions about specific course content or detailed educational materials
- **One search per query maximum**
- Synthesize search results into accurate, fact-based responses
- If search yields no results, state this clearly without offering alternatives
Tool Usage Guidelines:
- **Content Search Tool**: Use for questions about specific course content or detailed educational materials
- **Course Outline Tool**: Use for questions about course structure, lesson lists, course overviews, or when users ask "what's in this course"
- **Sequential Tool Calling**: You can make multiple tool calls across up to 2 rounds of interaction to gather comprehensive information
- **Round 1**: Use tools to gather initial information
- **Round 2**: Use additional tools if needed to gather more context, compare information, or clarify details
- **Reasoning**: After each tool call, analyze results and determine if additional information is needed for a complete answer
- Synthesize all tool results into accurate, fact-based responses
- If tools yield no results, state this clearly without offering alternatives

Response Protocol:
- **General knowledge questions**: Answer using existing knowledge without searching
- **Course-specific questions**: Search first, then answer
- **General knowledge questions**: Answer using existing knowledge without using tools
- **Course content questions**: Use content search tool first, then answer
- **Course outline/structure questions**: Use outline tool first, then answer
- **No meta-commentary**:
- Provide direct answers only — no reasoning process, search explanations, or question-type analysis
- Do not mention "based on the search results"
- Provide direct answers only — no reasoning process, tool explanations, or question-type analysis
- Do not mention "based on the search results" or "based on the outline"

For outline queries, always include:
- Course title and link
- Course instructor
- Complete lesson list with numbers and titles

All responses must be:
1. **Brief, concise, and focused** - Get to the point quickly
@@ -43,15 +52,17 @@ def __init__(self, api_key: str, model: str):
def generate_response(self, query: str,
conversation_history: Optional[str] = None,
tools: Optional[List] = None,
tool_manager=None) -> str:
tool_manager=None,
max_rounds: int = 2) -> str:
"""
Generate AI response with optional tool usage and conversation context.
Generate AI response with sequential tool usage support and conversation context.

Args:
query: The user's question or request
conversation_history: Previous messages for context
tools: Available tools the AI can use
tool_manager: Manager to execute tools
max_rounds: Maximum sequential tool calls (default: 2)

Returns:
Generated response as string
@@ -64,31 +75,94 @@ def generate_response(self, query: str,
else self.SYSTEM_PROMPT
)

# Prepare API call parameters efficiently
api_params = {
# Start with the original user query
current_messages = [{"role": "user", "content": query}]

# Sequential tool calling loop
for round_num in range(max_rounds):
# Prepare API call parameters
api_params = {
**self.base_params,
"messages": current_messages.copy(),
"system": system_content
}

# Add tools if available
if tools:
api_params["tools"] = tools
api_params["tool_choice"] = {"type": "auto"}

# Get response from Claude
response = self.client.messages.create(**api_params)

# If no tool use, we're done
if response.stop_reason != "tool_use" or not tool_manager:
return response.content[0].text

# Handle tool execution and update messages
current_messages = self._handle_tool_execution_sequential(
response, current_messages, tool_manager
)

# If tool execution failed, return error message
if current_messages is None:
return "I encountered an error while processing your request."

# If we've completed max rounds with tools, make final call without tools
final_params = {
**self.base_params,
"messages": [{"role": "user", "content": query}],
"messages": current_messages,
"system": system_content
}

# Add tools if available
if tools:
api_params["tools"] = tools
api_params["tool_choice"] = {"type": "auto"}

# Get response from Claude
response = self.client.messages.create(**api_params)

# Handle tool execution if needed
if response.stop_reason == "tool_use" and tool_manager:
return self._handle_tool_execution(response, api_params, tool_manager)
final_response = self.client.messages.create(**final_params)
return final_response.content[0].text

def _handle_tool_execution_sequential(self, response, messages: List, tool_manager):
"""
Handle tool execution for sequential calling and return updated messages.

# Return direct response
return response.content[0].text
Args:
response: The response containing tool use requests
messages: Current message history
tool_manager: Manager to execute tools

Returns:
Updated messages list or None if tool execution fails
"""
try:
# Add AI's tool use response to messages
messages.append({"role": "assistant", "content": response.content})

# Execute all tool calls and collect results
tool_results = []
for content_block in response.content:
if content_block.type == "tool_use":
tool_result = tool_manager.execute_tool(
content_block.name,
**content_block.input
)

tool_results.append({
"type": "tool_result",
"tool_use_id": content_block.id,
"content": tool_result
})

# Add tool results as user message
if tool_results:
messages.append({"role": "user", "content": tool_results})

return messages

except Exception as e:
# Log error and return None to indicate failure
print(f"Tool execution error: {e}")
return None

def _handle_tool_execution(self, initial_response, base_params: Dict[str, Any], tool_manager):
"""
Handle execution of tool calls and get follow-up response.
Original single tool execution method - kept for backward compatibility.

Args:
initial_response: The response containing tool use requests
@@ -98,38 +172,21 @@ def _handle_tool_execution(self, initial_response, base_params: Dict[str, Any],
Returns:
Final response text after tool execution
"""
# Start with existing messages
# Use the sequential method but return just the final response
messages = base_params["messages"].copy()
updated_messages = self._handle_tool_execution_sequential(
initial_response, messages, tool_manager
)

# Add AI's tool use response
messages.append({"role": "assistant", "content": initial_response.content})

# Execute all tool calls and collect results
tool_results = []
for content_block in initial_response.content:
if content_block.type == "tool_use":
tool_result = tool_manager.execute_tool(
content_block.name,
**content_block.input
)

tool_results.append({
"type": "tool_result",
"tool_use_id": content_block.id,
"content": tool_result
})

# Add tool results as single message
if tool_results:
messages.append({"role": "user", "content": tool_results})
if updated_messages is None:
return "I encountered an error while processing your request."

# Prepare final API call without tools
# Make final call to get response
final_params = {
**self.base_params,
"messages": messages,
"messages": updated_messages,
"system": base_params["system"]
}

# Get final response
final_response = self.client.messages.create(**final_params)
return final_response.content[0].text