14 changes: 14 additions & 0 deletions .claude/settings.local.json
@@ -0,0 +1,14 @@
{
"permissions": {
"allow": [
"mcp__playwright__browser_navigate",
"mcp__playwright__browser_snapshot",
"mcp__playwright__browser_take_screenshot",
"Bash(uv run:*)",
"Bash(git add:*)"
],
"deny": [],
"ask": [],
"defaultMode": "acceptEdits"
}
}
Binary file added .playwright-mcp/page-2025-08-17T01-20-03-573Z.png
Binary file added .playwright-mcp/page-2025-08-17T01-21-14-257Z.png
98 changes: 98 additions & 0 deletions CLAUDE.md
@@ -0,0 +1,98 @@
# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Development Commands

### Running the Application
```bash
# Quick start (recommended)
./run.sh

# Manual start
cd backend && uv run uvicorn app:app --reload --port 8000
```

### Environment Setup
```bash
# Install dependencies
uv sync

# Environment variables required in .env:
ANTHROPIC_API_KEY=your_key_here
```

### Development Server
- Web Interface: http://localhost:8000
- API Documentation: http://localhost:8000/docs
- Uses uvicorn with auto-reload for development

## Architecture Overview

This is a RAG (Retrieval-Augmented Generation) system for course materials with a clear separation between frontend, API, and processing layers.

### Core RAG Flow
1. **Document Processing**: Course materials in `docs/` are parsed into structured lessons and chunked for vector storage
2. **Query Processing**: User queries trigger semantic search through ChromaDB, then Claude synthesizes responses
3. **Session Management**: Conversation history is maintained per session for context-aware responses
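
The three stages above can be sketched as a single function (the function and parameter names here are illustrative assumptions, not the repository's actual API):

```python
from typing import Callable, List

def answer_query(
    query: str,
    history: List[str],
    search: Callable[[str], List[str]],   # semantic search over ChromaDB chunks
    generate: Callable[[str], str],       # Claude completion call
) -> str:
    """Minimal sketch of the query lifecycle: retrieve, then synthesize."""
    chunks = search(query)                # 1. semantic retrieval
    context = "\n\n".join(chunks)         # 2. assemble retrieved context
    history_text = "\n".join(history)     # 3. carry session history for context
    prompt = (
        f"Conversation so far:\n{history_text}\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )
    return generate(prompt)               # Claude synthesizes the final answer
```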

### Key Components

**RAG System (`rag_system.py`)**: Main orchestrator that coordinates all components. Handles the complete query lifecycle from user input to response generation.

**Document Processor (`document_processor.py`)**: Parses course documents with expected format:
```
Course Title: [title]
Course Link: [url]
Course Instructor: [instructor]

Lesson 0: Introduction
Lesson Link: [lesson_url]
[content...]
```

**Vector Store (`vector_store.py`)**: ChromaDB integration with sentence transformers for semantic search. Stores both course metadata and content chunks with configurable overlap.

**AI Generator (`ai_generator.py`)**: Anthropic Claude integration with tool calling. Uses a specialized system prompt for educational content and decides when to search vs. use general knowledge.

**Session Manager (`session_manager.py`)**: Maintains conversation history with configurable message limits. Creates unique session IDs for context preservation.
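
A capped history like this can be sketched with a bounded deque (illustrative only; the real `SessionManager` API may differ):

```python
from collections import deque

class SessionHistory:
    """Sketch of per-session history with a configurable message cap."""

    def __init__(self, max_messages: int = 4):
        # deque with maxlen drops the oldest entries automatically
        self._messages = deque(maxlen=max_messages)

    def add(self, role: str, text: str) -> None:
        self._messages.append((role, text))

    def as_context(self) -> str:
        # Flatten the retained messages into a prompt-ready string
        return "\n".join(f"{role}: {text}" for role, text in self._messages)
```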

### Configuration System
All settings centralized in `config.py` with environment variable support:
- Chunk size/overlap for document processing
- Embedding model selection
- Search result limits
- Conversation history depth
- Claude model selection
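
A settings module along these lines could back the list above (field names and defaults are assumptions; see `config.py` for the actual values):

```python
import os
from dataclasses import dataclass

@dataclass
class Config:
    """Sketch of centralized settings with environment-variable overrides."""
    chunk_size: int = int(os.getenv("CHUNK_SIZE", "800"))
    chunk_overlap: int = int(os.getenv("CHUNK_OVERLAP", "100"))
    embedding_model: str = os.getenv("EMBEDDING_MODEL", "all-MiniLM-L6-v2")
    max_results: int = int(os.getenv("MAX_RESULTS", "5"))
    max_history: int = int(os.getenv("MAX_HISTORY", "2"))
    claude_model: str = os.getenv("CLAUDE_MODEL", "claude-3-7-sonnet-20250219")
```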

### Data Models
Pydantic models in `models.py` define the core entities:
- `Course`: Container with lessons and metadata
- `Lesson`: Individual lesson with optional links
- `CourseChunk`: Vector-searchable content pieces with course/lesson context
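
A simplified sketch of these entities (the repository uses Pydantic; plain dataclasses are shown here to keep the example dependency-free, and any field beyond the names listed above is an assumption):

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Lesson:
    number: int
    title: str
    lesson_link: Optional[str] = None   # lesson links are optional

@dataclass
class Course:
    title: str
    instructor: str
    course_link: Optional[str] = None
    lessons: List[Lesson] = field(default_factory=list)

@dataclass
class CourseChunk:
    content: str                        # the vector-searchable text
    course_title: str                   # course context carried with each chunk
    lesson_number: Optional[int] = None
```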

### Tool Integration
The system uses a tool management pattern where Claude can call search tools via the `search_tools.py` module. Tools are registered with the AI generator and can be invoked based on query analysis.

### Frontend Integration
Static files served from `frontend/` with a chat interface that maintains session state and displays responses with source citations. Uses relative API paths for deployment flexibility.

## File Structure Context

- `backend/app.py`: FastAPI application with CORS configuration and static file serving
- `docs/`: Course materials automatically loaded on startup
- `chroma_db/`: Persistent vector database storage
- Frontend files use cache-busting for development
- No test framework currently configured

## Development Notes

- Documents are automatically processed and indexed on server startup
- The system expects course documents to follow the structured format for proper parsing
- Session state is maintained in memory (not persistent across restarts)
- Vector embeddings use sentence-transformers with the all-MiniLM-L6-v2 model
- Claude model configured as claude-3-7-sonnet-20250219 with an education-focused system prompt
- Always use uv to run the server; do not use pip directly
- Use uv for all dependency management
- Use uv to run Python files
- Always think through and present a detailed plan, and ask for permission, before changing or editing files
28 changes: 28 additions & 0 deletions backend-tool-refactor.md
@@ -0,0 +1,28 @@
Refactor @backend/ai_generator.py to support sequential tool calling where Claude can make up to 2 tool calls in separate API rounds.

Current behavior:
- Claude makes 1 tool call → tools are removed from API params → final response
- If Claude wants another tool call after seeing results, it can't (gets empty response)

Desired behavior:
- Each tool call should be a separate API request where Claude can reason about previous results
- Support for complex queries requiring multiple searches for comparisons, multi-part questions, or when information from different courses/lessons is needed

Example flow:
1. User: "Search for a course that discusses the same topic as lesson 4 of course X"
2. Claude: get course outline for course X → gets title of lesson 4
3. Claude: uses the title to search for a course that discusses the same topic → returns course information
4. Claude: provides complete answer

Requirements:
- Maximum 2 sequential rounds per user query
- Terminate when: (a) 2 rounds completed, (b) Claude's response has no tool_use blocks, or (c) tool call fails
- Preserve conversation context between rounds
- Handle tool execution errors gracefully

Notes:
- Update the system prompt in @backend/ai_generator.py
- Update the test @backend/tests/test_ai_generator.py
- Write tests that verify the external behavior (API calls made, tools executed, results returned) rather than internal state details.

Use two parallel subagents to brainstorm possible plans. Do not implement any code.
165 changes: 111 additions & 54 deletions backend/ai_generator.py
@@ -5,21 +5,30 @@ class AIGenerator:
"""Handles interactions with Anthropic's Claude API for generating responses"""

# Static system prompt to avoid rebuilding on each call
SYSTEM_PROMPT = """ You are an AI assistant specialized in course materials and educational content with access to a comprehensive search tool for course information.
SYSTEM_PROMPT = """ You are an AI assistant specialized in course materials and educational content with access to comprehensive tools for course information.

Search Tool Usage:
- Use the search tool **only** for questions about specific course content or detailed educational materials
- **One search per query maximum**
- Synthesize search results into accurate, fact-based responses
- If search yields no results, state this clearly without offering alternatives
Tool Usage Guidelines:
- **Content Search Tool**: Use for questions about specific course content or detailed educational materials
- **Course Outline Tool**: Use for questions about course structure, lesson lists, course overviews, or when users ask "what's in this course"
- **Sequential Tool Calling**: You can make multiple tool calls across up to 2 rounds of interaction to gather comprehensive information
- **Round 1**: Use tools to gather initial information
- **Round 2**: Use additional tools if needed to gather more context, compare information, or clarify details
- **Reasoning**: After each tool call, analyze results and determine if additional information is needed for a complete answer
- Synthesize all tool results into accurate, fact-based responses
- If tools yield no results, state this clearly without offering alternatives

Response Protocol:
- **General knowledge questions**: Answer using existing knowledge without searching
- **Course-specific questions**: Search first, then answer
- **General knowledge questions**: Answer using existing knowledge without using tools
- **Course content questions**: Use content search tool first, then answer
- **Course outline/structure questions**: Use outline tool first, then answer
- **No meta-commentary**:
- Provide direct answers only — no reasoning process, search explanations, or question-type analysis
- Do not mention "based on the search results"
- Provide direct answers only — no reasoning process, tool explanations, or question-type analysis
- Do not mention "based on the search results" or "based on the outline"

For outline queries, always include:
- Course title and link
- Course instructor
- Complete lesson list with numbers and titles

All responses must be:
1. **Brief, concise, and focused** - Get to the point quickly
@@ -43,15 +52,17 @@ def __init__(self, api_key: str, model: str):
def generate_response(self, query: str,
conversation_history: Optional[str] = None,
tools: Optional[List] = None,
tool_manager=None) -> str:
tool_manager=None,
max_rounds: int = 2) -> str:
"""
Generate AI response with optional tool usage and conversation context.
Generate AI response with sequential tool usage support and conversation context.

Args:
query: The user's question or request
conversation_history: Previous messages for context
tools: Available tools the AI can use
tool_manager: Manager to execute tools
max_rounds: Maximum sequential tool calls (default: 2)

Returns:
Generated response as string
@@ -64,31 +75,94 @@ def generate_response(self, query: str,
else self.SYSTEM_PROMPT
)

# Prepare API call parameters efficiently
api_params = {
# Start with the original user query
current_messages = [{"role": "user", "content": query}]

# Sequential tool calling loop
for round_num in range(max_rounds):
# Prepare API call parameters
api_params = {
**self.base_params,
"messages": current_messages.copy(),
"system": system_content
}

# Add tools if available
if tools:
api_params["tools"] = tools
api_params["tool_choice"] = {"type": "auto"}

# Get response from Claude
response = self.client.messages.create(**api_params)

# If no tool use, we're done
if response.stop_reason != "tool_use" or not tool_manager:
return response.content[0].text

# Handle tool execution and update messages
current_messages = self._handle_tool_execution_sequential(
response, current_messages, tool_manager
)

# If tool execution failed, return error message
if current_messages is None:
return "I encountered an error while processing your request."

# If we've completed max rounds with tools, make final call without tools
final_params = {
**self.base_params,
"messages": [{"role": "user", "content": query}],
"messages": current_messages,
"system": system_content
}

# Add tools if available
if tools:
api_params["tools"] = tools
api_params["tool_choice"] = {"type": "auto"}

# Get response from Claude
response = self.client.messages.create(**api_params)

# Handle tool execution if needed
if response.stop_reason == "tool_use" and tool_manager:
return self._handle_tool_execution(response, api_params, tool_manager)
final_response = self.client.messages.create(**final_params)
return final_response.content[0].text

def _handle_tool_execution_sequential(self, response, messages: List, tool_manager):
"""
Handle tool execution for sequential calling and return updated messages.

# Return direct response
return response.content[0].text
Args:
response: The response containing tool use requests
messages: Current message history
tool_manager: Manager to execute tools

Returns:
Updated messages list or None if tool execution fails
"""
try:
# Add AI's tool use response to messages
messages.append({"role": "assistant", "content": response.content})

# Execute all tool calls and collect results
tool_results = []
for content_block in response.content:
if content_block.type == "tool_use":
tool_result = tool_manager.execute_tool(
content_block.name,
**content_block.input
)

tool_results.append({
"type": "tool_result",
"tool_use_id": content_block.id,
"content": tool_result
})

# Add tool results as user message
if tool_results:
messages.append({"role": "user", "content": tool_results})

return messages

except Exception as e:
# Log error and return None to indicate failure
print(f"Tool execution error: {e}")
return None

def _handle_tool_execution(self, initial_response, base_params: Dict[str, Any], tool_manager):
"""
Handle execution of tool calls and get follow-up response.
Original single tool execution method - kept for backward compatibility.

Args:
initial_response: The response containing tool use requests
@@ -98,38 +172,21 @@ def _handle_tool_execution(self, initial_response, base_params: Dict[str, Any],
Returns:
Final response text after tool execution
"""
# Start with existing messages
# Use the sequential method but return just the final response
messages = base_params["messages"].copy()
updated_messages = self._handle_tool_execution_sequential(
initial_response, messages, tool_manager
)

# Add AI's tool use response
messages.append({"role": "assistant", "content": initial_response.content})

# Execute all tool calls and collect results
tool_results = []
for content_block in initial_response.content:
if content_block.type == "tool_use":
tool_result = tool_manager.execute_tool(
content_block.name,
**content_block.input
)

tool_results.append({
"type": "tool_result",
"tool_use_id": content_block.id,
"content": tool_result
})

# Add tool results as single message
if tool_results:
messages.append({"role": "user", "content": tool_results})
if updated_messages is None:
return "I encountered an error while processing your request."

# Prepare final API call without tools
# Make final call to get response
final_params = {
**self.base_params,
"messages": messages,
"messages": updated_messages,
"system": base_params["system"]
}

# Get final response
final_response = self.client.messages.create(**final_params)
return final_response.content[0].text