follow torvalds-mode suggestion: integrate lock with state, and modify state in-place (#40)

xingyaoww · web-flow · commit 595f2d093d66 · 2025-08-28T22:47:38.000+08:00
diff --git a/README.md b/README.md
@@ -1,35 +1,167 @@
-# Prototype for OpenHands V1
-
-This folder contains my tasks of completely refactor [OpenHands](https://github.com/All-Hands-AI/OpenHands) project V0 into the new V1 version. There's a lot of changes, including (non-exhausive):
-
-- Switching from poetry to uv as package manager
-- better dependency management
-  - include `--dev` group for development only
-- stricter pre-commit hooks `.pre-commit-config.yaml` that includes
-  - type check through pyright
-  - linting and formatter with `uv ruff`
-- cleaner architecture for how a tool works and how it is executed
-  - read about how we define tools: [`openhands/core/runtime/tool.py`](openhands/core/runtime/tool.py)
-  - read about how we define schema (input/output) for tools: [`openhands/core/runtime/schema.py`](openhands/core/runtime/schema.py)
-  - read about patterns for how we define an executable tool:
-    - read [openhands/core/runtime/tools/str_replace_editor/impl.py](openhands/core/runtime/tools/str_replace_editor/impl.py) for tool execute_fn
-    - read [openhands/core/runtime/tools/str_replace_editor/definition.py](openhands/core/runtime/tools/str_replace_editor/definition.py) for how do we define a tool
-    - read [openhands/core/runtime/tools/str_replace_editor/__init__.py](openhands/core/runtime/tools/str_replace_editor/__init__.py) for how we define each tool module
-- tools: `str_replace_editor`, `execute_bash`
-- minimal config (OpenHandsConfig, LLMConfig, MCPConfig): `openhands/core/config`
-- core set of LLM (w/o tests): `openhands/core/llm`
-- core set of microagent functionality (w/o full integration):
-  - `openhands/core/context`: redesigned the triggering of microagents w.r.t. agents into the concept of two types context
-    - EnvContext (triggered at the begining of a convo)
-    - MessageContext (triggered at each user message)
-  - `openhands-v1/openhands/core/microagents`: old code from V1 that loads microagents from folders, etc
-- minimal implementation of codeact agent: `openhands-v1/openhands/core/agenthub/codeact_agent`
-- ...
-
-
-**Check hello world example**
+# OpenHands Agent SDK
+
+A clean, modular SDK for building AI agents with OpenHands. This project represents a complete architectural refactor from OpenHands V0, emphasizing simplicity, maintainability, and developer experience.
+
+## Repository Structure
+
+```
+agent-sdk/
+├── .github/
+│   └── workflows/           # CI/CD workflows
+│       ├── precommit.yml   # Pre-commit hook validation
+│       └── tests.yml       # Test execution pipeline
+├── .pre-commit-config.yaml # Pre-commit hooks configuration
+├── Makefile                # Build and development commands
+├── README.md               # This file
+├── pyproject.toml          # Root project configuration
+├── uv.lock                 # Dependency lock file
+├── examples/
+│   └── hello_world.py      # Getting started example
+├── openhands/              # Main SDK packages
+│   ├── core/               # Core SDK functionality
+│   │   ├── agent/          # Agent implementations
+│   │   │   ├── base.py     # Base agent interface
+│   │   │   └── codeact_agent/  # CodeAct agent implementation
+│   │   ├── config/         # Configuration management
+│   │   │   ├── llm_config.py   # LLM configuration
+│   │   │   └── mcp_config.py   # MCP configuration
+│   │   ├── context/        # Context management system
+│   │   │   ├── env_context.py      # Environment context
+│   │   │   ├── message_context.py  # Message context
+│   │   │   ├── history.py          # Conversation history
+│   │   │   ├── manager.py          # Context manager
+│   │   │   ├── prompt.py           # Prompt management
+│   │   │   └── microagents/        # Microagent system
+│   │   ├── conversation/   # Conversation management
+│   │   │   ├── conversation.py # Core conversation logic
+│   │   │   ├── serializer.py   # Conversation serialization
+│   │   │   ├── state.py        # Conversation state
+│   │   │   ├── types.py        # Type definitions
+│   │   │   └── visualizer.py   # Conversation visualization
+│   │   ├── llm/            # LLM integration layer
+│   │   │   ├── llm.py      # Main LLM interface
+│   │   │   ├── message.py  # Message handling
+│   │   │   ├── metadata.py # LLM metadata
+│   │   │   └── utils/      # LLM utilities
+│   │   ├── tool/           # Tool system
+│   │   │   ├── tool.py     # Core tool interface
+│   │   │   ├── schema.py   # Tool schema definitions
+│   │   │   └── builtins/   # Built-in tools
+│   │   ├── utils/          # Core utilities
+│   │   ├── logger.py       # Logging configuration
+│   │   ├── pyproject.toml  # Core package configuration
+│   │   └── tests/          # Unit tests for core
+│   └── tools/              # Tool implementations
+│       ├── execute_bash/   # Bash execution tool
+│       ├── str_replace_editor/  # String replacement editor
+│       ├── utils/          # Tool utilities
+│       ├── pyproject.toml  # Tools package configuration
+│       └── tests/          # Unit tests for tools
+└── tests/                  # Integration tests
+```
+
+## Quick Start
+
+```bash
+# Install dependencies
+make build
+
+# Run hello world example
+uv run python examples/hello_world.py
+
+# Run tests
+uv run pytest
+
+# Run pre-commit hooks
+uv run pre-commit run --all-files
+```
+
+## Development Guidelines
+
+### Core Principles
+
+This project follows principles of simplicity, pragmatism, and maintainability:
+
+1. **Simplicity First**: If it needs more than 3 levels of indentation, redesign it
+2. **No Special Cases**: Good code eliminates edge cases through proper data structure design
+3. **Pragmatic Solutions**: Solve real problems, not imaginary ones
+4. **Never Break Userspace**: Backward compatibility is sacred
+
+### Architecture Overview
+
+The SDK is built around two core packages:
+
+- **`openhands/core`**: Core SDK functionality (agents, LLM, context, conversation)
+- **`openhands/tools`**: Tool implementations (bash execution, file editing)
+
+Each package is independently testable and deployable, with clear separation of concerns.
+
+### Development Workflow
+
+#### 1. Environment Setup
 
 ```bash
+# Initial setup
+make build
+
+# Activate virtual environment (if needed)
+source .venv/bin/activate
+```
+
+#### 2. Code Quality Standards
+
+- **Type Checking**: All code must pass `pyright` type checking
+- **Linting**: Code must pass `ruff` linting and formatting
+- **Testing**: Maintain test coverage for new functionality
+- **Documentation**: Code should be self-documenting; avoid redundant comments
+
+#### 3. Pre-commit Workflow
+
+Before every commit:
+
+```bash
+# Run pre-commit hooks on changed files
+uv run pre-commit run --files <filepath>
+
+# Or run on all files
+uv run pre-commit run --all-files
+```
+
+#### 4. Testing Strategy
+
+**Unit Tests**: Located in package-specific test directories
+- `openhands/core/tests/` - Tests for core functionality
+- `openhands/tools/tests/` - Tests for tool implementations
+
+**Integration Tests**: Located in root `tests/` directory
+- Tests that involve both core and tools packages
+
+**Running Tests**:
+```bash
+# Run all tests
+uv run pytest
+
+# Run specific test file
+uv run pytest openhands/core/tests/tool/test_tool.py
+
+# Run with coverage
+uv run pytest --cov=openhands
+```
+
+#### 5. Package Management
+
+This project uses `uv` for dependency management:
+
+```bash
+# Add a new dependency
+uv add package-name
+
+# Add a development dependency
+uv add --dev package-name
+
+# Update dependencies
+uv lock --upgrade
+
+# Install from lock file
 uv sync
-uv run python examples/hello.py
 ```
diff --git a/openhands/core/agent/base.py b/openhands/core/agent/base.py
@@ -62,13 +62,15 @@ def init_state(
         state: ConversationState,
         initial_user_message: Message | None = None,
         on_event: ConversationCallbackType | None = None,
-    ) -> ConversationState:
+    ) -> None:
         """Initialize the empty conversation state to prepare the agent for user messages.
 
         Typically this involves:
         1. Adding system message
         2. Adding initial user messages with environment context
             (e.g., microagents, current working dir, etc)
+
+        NOTE: state will be mutated in-place.
         """
         raise NotImplementedError("Subclasses must implement this method.")
 
@@ -77,7 +79,7 @@ def step(
         self,
         state: ConversationState,
         on_event: ConversationCallbackType | None = None,
-    ) -> ConversationState:
+    ) -> None:
         """Taking a step in the conversation.
 
         Typically this involves:
@@ -87,5 +89,7 @@ def step(
             LLM calls (role="assistant") and tool results (role="tool")
         4.1 If conversation is finished, set state.agent_finished flag
         4.2 Otherwise, just return, Conversation will kick off the next step
+
+        NOTE: state will be mutated in-place.
         """
         raise NotImplementedError("Subclasses must implement this method.")
diff --git a/openhands/core/agent/codeact_agent/codeact_agent.py b/openhands/core/agent/codeact_agent/codeact_agent.py
@@ -47,7 +47,7 @@ def init_state(
         state: ConversationState,
         initial_user_message: Message | None = None,
         on_event: ConversationCallbackType | None = None,
-    ) -> ConversationState:
+    ) -> None:
+        # TODO(openhands): add a test verifying that init_state actually modifies state in-place
         messages = state.history.messages
         if len(messages) == 0:
@@ -74,13 +74,12 @@ def init_state(
             if self.env_context and self.env_context.activated_microagents:
                 for microagent in self.env_context.activated_microagents:
                     state.history.microagent_activations.append((microagent.name, len(messages) - 1))
-        return state
 
     def step(
         self,
         state: ConversationState,
         on_event: ConversationCallbackType | None = None,
-    ) -> ConversationState:
+    ) -> None:
         # Get LLM Response (Action)
         _messages = self.llm.format_messages_for_llm(state.history.messages)
         logger.debug(f"Sending messages to LLM: {json.dumps(_messages, indent=2)}")
@@ -102,18 +101,21 @@ def step(
             tool_calls = [tool_call for tool_call in message.tool_calls if tool_call.type == "function"]
             assert len(tool_calls) > 0, "LLM returned tool calls but none are of type 'function'"
             for tool_call in tool_calls:
-                state = self._handle_tool_call(tool_call, state, on_event)
+                self._handle_tool_call(tool_call, state, on_event)
         else:
             logger.info("LLM produced a message response - awaits user input")
             state.agent_finished = True
-        return state
 
     def _handle_tool_call(
         self,
         tool_call: ChatCompletionMessageToolCall,
         state: ConversationState,
         on_event: Callable[[Message | ActionBase | ObservationBase], None] | None = None,
-    ) -> ConversationState:
+    ) -> None:
+        """Handle tool calls from the LLM.
+
+        NOTE: state will be mutated in-place.
+        """
         assert tool_call.type == "function"
         tool_name = tool_call.function.name
         assert tool_name is not None, "Tool call must have a name"
@@ -124,7 +126,7 @@ def _handle_tool_call(
             logger.error(err)
             state.history.messages.append(Message(role="user", content=[TextContent(text=err)]))
             state.agent_finished = True
-            return state
+            return
 
         # Validate arguments
         try:
@@ -135,7 +137,7 @@ def _handle_tool_call(
             err = f"Error validating args {tool_call.function.arguments} for tool '{tool.name}': {e}"
             logger.error(err)
             state.history.messages.append(Message(role="tool", name=tool.name, tool_call_id=tool_call.id, content=[TextContent(text=err)]))
-            return state
+            return
 
         # Execute actions!
         if tool.executor is None:
@@ -154,4 +156,3 @@ def _handle_tool_call(
         # Set conversation state
         if tool.name == FinishTool.name:
             state.agent_finished = True
-        return state
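
Reviewer note: the TODO in `init_state` asks for a test of the in-place contract. The following is a minimal sketch of what such a test could check. `ToyState`, `ToyHistory`, and `ToyAgent` are hypothetical stand-ins, not the SDK's real `ConversationState` or `CodeActAgent`; only the shape of the contract (methods return `None` and mutate the object they were given) is taken from this diff.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ToyHistory:
    """Hypothetical stand-in for the SDK's history object."""

    messages: list[str] = field(default_factory=list)


@dataclass
class ToyState:
    """Hypothetical stand-in for ConversationState."""

    history: ToyHistory = field(default_factory=ToyHistory)
    agent_finished: bool = False


class ToyAgent:
    """Hypothetical agent that follows the mutate-in-place contract."""

    def init_state(self, state: ToyState, initial_user_message: str | None = None) -> None:
        # Only seed the system prompt when the history is still empty.
        if not state.history.messages:
            state.history.messages.append("system prompt")
        if initial_user_message is not None:
            state.history.messages.append(initial_user_message)

    def step(self, state: ToyState) -> None:
        state.history.messages.append("assistant reply")
        state.agent_finished = True


def check_in_place_contract() -> ToyState:
    state = ToyState()
    history_before = state.history
    agent = ToyAgent()

    # Both methods return None; callers keep using the object they passed in.
    assert agent.init_state(state, initial_user_message="hello") is None
    assert agent.step(state) is None

    # The nested history object was mutated, not replaced.
    assert state.history is history_before
    return state
```

A real test would construct the actual agent with a mocked LLM and assert the same two properties: a `None` return value and an unchanged object identity for the state and its history.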
diff --git a/openhands/core/conversation/conversation.py b/openhands/core/conversation/conversation.py
@@ -3,7 +3,6 @@
 
 if TYPE_CHECKING:
     from openhands.core.agent import AgentBase
-from threading import RLock
 
 from openhands.core.llm import Message
 from openhands.core.logger import get_logger
@@ -39,27 +38,22 @@ def __init__(
         self.max_iteration_per_run = max_iteration_per_run
 
         self.agent = agent
-        self._agent_initialized = False
 
-        # Guarding the conversation state to prevent multiple
-        # writers modify it at the same time
-        self._lock = RLock()
         self.state = ConversationState()
 
     def send_message(self, message: Message) -> None:
         """Sending messages to the agent."""
-        with self._lock:
-            if not self._agent_initialized:
-                # Prepare initial state
-                self.state = self.agent.init_state(
+        with self.state:
+            if not self.state.agent_initialized:
+                # mutate in place; agent must follow this contract
+                self.agent.init_state(
                     self.state,
                     initial_user_message=message,
                     on_event=self._on_event,
                 )
-                self._agent_initialized = True
+                self.state.agent_initialized = True
             else:
-                messages = self.state.history.messages
-                messages.append(message)
+                self.state.history.messages.append(message)
                 if self._on_event:
                     self._on_event(message)
 
@@ -68,8 +62,14 @@ def run(self) -> None:
         iteration = 0
         while not self.state.agent_finished:
             logger.debug(f"Conversation run iteration {iteration}")
-            with self._lock:
-                self.state = self.agent.step(self.state, on_event=self._on_event)
+            # TODO(openhands): add a test case for the following scenario:
+            # 1. the .run loop is running
+            # 2. .send_message is called from a separate thread
+            # then check whether .send_message is able to execute
+            # BEFORE the .run loop finishes.
+            with self.state:
+                # step must mutate the SAME state object
+                self.agent.step(self.state, on_event=self._on_event)
             iteration += 1
             if iteration >= self.max_iteration_per_run:
                 break
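
Reviewer note: `with self.state:` implies that the state object itself acts as a context manager around a lock. The sketch below shows one way that could look, assuming a reentrant lock like the `RLock` removed from `conversation.py`. `LockedState`, `send_message`, and `demo` are hypothetical names for illustration, and the actual `state.py` implementation may differ.

```python
from __future__ import annotations

import threading


class LockedState:
    """Hypothetical state object that owns its own reentrant lock."""

    def __init__(self) -> None:
        self.messages: list[str] = []
        self.agent_initialized = False
        self.agent_finished = False
        self._lock = threading.RLock()

    def __enter__(self) -> LockedState:
        self._lock.acquire()
        return self

    def __exit__(self, exc_type, exc, tb) -> None:
        # Always release, even if the body raised.
        self._lock.release()


def send_message(state: LockedState, text: str) -> None:
    # Writers take the state's lock before mutating it.
    with state:
        state.messages.append(text)


def demo() -> list[str]:
    state = LockedState()
    with state:
        with state:  # RLock: the owning thread may re-enter
            state.messages.append("step 0")

        worker = threading.Thread(target=send_message, args=(state, "user message"))
        worker.start()
        worker.join(timeout=0.1)
        # The worker cannot finish while this thread holds the lock.
        assert worker.is_alive()
        state.messages.append("step 1")

    # Lock released: the worker can now append its message.
    worker.join()
    return state.messages
```

Because `run` releases the lock at the end of each iteration, a concurrent `send_message` has a chance to acquire it between steps. `RLock` makes no fairness guarantee, though, so the test described in the TODO would still be needed to confirm the behavior in practice.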
diff --git a/openhands/core/conversation/state.py b/openhands/core/conversation/state.py