Merged
37 changes: 37 additions & 0 deletions openhands-sdk/openhands/sdk/llm/message.py
@@ -269,6 +269,14 @@ def to_chat_dict(self) -> dict[str, Any]:

```python
        # Assistant function_call(s)
        if self.role == "assistant" and self.tool_calls:
            message_dict["tool_calls"] = [tc.to_chat_dict() for tc in self.tool_calls]
            if "content" in message_dict:
                normalized_content = self._normalize_tool_call_content(
```
**enyst (Collaborator):**

@OpenHands Read this PR diff carefully. First, I don't really like the name "normalize": it means everything, so nothing. Name this... `remove_empty_content_strings` or `remove_content_if_empty`, so it says what it does.

Second, look at the whole file `message.py` and what we do; maybe it's better to refactor a little bit so we do the removal inside the helper method?

**OpenHands (bot):**
I'm on it! enyst can track my progress at all-hands.dev

**OpenHands (bot):**
Summary of changes applied to address your comment on openhands-sdk/openhands/sdk/llm/message.py:

What I changed

  • Renamed the helper to a clearer, explicit name:
    • _normalize_tool_call_content → _remove_content_if_empty
  • Refactored to perform the removal inside the helper:
    • The helper now mutates message_dict in-place, removing:
      • content if it’s an empty/whitespace-only string
      • empty “text” items from a list; drops content entirely if the list becomes empty
    • Called from to_chat_dict only when role == "assistant" and tool_calls are present, keeping other flows unchanged

Why

  • The new name is explicit (“remove_content_if_empty”) and avoids vague terminology
  • Encapsulating the removal logic in a helper keeps to_chat_dict simpler and centralizes behavior around the tool-calls case

Checklist

  • Rename is explicit and aligned with your suggestion ✔
  • Removal happens inside the helper, called from to_chat_dict ✔
  • Behavior unchanged for non-tool-call messages ✔
  • Changes are minimal and scoped to a single file ✔
  • Pre-commit (ruff format/lint, pycodestyle, pyright) passed on the file ✔
  • Changes pushed to branch ht/support-kimi-k2-thinking ✔

Notes

  • File changed: openhands-sdk/openhands/sdk/llm/message.py
  • If you prefer the alternative name remove_empty_content_strings, I can switch it—let me know your preference.


```python
                    message_dict["content"]
                )
                if normalized_content is None:
                    message_dict.pop("content", None)
                else:
                    message_dict["content"] = normalized_content

        # Tool result (observation) threading
        if self.role == "tool" and self.tool_call_id is not None:
```
@@ -331,6 +339,35 @@ def _list_serializer(self) -> dict[str, Any]:

```python
        # tool call keys are added in to_chat_dict to centralize behavior
        return message_dict
```

```python
    def _normalize_tool_call_content(self, content: Any) -> Any | None:
```
**Collaborator:**
Let's move it to a util function outside the LLM class

Also let's improve the return type annotation, should be str | list[dict[str, str]] | None?

**enyst (Collaborator):**
I see that point, I thought so too, but it's actually in Message... 🤔

More importantly, I seem to recall we've been here before... Now we remove the content if it's empty, for all LLMs. I don't remember exactly, but I have a bad feeling that some LLMs may fail without content 😢
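One defensive shape for that worry would be to gate the removal per provider; purely illustrative, since the SDK has no such provider list and the set below is an assumption:

```python
# Hypothetical guard: only drop empty content for providers known to tolerate
# its absence. The set is a placeholder, not a real compatibility list.
PROVIDERS_TOLERATING_MISSING_CONTENT = {"provider-a", "provider-b"}  # assumption


def maybe_drop_empty_content(message_dict: dict, provider: str) -> dict:
    content = message_dict.get("content")
    if (
        provider in PROVIDERS_TOLERATING_MISSING_CONTENT
        and isinstance(content, str)
        and content.strip() == ""
    ):
        message_dict.pop("content")
    return message_dict
```

For any provider outside the allowlist the (possibly empty) content field is kept as-is.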

**Collaborator:**
ah good point @enyst

> I have a bad feeling that some LLMs may fail without content 😢

In this case, maybe we just want to:

  1. add kimi k2 thinking to the regression tests
  2. and when other LLMs break on this, we fix them and add them to the integration tests as well

**xingyaoww (Collaborator):**
@OpenHands can you add kimi-k2-thinking to the regression test workflow too? Should be litellm_proxy/kimi-k2-thinking

**OpenHands (bot):**
I'm on it! xingyaoww can track my progress at all-hands.dev

**OpenHands (bot):**
Summary

I've successfully added kimi-k2-thinking to the regression test workflow as requested!

Changes Made:

File Modified: .github/workflows/integration-runner.yml

Added a new job configuration to the integration test matrix:

```yaml
- name: Kimi K2 Thinking
  run-suffix: kimi_k2_run
  llm-config:
      model: litellm_proxy/kimi-k2-thinking
```

This configuration follows the same pattern as the existing models (Claude Sonnet 4.5, GPT-5 Mini, and Deepseek Chat) and will now run the regression tests against the kimi-k2-thinking model via the LiteLLM proxy.

Commit Details:

  • Commit: a1d7f6d
  • Message: "Add kimi-k2-thinking to regression test workflow"
  • Branch: ht/support-kimi-k2-thinking (already pushed to remote)

The kimi-k2-thinking model will now be included in:

  • Scheduled nightly integration test runs
  • Integration tests triggered via workflow_dispatch
  • Integration tests triggered when the 'integration-test' label is added to a PR


```python
        """Remove empty text payloads from assistant tool call messages."""
        if isinstance(content, str):
            if content.strip() == "":
                return None
            return content

        if isinstance(content, list):
            normalized: list[Any] = []
            for item in content:
                if not isinstance(item, dict):
                    normalized.append(item)
                    continue

                if item.get("type") == "text":
                    text_value = item.get("text", "")
                    if isinstance(text_value, str):
                        if text_value.strip() == "":
                            continue
                    else:
                        if str(text_value).strip() == "":
                            continue

                normalized.append(item)

            return normalized if normalized else None

        return content
```

```python
    def to_responses_value(self, *, vision_enabled: bool) -> str | list[dict[str, Any]]:
        """Return serialized form.
```
39 changes: 39 additions & 0 deletions tests/sdk/llm/test_message.py
@@ -142,6 +142,45 @@ def test_message_with_tool_calls():
```python
    assert result["tool_calls"][0]["function"]["arguments"] == '{"arg": "value"}'


def test_message_tool_calls_drop_empty_string_content():
    """Assistant tool calls with no text should not include empty content strings."""
    from openhands.sdk.llm.message import Message, MessageToolCall

    tool_call = MessageToolCall(
        id="call_empty",
        name="test_function",
        arguments="{}",
        origin="completion",
    )

    message = Message(role="assistant", content=[], tool_calls=[tool_call])

    result = message.to_chat_dict()
    assert "content" not in result


def test_message_tool_calls_strip_blank_list_content():
    """List-serialized tool call messages should drop blank text content blocks."""
    from openhands.sdk.llm.message import Message, MessageToolCall, TextContent

    tool_call = MessageToolCall(
        id="call_blank_list",
        name="test_function",
        arguments="{}",
        origin="completion",
    )

    message = Message(
        role="assistant",
        content=[TextContent(text="")],
        tool_calls=[tool_call],
        function_calling_enabled=True,
    )

    result = message.to_chat_dict()
    assert "content" not in result


def test_message_from_llm_chat_message_function_role_error():
    """Test Message.from_llm_chat_message with function role raises error."""
    from litellm.types.utils import Message as LiteLLMMessage
```