Skip to content

Conversation

Copy link
Contributor

Copilot AI commented Nov 12, 2025

Plan: Fix Example.toDict() serialization for dspy.History objects

Problem

Example.toDict() fails to serialize examples containing dspy.History objects because the convert_to_serializable function doesn't handle Pydantic models properly. This causes JSON serialization failures in Evaluate.save_as_json().

Minimal Changes Plan

  • Explore repository and understand the issue
  • Understand current toDict() implementation in Example class
  • Identify that History is a Pydantic model with model_dump() method
  • Run existing tests to establish baseline
  • Update Example.toDict() to handle Pydantic models
  • Update Evaluate._prepare_results_output() to use toDict()
  • Create test for Example with History serialization
  • Create comprehensive tests for save_as_json/csv with History
  • Run tests to verify fix
  • Verify JSON serialization works end-to-end
  • Fix linting issues
  • Run security scan with codeql_checker - NO VULNERABILITIES FOUND
  • Comprehensive manual verification
  • Address reviewer feedback - consolidated tests into existing file
  • Refactored to move serialization logic into merge_dicts
  • Use isinstance(BaseModel) instead of hasattr for type checking
  • Simplified to always use toDict() without fallback to dict()

Changes Made

1. Updated Example.toDict() in dspy/primitives/example.py

Changed from hasattr(value, "model_dump") to isinstance(value, BaseModel) for more explicit and Pythonic type checking:

try:
    from pydantic import BaseModel
except ImportError:
    BaseModel = None

# In convert_to_serializable:
elif BaseModel is not None and isinstance(value, BaseModel):
    # Handle Pydantic models (e.g., dspy.History)
    return value.model_dump()

This is safer and more explicit than using hasattr(), following Python best practices for type checking.

2. Updated merge_dicts() in dspy/evaluate/evaluate.py

Added automatic serialization when objects have toDict() method:

# Convert to dict if objects have toDict method (e.g., Example objects)
if hasattr(d1, "toDict"):
    d1 = d1.toDict()
if hasattr(d2, "toDict"):
    d2 = d2.toDict()

This keeps the serialization logic centralized in one place and makes _prepare_results_output() cleaner.

3. Updated _prepare_results_output() in dspy/evaluate/evaluate.py

Simplified to always use toDict() without conditional checks:

else example.toDict() | {"prediction": prediction, metric_name: score}

Previously had: (example.toDict() if hasattr(example, "toDict") else dict(example))

Since Example always has toDict() method, the conditional check was unnecessary. This simplification makes the code cleaner and ensures consistent serialization behavior.

4. Added comprehensive tests

  • test_example_to_dict_with_history in tests/primitives/test_example.py - Tests basic History serialization
  • test_evaluate_save_as_json_with_history in tests/evaluate/test_evaluate.py - Tests end-to-end JSON export
  • test_evaluate_save_as_csv_with_history in tests/evaluate/test_evaluate.py - Tests CSV export with History

Testing Results

✅ All 18 tests in tests/primitives/test_example.py pass
✅ All 34 tests in tests/evaluate/test_evaluate.py pass
✅ All linting checks pass (ruff)
✅ Comprehensive manual verification passed
✅ No security vulnerabilities found (CodeQL)

Security Summary

CodeQL analysis completed with 0 alerts for Python code. No security vulnerabilities were found in the changes.

Reviewer Feedback Addressed

  1. Consolidated History-related tests into existing test_evaluate.py file for better code organization
  2. Moved serialization logic to merge_dicts() function for cleaner, more maintainable code
  3. Changed from hasattr(value, "model_dump") to isinstance(value, BaseModel) for more explicit and Pythonic type checking
  4. Simplified _prepare_results_output() to always use toDict() without conditional fallback, making the code cleaner and ensuring toDict() serialization is always used
Original prompt

This section details on the original issue you should resolve

<issue_title>[Bug] Example.toDict fails to serialize examples with dspy.History</issue_title>
<issue_description>### What happened?

I tried to Json serialize an Example dictionary, but it failed. The Example.toDict does not convert the History to anything.
I tried this in the save_as_json method in Evaluate.

Steps to reproduce

use the save_as_json in the Evaluate function with an Example that contains a dspy.History object.

DSPy version

3.0.4</issue_description>

Comments on the Issue (you are @copilot in this section)


✨ Let Copilot coding agent set things up for you — coding agent works faster and does higher quality work when set up for your repo.

Co-authored-by: TomeHirata <33407409+TomeHirata@users.noreply.github.com>
Copilot AI changed the title [WIP] Fix Example.toDict serialization issue with dspy.History fix(dspy): Example.toDict() fails to serialize dspy.History objects Nov 12, 2025
Copilot AI requested a review from TomeHirata November 12, 2025 04:20
Copilot finished work on behalf of TomeHirata November 12, 2025 04:20
@crawftv
Copy link
Contributor

crawftv commented Nov 12, 2025

The LLM beat me to it.

return [
(
merge_dicts(example, prediction) | {metric_name: score}
merge_dicts(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@copilot do we still need this change after adding changes to Example?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we need both changes. While Example.toDict() now serializes History correctly, dict(example) and example.items() still return raw History objects. The original code used merge_dicts(example, prediction) which calls .items() internally, so it would still fail JSON serialization without this change.

Without the evaluate.py change: merge_dicts(example, prediction) → fails with "Object of type History is not JSON serializable"

With both changes: merge_dicts(example.toDict(), prediction) → succeeds because History is already serialized to dict

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@copilot I see, can we change merge_dict to call toDict if needed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in ebb502c. Moved the toDict() call into merge_dicts(), which now automatically serializes objects with toDict() method. This keeps the serialization logic centralized and makes _prepare_results_output() cleaner - it now passes Example objects directly to merge_dicts().

from dspy.utils.dummies import DummyLM


def test_evaluate_save_as_json_with_history():
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@copilot Can we add tests to existing test_evaluate.py?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in c67ba33. Moved both History tests into test_evaluate.py and removed the separate test_evaluate_with_history.py file. All 34 tests pass.

Co-authored-by: TomeHirata <33407409+TomeHirata@users.noreply.github.com>
Copilot finished work on behalf of TomeHirata November 13, 2025 04:17
@TomeHirata TomeHirata marked this pull request as ready for review November 14, 2025 06:38
Co-authored-by: TomeHirata <33407409+TomeHirata@users.noreply.github.com>
Copilot finished work on behalf of TomeHirata November 14, 2025 06:46
def convert_to_serializable(value):
if hasattr(value, "toDict"):
return value.toDict()
elif hasattr(value, "model_dump"):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@copilot can we use isinstance?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in 2a82f8d. Changed from hasattr(value, "model_dump") to isinstance(value, BaseModel) for more explicit type checking. Also added safe import handling with BaseModel = None fallback if pydantic is not available.

merge_dicts(example, prediction) | {metric_name: score}
if prediction_is_dictlike(prediction)
else dict(example) | {"prediction": prediction, metric_name: score}
else (example.toDict() if hasattr(example, "toDict") else dict(example)) | {"prediction": prediction, metric_name: score}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@copilot is it possible to call toDict when dict() is called?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in 1d56c29. Simplified the code to always use example.toDict() instead of the conditional (example.toDict() if hasattr(example, "toDict") else dict(example)). Since Example always has the toDict() method, the conditional was unnecessary. This ensures consistent serialization behavior and makes the code cleaner.

Co-authored-by: TomeHirata <33407409+TomeHirata@users.noreply.github.com>
Copilot finished work on behalf of TomeHirata November 14, 2025 07:03
Copilot AI requested a review from TomeHirata November 14, 2025 07:03
Co-authored-by: TomeHirata <33407409+TomeHirata@users.noreply.github.com>
Copilot finished work on behalf of TomeHirata November 14, 2025 07:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug] Example.toDict fails to serialize examples with dspy.History

3 participants