Conversation

@devin-ai-integration (Contributor)
Summary

Fixes issue #4056, where token usage always returns 0 when using an async streaming crew kickoff.

Root cause: the streaming completion methods (_handle_streaming_completion and _ahandle_streaming_completion) in OpenAICompletion never called _track_token_usage_internal(), unlike the non-streaming methods, which track usage correctly.
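For context on why the new flag matters: with stream_options={"include_usage": True}, the API appends one final chunk whose choices list is empty and whose usage field carries the token counts, so a handler that never inspects that chunk reports zero. A minimal sketch of this SDK behavior (plain openai client, not CrewAI code; the model name is illustrative):

```python
from openai import OpenAI

client = OpenAI()
stream = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello"}],
    stream=True,
    stream_options={"include_usage": True},
)

usage = None
for chunk in stream:
    if chunk.choices:
        print(chunk.choices[0].delta.content or "", end="")
    if chunk.usage is not None:  # only populated on the final chunk
        usage = chunk.usage

print()
if usage is not None:
    print("prompt:", usage.prompt_tokens, "completion:", usage.completion_tokens)
```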

Changes:

  • Add stream_options={"include_usage": True} to the streaming params so the OpenAI API returns usage information in the final chunk
  • Extract and track token usage from the final chunk in both the sync and async streaming paths
  • Extract and track token usage from final_completion in the response_model streaming paths
  • Add an _extract_chunk_token_usage method for ChatCompletionChunk objects (see the sketch after this list)
  • Add 3 unit tests to verify streaming token usage tracking
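A minimal sketch of how these pieces fit together in the sync path. Only _track_token_usage_internal and _extract_chunk_token_usage are named in this PR; the class shape, usage-dict keys, and aggregation logic below are assumptions, not the actual CrewAI implementation:

```python
from openai import OpenAI
from openai.types.chat import ChatCompletionChunk


class OpenAICompletionSketch:
    def __init__(self, client: OpenAI, model: str) -> None:
        self.client = client
        self.model = model
        self.total_usage = {"prompt_tokens": 0, "completion_tokens": 0, "total_tokens": 0}

    def _extract_chunk_token_usage(self, chunk: ChatCompletionChunk) -> dict | None:
        # Usage is only populated on the final chunk when include_usage is set.
        if chunk.usage is None:
            return None
        return {
            "prompt_tokens": chunk.usage.prompt_tokens,
            "completion_tokens": chunk.usage.completion_tokens,
            "total_tokens": chunk.usage.total_tokens,
        }

    def _track_token_usage_internal(self, usage: dict) -> None:
        for key, value in usage.items():
            self.total_usage[key] += value

    def _handle_streaming_completion(self, messages: list[dict]) -> str:
        stream = self.client.chat.completions.create(
            model=self.model,
            messages=messages,
            stream=True,
            stream_options={"include_usage": True},  # the newly added param
        )
        parts: list[str] = []
        for chunk in stream:
            if chunk.choices and chunk.choices[0].delta.content:
                parts.append(chunk.choices[0].delta.content)
            usage = self._extract_chunk_token_usage(chunk)
            if usage is not None:
                self._track_token_usage_internal(usage)  # the previously missing call
        return "".join(parts)
```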

Review & Testing Checklist for Human

  • Manual end-to-end test: run a streaming crew kickoff with stream=True and verify that crew.token_usage, streaming.result.token_usage, and agent._token_process.get_summary() return non-zero values (a sample script follows this list). The unit tests use mocks and don't exercise the real API.
  • Verify OpenAI SDK compatibility: confirm that stream_options={"include_usage": True} works with the installed OpenAI SDK version (~1.83.0); the option is a relatively recent addition to the API.
  • Review the async response_model path (lines 796-835): it uses a different pattern than the sync path, accumulating content and parsing manually rather than calling get_final_completion(). Verify that usage tracking is correct there.
  • Consider other providers: this fix only covers OpenAI; the Anthropic, Gemini, Azure, and Bedrock providers may have the same issue.
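A rough script for the first checklist item. The crew setup is illustrative, the attribute names are taken from the checklist rather than verified, and passing stream=True via LLM is an assumption about where the flag lives:

```python
from crewai import LLM, Agent, Crew, Task

llm = LLM(model="gpt-4o-mini", stream=True)  # streaming enabled, per the checklist
agent = Agent(role="Greeter", goal="Greet the user", backstory="A friendly agent", llm=llm)
task = Task(description="Say hello in one sentence.", expected_output="A greeting.", agent=agent)
crew = Crew(agents=[agent], tasks=[task])

result = crew.kickoff()
print(result.token_usage)  # expect non-zero prompt/completion counts
print(crew.token_usage)    # expect the same counts aggregated at the crew level
```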

Notes

Co-Authored-By: João <joao@crewai.com>
@devin-ai-integration (Contributor, Author)

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

  • Disable automatic comment and CI monitoring
