Conversation

Contributor
Copilot AI commented Nov 7, 2025

Fix SSE streaming format to comply with specification

  • Understand the current streaming implementation and identify issues
  • Fix SSE format in chat.go to use explicit, spec-compliant formatting
  • Fix SSE format in completion.go to use explicit, spec-compliant formatting
  • Ensure consistency across all streaming responses
  • Verify changes with format validation tests
  • Run code quality checks (go vet, gofmt)
  • Address code review feedback for error handling
  • Fix finish_reason to use pointer for proper null handling
  • Remove empty content from initial streaming chunks
  • Ensure proper stream termination with finish_reason
  • Improve code consistency
  • Move common finish reasons to constants
  • Ensure finish_reason is always present (null or string value)

Complete Solution

This PR fixes all streaming response issues causing OpenAI API client failures:

SSE Format Fixes ✅

  1. Explicit SSE formatting: data: {json}\n\n with proper %s specifier
  2. Error message streaming: Proper JSON responses with correct SSE format
  3. Robust error handling: Explicit marshal error checking with fallbacks

OpenAI Compliance Fixes ✅

  1. finish_reason always present: Changed to *string WITHOUT omitempty

    • Streaming chunks: "finish_reason": null (explicitly present)
    • Final chunk: "finish_reason": "stop" (with value)
    • Matches OpenAI API specification exactly
  2. Empty initial chunks: Initial message now only sends role

    • Before: {"delta": {"role": "assistant", "content": ""}}
    • After: {"delta": {"role": "assistant"}}
  3. Proper stream termination: Final chunk has empty delta with finish_reason

    • Before: {"delta": {"content": ""}, "finish_reason": "stop"}
    • After: {"delta": {}, "finish_reason": "stop"} + data: [DONE]

Code Quality Improvements ✅

  1. Constants for finish reasons: Centralized constants prevent typos and improve maintainability
    • FinishReasonStop = "stop"
    • FinishReasonToolCalls = "tool_calls"
    • FinishReasonFunctionCall = "function_call"
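A minimal sketch of such a constants file (the PR's actual constants.go lives in core/http/endpoints/openai and may declare these differently; the values below are the ones listed above):

```go
package main

import "fmt"

// Finish-reason constants as listed above; centralizing them prevents
// typos like "tool_call" vs "tool_calls" slipping into handlers.
const (
	FinishReasonStop         = "stop"
	FinishReasonToolCalls    = "tool_calls"
	FinishReasonFunctionCall = "function_call"
)

func main() {
	fmt.Println(FinishReasonStop, FinishReasonToolCalls, FinishReasonFunctionCall)
}
```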

Streaming Flow (OpenAI Compliant)

// 1. Initial chunk - role with null finish_reason
data: {"choices": [{"index": 0, "delta": {"role": "assistant"}, "finish_reason": null}]}

// 2. Content chunks - content with null finish_reason
data: {"choices": [{"index": 0, "delta": {"content": "Hello"}, "finish_reason": null}]}
data: {"choices": [{"index": 0, "delta": {"content": " world"}, "finish_reason": null}]}

// 3. Final chunk - empty delta with finish_reason value
data: {"choices": [{"index": 0, "delta": {}, "finish_reason": "stop"}]}
data: [DONE]

This matches the OpenAI API specification example:

data: {"id":"chatcmpl-xxx","choices":[{"delta":{"role":"assistant"},"index":0,"finish_reason":null}]}
data: {"id":"chatcmpl-xxx","choices":[{"delta":{"content":"1"},"index":0,"finish_reason":null}]}

Files Modified

  • core/schema/openai.go: Removed omitempty from FinishReason field
  • core/http/endpoints/openai/constants.go: New file with finish reason constants
  • core/http/endpoints/openai/chat.go: Streaming fixes + explicit nil for finish_reason + use constants
  • core/http/endpoints/openai/completion.go: Streaming fixes + explicit nil for finish_reason + use constants
  • core/http/endpoints/openai/realtime.go: Consistency updates + use constants

Validation

✅ Passes go vet and gofmt
✅ No security vulnerabilities (CodeQL)
✅ OpenAI API specification compliant
✅ Backward compatible for non-streaming responses
✅ finish_reason now always present (null or string value)

Fixes #7180



@netlify
netlify bot commented Nov 7, 2025

Deploy Preview for localai ready!

🔨 Latest commit: 22333d3
🔍 Latest deploy log: https://app.netlify.com/projects/localai/deploys/6910fe7e6056e60008217314
😎 Deploy Preview: https://deploy-preview-7182--localai.netlify.app

Copilot AI and others added 3 commits November 7, 2025 19:14
- Replace json.Encoder with json.Marshal for explicit formatting
- Use explicit \n\n for all SSE messages (instead of relying on implicit newlines)
- Change %v to %s format specifier for proper string formatting
- Fix error message streaming to include proper SSE format
- Ensure consistency between chat.go and completion.go endpoints

Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
- Handle json.Marshal errors explicitly in error response paths
- Add fallback simple error message if marshal fails
- Prevents sending 'data: <nil>' on marshal failures
- Addresses code review feedback

Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
Copilot AI changed the title [WIP] Address incongruences in streaming responses Fix SSE streaming format to comply with specification Nov 7, 2025
Copilot AI requested a review from mudler November 7, 2025 19:24
Copilot finished work on behalf of mudler November 7, 2025 19:24
@mudler
Owner

mudler commented Nov 7, 2025

@copilot also try to address these additional points:

  • Empty finish_reason: LocalAI sends "finish_reason": "" instead of "stop" or null
  • Empty initial chunks: Sends empty content chunks that cause streaming clients to terminate early
  • Missing proper stream termination: Doesn't properly signal completion to OpenCode's processor

- Change FinishReason from string to *string in Choice schema
- Streaming chunks now omit finish_reason (null) instead of empty string
- Final chunks properly set finish_reason to "stop", "tool_calls", etc.
- Remove empty content from initial streaming chunks (only send role)
- Final streaming chunk sends empty delta with finish_reason
- Addresses OpenAI API compliance issues causing client failures

Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
Contributor Author

Copilot AI commented Nov 7, 2025

@copilot also try to address these additional points:

  • Empty finish_reason: LocalAI sends "finish_reason": "" instead of "stop" or null
    ...

All three issues have been addressed in commit deefcb0:

  1. Empty finish_reason: Changed schema to use *string with omitempty - streaming chunks now omit the field (null) while final chunks properly set it to "stop"

  2. Empty initial chunks: Initial message now only sends role without content field - removed Content: &textContentToReturn which was empty

  3. Missing proper stream termination: Final chunk now sends empty delta {} with finish_reason, followed by data: [DONE] per OpenAI spec

The streaming format now matches OpenAI's specification:

  • Initial: {"delta": {"role": "assistant"}}
  • Chunks: {"delta": {"content": "text"}} (no finish_reason)
  • Final: {"delta": {}, "finish_reason": "stop"} + data: [DONE]

- Use consistent pattern: declare variable then take address
- Remove inline anonymous function for better readability
- Addresses code review feedback

Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
}
log.Error().Msgf("Stream ended with error: %v", err)

stopReason := "stop"
Owner

@copilot move common finish reasons to constants

Contributor Author

Done in commit 9c0c1ae. Created constants.go with:

  • FinishReasonStop = "stop"
  • FinishReasonToolCalls = "tool_calls"
  • FinishReasonFunctionCall = "function_call"

All string literals replaced with these constants across chat.go, completion.go, and realtime.go.

Copilot AI requested a review from mudler November 7, 2025 20:48
Copilot finished work on behalf of mudler November 7, 2025 20:48
- Create constants.go with FinishReasonStop, FinishReasonToolCalls, FinishReasonFunctionCall
- Replace all string literals with constants in chat.go, completion.go, realtime.go
- Improves code maintainability and prevents typos

Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
Copilot finished work on behalf of mudler November 7, 2025 21:15
@github-actions github-actions bot added the kind/documentation Improvements or additions to documentation label Nov 9, 2025
if marshalErr != nil {
	log.Error().Msgf("Failed to marshal error response: %v", marshalErr)
	// Send a simple error message as fallback
	w.WriteString("data: {\"error\":\"Internal error\"}\n\n")
} else {
	w.WriteString(fmt.Sprintf("data: %s\n\n", respData))
}

Code scanning (gosec) flagged both WriteString calls above with an "Errors unhandled" warning.
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
@mudler mudler changed the title Fix SSE streaming format to comply with specification fix(api): SSE streaming format to comply with specification Nov 9, 2025
@mudler
Owner

mudler commented Nov 9, 2025

@copilot we shouldn't omit finish_reason, it should be either having a value string (e.g. "stop") or null, see this example:

data: {"id":"chatcmpl-blablabla","object":"chat.completion.chunk","created":1684665888,"model":"gpt-4-0314","choices":[{"delta":{"role":"assistant"},"index":0,"finish_reason":null}]}

data: {"id":"chatcmpl-blablabla","object":"chat.completion.chunk","created":1684665888,"model":"gpt-4-0314","choices":[{"delta":{"content":"1"},"index":0,"finish_reason":null}]}

data: {"id":"chatcmpl-blablabla","object":"chat.completion.chunk","created":1684665888,"model":"gpt-4-0314","choices":[{"delta":{"content":" "},"index":0,"finish_reason":null}]}

data: {"id":"chatcmpl-blablabla","object":"chat.completion.chunk","created":1684665888,"model":"gpt-4-0314","choices":[{"delta":{"content":"2"},"index":0,"finish_reason":null}]}

- Remove omitempty from FinishReason field in Choice struct
- Explicitly set FinishReason to nil for all streaming chunks
- Ensures finish_reason appears as null in JSON for streaming chunks
- Final chunks still properly set finish_reason to "stop", "tool_calls", etc.
- Complies with OpenAI API specification example

Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
Contributor Author

Copilot AI commented Nov 9, 2025

@copilot we shouldn't omit finish_reason, it should be either having a value string (e.g. "stop") or null, see this example:

...

Fixed in commit 1bdc171. Removed omitempty from FinishReason field and explicitly set it to nil for all streaming chunks. Now the field always appears with either null or a string value like "stop", matching the OpenAI spec example you provided:

// Streaming chunks now have:
"finish_reason": null

// Final chunk has:
"finish_reason": "stop"

Copilot finished work on behalf of mudler November 9, 2025 20:45
@mudler mudler marked this pull request as ready for review November 9, 2025 20:49
@mudler mudler enabled auto-merge (squash) November 9, 2025 20:50
@mudler mudler requested a review from Copilot November 9, 2025 20:52
Copilot AI left a comment

Pull Request Overview

This PR refactors the FinishReason field in OpenAI API responses to use a pointer type and introduces constants for finish reason values. The changes improve API consistency and error handling in streaming responses.

Key changes:

  • Changed FinishReason from string to *string in the Choice struct for proper OpenAI API compliance
  • Introduced constants (FinishReasonStop, FinishReasonToolCalls, FinishReasonFunctionCall) to replace string literals
  • Improved streaming error handling with structured JSON responses
  • Removed unused imports (bytes) and cleaned up Go module dependencies

Reviewed Changes

Copilot reviewed 7 out of 8 changed files in this pull request and generated 1 comment.

Summary per file:

  • core/schema/openai.go: Changed FinishReason field from string to *string for nullable support
  • core/http/endpoints/openai/constants.go: Added new file with finish reason constants
  • core/http/endpoints/openai/chat.go: Updated to use pointer type and constants; improved streaming error handling
  • core/http/endpoints/openai/completion.go: Updated to use pointer type and constants; improved streaming error handling
  • core/http/endpoints/openai/realtime.go: Updated to use pointer type and constants
  • docs/go.mod: Removed unused Hugo bootstrap dependency
  • docs/go.sum: Cleared checksums for removed dependencies
  • .github/gallery-agent/go.mod: Removed direct yaml.v3 dependency (remains as indirect)


FinishReason: &finishReason,
Index: 0,
Delta: &schema.Message{Content: &textContentToReturn},
Delta: &schema.Message{},
Copilot AI Nov 9, 2025

The final streaming chunk sends an empty Delta message, but should include the content to maintain consistency. According to OpenAI's streaming specification, the final delta with finish_reason can include an empty Delta object, but earlier in the code (line 93, 147) an initial message with role 'assistant' is sent. Consider whether the final chunk should maintain consistency with the initial message format or truly be empty. If this is intentional API behavior change, it may break client expectations.

Suggested change:
- Delta: &schema.Message{},
+ Delta: &schema.Message{Role: "assistant"},

@mudler mudler disabled auto-merge November 9, 2025 21:00
@mudler mudler merged commit 34bc1bd into master Nov 9, 2025
32 checks passed
@mudler mudler deleted the copilot/fix-streaming-responses branch November 9, 2025 21:00
Successfully merging this pull request may close these issues: Address streaming responses incongruences.