@@ -137,8 +137,18 @@ def create(
 
               [See more information about frequency and presence penalties.](https://platform.openai.com/docs/guides/gpt/parameter-details)
 
-          response_format: An object specifying the format that the model must output. Used to enable JSON
-              mode.
+          response_format: An object specifying the format that the model must output.
+
+              Setting to `{ "type": "json_object" }` enables JSON mode, which guarantees the
+              message the model generates is valid JSON.
+
+              **Important:** when using JSON mode, you **must** also instruct the model to
+              produce JSON yourself via a system or user message. Without this, the model may
+              generate an unending stream of whitespace until the generation reaches the token
+              limit, resulting in increased latency and the appearance of a "stuck" request. Also
+              note that the message content may be partially cut off if
+              `finish_reason="length"`, which indicates the generation exceeded `max_tokens`
+              or the conversation exceeded the max context length.
 
           seed: This feature is in Beta. If specified, our system will make a best effort to
               sample deterministically, such that repeated requests with the same `seed` and
@@ -304,8 +314,18 @@ def create(
 
               [See more information about frequency and presence penalties.](https://platform.openai.com/docs/guides/gpt/parameter-details)
 
-          response_format: An object specifying the format that the model must output. Used to enable JSON
-              mode.
+          response_format: An object specifying the format that the model must output.
+
+              Setting to `{ "type": "json_object" }` enables JSON mode, which guarantees the
+              message the model generates is valid JSON.
+
+              **Important:** when using JSON mode, you **must** also instruct the model to
+              produce JSON yourself via a system or user message. Without this, the model may
+              generate an unending stream of whitespace until the generation reaches the token
+              limit, resulting in increased latency and the appearance of a "stuck" request. Also
+              note that the message content may be partially cut off if
+              `finish_reason="length"`, which indicates the generation exceeded `max_tokens`
+              or the conversation exceeded the max context length.
 
           seed: This feature is in Beta. If specified, our system will make a best effort to
               sample deterministically, such that repeated requests with the same `seed` and
@@ -464,8 +484,18 @@ def create(
 
               [See more information about frequency and presence penalties.](https://platform.openai.com/docs/guides/gpt/parameter-details)
 
-          response_format: An object specifying the format that the model must output. Used to enable JSON
-              mode.
+          response_format: An object specifying the format that the model must output.
+
+              Setting to `{ "type": "json_object" }` enables JSON mode, which guarantees the
+              message the model generates is valid JSON.
+
+              **Important:** when using JSON mode, you **must** also instruct the model to
+              produce JSON yourself via a system or user message. Without this, the model may
+              generate an unending stream of whitespace until the generation reaches the token
+              limit, resulting in increased latency and the appearance of a "stuck" request. Also
+              note that the message content may be partially cut off if
+              `finish_reason="length"`, which indicates the generation exceeded `max_tokens`
+              or the conversation exceeded the max context length.
 
           seed: This feature is in Beta. If specified, our system will make a best effort to
               sample deterministically, such that repeated requests with the same `seed` and
@@ -704,8 +734,18 @@ async def create(
 
               [See more information about frequency and presence penalties.](https://platform.openai.com/docs/guides/gpt/parameter-details)
 
-          response_format: An object specifying the format that the model must output. Used to enable JSON
-              mode.
+          response_format: An object specifying the format that the model must output.
+
+              Setting to `{ "type": "json_object" }` enables JSON mode, which guarantees the
+              message the model generates is valid JSON.
+
+              **Important:** when using JSON mode, you **must** also instruct the model to
+              produce JSON yourself via a system or user message. Without this, the model may
+              generate an unending stream of whitespace until the generation reaches the token
+              limit, resulting in increased latency and the appearance of a "stuck" request. Also
+              note that the message content may be partially cut off if
+              `finish_reason="length"`, which indicates the generation exceeded `max_tokens`
+              or the conversation exceeded the max context length.
 
           seed: This feature is in Beta. If specified, our system will make a best effort to
               sample deterministically, such that repeated requests with the same `seed` and
@@ -871,8 +911,18 @@ async def create(
 
               [See more information about frequency and presence penalties.](https://platform.openai.com/docs/guides/gpt/parameter-details)
 
-          response_format: An object specifying the format that the model must output. Used to enable JSON
-              mode.
+          response_format: An object specifying the format that the model must output.
+
+              Setting to `{ "type": "json_object" }` enables JSON mode, which guarantees the
+              message the model generates is valid JSON.
+
+              **Important:** when using JSON mode, you **must** also instruct the model to
+              produce JSON yourself via a system or user message. Without this, the model may
+              generate an unending stream of whitespace until the generation reaches the token
+              limit, resulting in increased latency and the appearance of a "stuck" request. Also
+              note that the message content may be partially cut off if
+              `finish_reason="length"`, which indicates the generation exceeded `max_tokens`
+              or the conversation exceeded the max context length.
 
           seed: This feature is in Beta. If specified, our system will make a best effort to
               sample deterministically, such that repeated requests with the same `seed` and
@@ -1031,8 +1081,18 @@ async def create(
 
               [See more information about frequency and presence penalties.](https://platform.openai.com/docs/guides/gpt/parameter-details)
 
-          response_format: An object specifying the format that the model must output. Used to enable JSON
-              mode.
+          response_format: An object specifying the format that the model must output.
+
+              Setting to `{ "type": "json_object" }` enables JSON mode, which guarantees the
+              message the model generates is valid JSON.
+
+              **Important:** when using JSON mode, you **must** also instruct the model to
+              produce JSON yourself via a system or user message. Without this, the model may
+              generate an unending stream of whitespace until the generation reaches the token
+              limit, resulting in increased latency and the appearance of a "stuck" request. Also
+              note that the message content may be partially cut off if
+              `finish_reason="length"`, which indicates the generation exceeded `max_tokens`
+              or the conversation exceeded the max context length.
 
           seed: This feature is in Beta. If specified, our system will make a best effort to
               sample deterministically, such that repeated requests with the same `seed` and
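
Taken together, the new docstring text (repeated here across each `create` overload) implies a small usage contract for JSON mode: opt in via `response_format={"type": "json_object"}`, explicitly instruct the model to produce JSON in a system or user message, and check `finish_reason` before parsing the output. A minimal sketch of that flow against this client; the model name and prompt contents are illustrative assumptions, and any JSON-mode-capable model would do:

```python
import json

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

completion = client.chat.completions.create(
    model="gpt-4-1106-preview",  # assumption: any model with JSON-mode support
    # Per the docstring: JSON mode must be paired with an explicit instruction
    # to produce JSON, or the model may emit whitespace until the token limit.
    messages=[
        {"role": "system", "content": "You are a helpful assistant. Reply in JSON."},
        {"role": "user", "content": "List three primary colors under the key 'colors'."},
    ],
    response_format={"type": "json_object"},
)

choice = completion.choices[0]
# Guard against truncated output: content cut off at `max_tokens` or the max
# context length is reported as finish_reason == "length" and may be invalid JSON.
if choice.finish_reason == "length":
    raise RuntimeError("JSON output was truncated; raise max_tokens or shorten the prompt")

data = json.loads(choice.message.content)
print(data["colors"])
```

Checking `finish_reason` before `json.loads` matters because JSON mode only guarantees well-formed output when generation stops naturally; a `"length"` stop can cut the message off mid-token.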