Commit e1fc367

docs: Fix typo in model name for gpt-oss-20b (#3920)
1 parent baefdb8 commit e1fc367

File tree

1 file changed: +1, -1 lines changed

packages/backend/src/assets/ai.json

Lines changed: 1 addition & 1 deletion
@@ -292,7 +292,7 @@
   },
   {
     "id": "hf.openai.gpt-oss-20b",
-    "name": "openai/gtp-oss-20b (Unsloth quantization)",
+    "name": "openai/gpt-oss-20b (Unsloth quantization)",
     "description": "\r\n# Welcome to the gpt-oss series, [OpenAI’s open-weight models](https://openai.com/open-models) designed for powerful reasoning, agentic tasks, and versatile developer use cases.\r\n\r\nWe’re releasing two flavors of the open models:\r\n- `gpt-oss-120b` — for production, general purpose, high reasoning use cases that fits into a single H100 GPU (117B parameters with 5.1B active parameters)\r\n- `gpt-oss-20b` — for lower latency, and local or specialized use cases (21B parameters with 3.6B active parameters)\r\n\r\nBoth models were trained on our [harmony response format](https://github.com/openai/harmony) and should only be used with the harmony format as it will not work correctly otherwise.\r\n\r\n> [!NOTE]\r\n> This model card is dedicated to the smaller `gpt-oss-20b` model. Check out [`gpt-oss-120b`](https://huggingface.co/openai/gpt-oss-120b) for the larger model.\r\n\r\n# Highlights\r\n\r\n* **Permissive Apache 2.0 license:** Build freely without copyleft restrictions or patent risk—ideal for experimentation, customization, and commercial deployment.\r\n* **Configurable reasoning effort:** Easily adjust the reasoning effort (low, medium, high) based on your specific use case and latency needs.\r\n* **Full chain-of-thought:** Gain complete access to the model’s reasoning process, facilitating easier debugging and increased trust in outputs. It’s not intended to be shown to end users.\r\n* **Fine-tunable:** Fully customize models to your specific use case through parameter fine-tuning.\r\n* **Agentic capabilities:** Use the models’ native capabilities for function calling, [web browsing](https://github.com/openai/gpt-oss/tree/main?tab=readme-ov-file#browser), [Python code execution](https://github.com/openai/gpt-oss/tree/main?tab=readme-ov-file#python), and Structured Outputs.\r\n* **Native MXFP4 quantization:** The models are trained with native MXFP4 precision for the MoE layer, making `gpt-oss-120b` run on a single H100 GPU and the `gpt-oss-20b` model run within 16GB of memory.\r\n\r\n---\r\n\r\n# Inference examples\r\n\r\n## Transformers\r\nYou can use `gpt-oss-120b` and `gpt-oss-20b` with Transformers. If you use the Transformers chat template, it will automatically apply the [harmony response format](https://github.com/openai/harmony). If you use `model.generate` directly, you need to apply the harmony format manually using the chat template or use our [openai-harmony](https://github.com/openai/harmony) package.\r\n\r\nTo get started, install the necessary dependencies:\r\n```\r\npip install -U transformers kernels torch \r\n```\r\n\r\n```py\r\nfrom transformers import pipeline\r\nimport torch\r\n\r\nmodel_id = \"openai/gpt-oss-20b\"\r\n\r\npipe = pipeline(\r\n \"text-generation\",\r\n model=model_id,\r\n torch_dtype=\"auto\",\r\n device_map=\"auto\",\r\n)\r\n\r\nmessages = [\r\n {\"role\": \"user\", \"content\": \"Explain quantum mechanics clearly and concisely.\"},\r\n]\r\n\r\noutputs = pipe(\r\n messages,\r\n max_new_tokens=256,\r\n)\r\nprint(outputs[0][\"generated_text\"][-1])\r\n```\r\n\r\n## vLLM\r\nvLLM recommends using [uv](https://docs.astral.sh/uv/) for Python dependency management. You can spin up an OpenAI-compatible webserver:\r\n```\r\nuv pip install --pre vllm==0.10.1+gptoss \\\r\n --extra-index-url https://wheels.vllm.ai/gpt-oss/ \\\r\n --extra-index-url https://download.pytorch.org/whl/nightly/cu128 \\\r\n --index-strategy unsafe-best-match\r\n\r\nvllm serve openai/gpt-oss-20b\r\n```\r\n\r\n## PyTorch / Triton\r\nSee [reference implementations](https://github.com/openai/gpt-oss?tab=readme-ov-file#reference-pytorch-implementation).\r\n\r\n## Ollama\r\n```bash\r\n# gpt-oss-20b\r\nollama pull gpt-oss:20b\r\nollama run gpt-oss:20b\r\n```\r\n\r\n## LM Studio\r\n```bash\r\n# gpt-oss-20b\r\nlms get openai/gpt-oss-20b\r\n```\r\n\r\n# Download the model\r\n```bash\r\n# gpt-oss-20b\r\nhuggingface-cli download openai/gpt-oss-20b --include \"original/*\" --local-dir gpt-oss-20b/\npip install gpt-oss\npython -m gpt_oss.chat model/\r\n```\r\n\r\n# Reasoning levels\r\n* **Low:** Fast responses for general dialogue.\r\n* **Medium:** Balanced speed and detail.\r\n* **High:** Deep and detailed analysis.\r\n\r\n# Tool use\r\n* Web browsing (built-in tools)\r\n* Function calling with schemas\r\n* Agentic operations\r\n\r\n# Fine-tuning\r\nThe smaller model `gpt-oss-20b` can be fine-tuned on consumer hardware, larger `gpt-oss-120b` can be fine-tuned on a single H100 node.",
     "registry": "Hugging Face",
     "license": "Apache-2.0",
