Update README.md
Browse files
README.md
CHANGED
|
@@ -1,10 +1,44 @@
|
|
| 1 |
---
|
|
|
|
|
|
|
| 2 |
license: apache-2.0
|
| 3 |
pipeline_tag: text-generation
|
| 4 |
library_name: transformers
|
| 5 |
tags:
|
| 6 |
-
-
|
|
|
|
| 7 |
---
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 8 |
|
| 9 |
<p align="center">
|
| 10 |
<img alt="gpt-oss-20b" src="https://raw.githubusercontent.com/openai/gpt-oss/main/docs/gpt-oss-20b.svg">
|
|
@@ -13,7 +47,7 @@ tags:
|
|
| 13 |
<p align="center">
|
| 14 |
<a href="https://gpt-oss.com"><strong>Try gpt-oss</strong></a> ·
|
| 15 |
<a href="https://cookbook.openai.com/topic/gpt-oss"><strong>Guides</strong></a> ·
|
| 16 |
-
<a href="https://openai.com/index/gpt-oss-model-card"><strong>
|
| 17 |
<a href="https://openai.com/index/introducing-gpt-oss/"><strong>OpenAI blog</strong></a>
|
| 18 |
</p>
|
| 19 |
|
|
@@ -21,8 +55,8 @@ tags:
|
|
| 21 |
|
| 22 |
Welcome to the gpt-oss series, [OpenAI’s open-weight models](https://openai.com/open-models) designed for powerful reasoning, agentic tasks, and versatile developer use cases.
|
| 23 |
|
| 24 |
-
We’re releasing two flavors of
|
| 25 |
-
- `gpt-oss-120b` — for production, general purpose, high reasoning use cases that
|
| 26 |
- `gpt-oss-20b` — for lower latency, and local or specialized use cases (21B parameters with 3.6B active parameters)
|
| 27 |
|
| 28 |
Both models were trained on our [harmony response format](https://github.com/openai/harmony) and should only be used with the harmony format as it will not work correctly otherwise.
|
|
|
|
| 1 |
---
|
| 2 |
+
base_model:
|
| 3 |
+
- openai/gpt-oss-20b
|
| 4 |
license: apache-2.0
|
| 5 |
pipeline_tag: text-generation
|
| 6 |
library_name: transformers
|
| 7 |
tags:
|
| 8 |
+
- openai
|
| 9 |
+
- unsloth
|
| 10 |
---
|
| 11 |
+
<div>
|
| 12 |
+
<p style="margin-bottom: 0; margin-top: 0;">
|
| 13 |
+
<strong>See <a href="https://huggingface.co/collections/unsloth/gpt-oss-6892433695ce0dee42f31681">our collection</a> for all versions of gpt-oss including GGUF, 4-bit & 16-bit formats.</strong>
|
| 14 |
+
</p>
|
| 15 |
+
<p style="margin-bottom: 0;">
|
| 16 |
+
<em>Learn to run gpt-oss correctly - <a href="https://docs.unsloth.ai/basics/gpt-oss">Read our Guide</a>.</em>
|
| 17 |
+
</p>
|
| 18 |
+
<p style="margin-top: 0;margin-bottom: 0;">
|
| 19 |
+
<em>See <a href="https://docs.unsloth.ai/basics/unsloth-dynamic-v2.0-gguf">Unsloth Dynamic 2.0 GGUFs</a> for our quantization benchmarks.</em>
|
| 20 |
+
</p>
|
| 21 |
+
<div style="display: flex; gap: 5px; align-items: center; ">
|
| 22 |
+
<a href="https://github.com/unslothai/unsloth/">
|
| 23 |
+
<img src="https://github.com/unslothai/unsloth/raw/main/images/unsloth%20new%20logo.png" width="133">
|
| 24 |
+
</a>
|
| 25 |
+
<a href="https://discord.gg/unsloth">
|
| 26 |
+
<img src="https://github.com/unslothai/unsloth/raw/main/images/Discord%20button.png" width="173">
|
| 27 |
+
</a>
|
| 28 |
+
<a href="https://docs.unsloth.ai/basics/gpt-oss">
|
| 29 |
+
<img src="https://raw.githubusercontent.com/unslothai/unsloth/refs/heads/main/images/documentation%20green%20button.png" width="143">
|
| 30 |
+
</a>
|
| 31 |
+
</div>
|
| 32 |
+
<h1 style="margin-top: 0rem;">✨ Read our gpt-oss Guide <a href="https://docs.unsloth.ai/basics/gpt-oss">here</a>!</h1>
|
| 33 |
+
</div>
|
| 34 |
+
|
| 35 |
+
- Read our Blog about gpt-oss support: [unsloth.ai/blog/gpt-oss](https://unsloth.ai/blog/gpt-oss)
|
| 36 |
+
- View the rest of our notebooks in our [docs here](https://docs.unsloth.ai/get-started/unsloth-notebooks).
|
| 37 |
+
- Thank you to the [llama.cpp](https://github.com/ggml-org/llama.cpp) team for their work on supporting this model. We wouldn't be able to release quants without them!
|
| 38 |
+
|
| 39 |
+
The F32 quant is MXFP4 upcasted to BF16 for every single layer and is unquantized.
|
| 40 |
+
|
| 41 |
+
# gpt-oss-20b Details
|
| 42 |
|
| 43 |
<p align="center">
|
| 44 |
<img alt="gpt-oss-20b" src="https://raw.githubusercontent.com/openai/gpt-oss/main/docs/gpt-oss-20b.svg">
|
|
|
|
| 47 |
<p align="center">
|
| 48 |
<a href="https://gpt-oss.com"><strong>Try gpt-oss</strong></a> ·
|
| 49 |
<a href="https://cookbook.openai.com/topic/gpt-oss"><strong>Guides</strong></a> ·
|
| 50 |
+
<a href="https://openai.com/index/gpt-oss-model-card"><strong>System card</strong></a> ·
|
| 51 |
<a href="https://openai.com/index/introducing-gpt-oss/"><strong>OpenAI blog</strong></a>
|
| 52 |
</p>
|
| 53 |
|
|
|
|
| 55 |
|
| 56 |
Welcome to the gpt-oss series, [OpenAI’s open-weight models](https://openai.com/open-models) designed for powerful reasoning, agentic tasks, and versatile developer use cases.
|
| 57 |
|
| 58 |
+
We’re releasing two flavors of the open models:
|
| 59 |
+
- `gpt-oss-120b` — for production, general purpose, high reasoning use cases that fits into a single H100 GPU (117B parameters with 5.1B active parameters)
|
| 60 |
- `gpt-oss-20b` — for lower latency, and local or specialized use cases (21B parameters with 3.6B active parameters)
|
| 61 |
|
| 62 |
Both models were trained on our [harmony response format](https://github.com/openai/harmony) and should only be used with the harmony format as it will not work correctly otherwise.
|