grammar

Files changed:
- app/dist/index.html (+0 -0)
- app/dist/index.html.gz (+2 -2)
- app/src/content/article.mdx (+24 -25)
app/dist/index.html
CHANGED

The diff for this file is too large to render. See raw diff.
app/dist/index.html.gz
CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:…
-size …
+oid sha256:e6ba9616dba9dfa5cc9ef3815f320ccb912d7e70851a5a4e7888917e1eebce23
+size 64010
app/src/content/article.mdx
CHANGED
@@ -47,13 +47,15 @@ import modelDebugger from "./assets/image/model_debugger.png";
 
 ## Preface
 
-One million lines of `…
+One million lines of `Python` code. Through them, the [`transformers`](https://github.com/huggingface/transformers) library supports more than 400 model architectures, from state-of-the-art LLMs and VLMs to specialized models for audio, video, and tables.
 
-Built on `PyTorch`,…
+Built on `PyTorch`, it's a foundational tool for modern LLM usage, research, education, and tens of thousands of other open-source projects. Each AI model is added by the community, harmonized into a consistent interface, and tested daily on a CI to ensure reproducibility.
 
 This scale presents a monumental engineering challenge.
 
-How do you keep such a ship afloat, made of so many moving, unrelated parts, contributed to by a buzzing hivemind? Especially as the pace of ML research accelerates?
+How do you keep such a ship afloat, made of so many moving, unrelated parts, contributed to by a buzzing hivemind? Especially as the pace of ML research accelerates?
+
+We receive constant feedback on everything from function signatures with hundreds of arguments to duplicated code and optimization concerns, and we listen to all of it, or try to. The library's usage keeps on growing, and we are a small team of maintainers and contributors, backed by hundreds of open-source community members.
 We continue to support all new models and expect to do so for the foreseeable future.
 
 This post dissects the design philosophy that makes this possible. It's the result of an evolution from our older principles, detailed on our previous [philosophy](https://huggingface.co/docs/transformers/en/philosophy) page, as well as its accompanying [blog post from 2022](https://huggingface.co/blog/transformers-design-philosophy). More recently (and we strongly recommend the read) we published a blog post about [recent upgrades to transformers](https://huggingface.co/blog/faster-transformers), focusing on what makes the library faster today. All of these developments are only made possible thanks to these principles.
@@ -88,26 +90,27 @@ These principles were not decided in a vacuum. The library _evolved_ towards the
 <li class="tenet">
 <a id="source-of-truth"></a>
 <strong>Source of Truth</strong>
-<p>We aim…
+<p>We aim to be the [source of truth for all model definitions](https://huggingface.co/blog/transformers-model-definition). This is not a tenet, but something that guides our decisions. Model implementations should be reliable, reproducible, and faithful to the original performance.</p>
-<em>This overarching guideline ensures quality and reproducibility across all models in the library…
+<em>This overarching guideline ensures quality and reproducibility across all models in the library.</em>
 </li>
 
 <li class="tenet">
 <a id="one-model-one-file"></a>
 <strong>One Model, One File</strong>
 <p>All inference and training core logic has to be visible, top‑to‑bottom, to maximize each model's hackability.</p>
-<em>Every model should be…
+<em>Every model should be understandable and hackable by reading a single file from top to bottom.</em>
 </li>
+
 <li class="tenet">
 <a id="code-is-product"></a>
-<strong>Code is Product</strong>
+<strong>Code is the Product</strong>
-<p>Optimize for reading,…
+<p>Optimize for reading, diffing, and tweaking: our users are power users. Variables can be explicit, full words, even several words; readability is paramount.</p>
 <em>Code quality matters as much as functionality - optimize for human readers, not just computers.</em>
 </li>
 <li class="tenet">
 <a id="standardize-dont-abstract"></a>
 <strong>Standardize, Don't Abstract</strong>
-<p>If it's model behavior, keep it in the file;…
+<p>If it's model behavior, keep it in the file; abstractions are only for generic infra.</p>
 <em>Model-specific logic belongs in the model file, not hidden behind abstractions.</em>
 </li>
 <li class="tenet">
@@ -121,14 +124,14 @@ These principles were not decided in a vacuum. The library _evolved_ towards the
 <li class="tenet">
 <a id="minimal-user-api"></a>
 <strong>Minimal User API</strong>
-<p>Config, model,…
+<p>Config, model, pre-processing; from_pretrained, save_pretrained, push_to_hub. We want as few codepaths as possible. Reading should be obvious; configurations should be obvious.</p>
 <em>Keep the public interface simple and predictable; users should know what to expect.</em>
 </li>
 <li class="tenet">
 <a id="backwards-compatibility"></a>
 <strong>Backwards Compatibility</strong>
 <p>Evolve by additive standardization, never break public APIs.</p>
-<p>Any artifact that was once on the hub and…
+<p>Any artifact that was once on the hub and worked with transformers should be usable indefinitely with the same interface. Further, public methods should not change, to avoid breaking dependencies.</p>
 <em>Once something is public, it stays public; evolution through addition, not breaking changes.</em>
 </li>
 <li class="tenet">
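
To make the "Minimal User API" tenet concrete: the handful of entry points it names cover loading, running, saving, and sharing. A minimal sketch using the standard `transformers` Auto classes (the checkpoint id and repo name are illustrative):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "meta-llama/Llama-2-7b-hf"  # illustrative checkpoint id

# One predictable code path: config + weights resolved by from_pretrained.
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint)

inputs = tokenizer("The design philosophy of transformers is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

# The same small surface mirrors back out: save locally, or share on the Hub.
model.save_pretrained("./my-model")
# model.push_to_hub("username/my-model")  # hypothetical repo id
```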
@@ -158,15 +161,15 @@ def rotate_half(x):
 ```
 
 
-We want all models to have self-contained modeling code.
+We want all models to have self-contained modeling code.
 
-…
+Each core functionality _must_ be in the modeling code; every non-core functionality _can_ live outside of it.
 
-But the LOC count kept creeping up. Each new model copied over hundreds of lines that we considered largely boilerplate, yet, we could not remove them.
+This comes at a great cost. Enter the `#Copied from...` mechanism: for a long time, these comments indicated that some code was copied from another model, saving time both for the reviewers and for the CI. But the LOC count kept creeping up. Each new model copied over hundreds of lines that we considered largely boilerplate, yet we could not remove them.
 
-We…
+We need to separate the two principles that were so far intertwined: <Tenet term="do-repeat-yourself" display="repetition" position="top" /> and <Tenet term="one-model-one-file" display="hackability" position="top" />.
 
-What…
+What's the solution to this?
 
 <Note variant="info">
 <strong>TL;DR:</strong> Read the code in one place, <Tenet term="one-model-one-file" display="one model, one file." position="top" />. Keep semantics local (<a href="#standardize-dont-abstract">Standardize, Don't Abstract</a>). Allow strategic duplication for end users (<a href="#do-repeat-yourself">DRY*</a>). Keep the public surface minimal and stable (<a href="#minimal-user-api">Minimal API</a>, <a href="#backwards-compatibility">Backwards Compatibility</a>, <a href="#consistent-public-surface">Consistent Surface</a>).
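
For readers unfamiliar with the mechanism, here is what such an annotation looks like in a modeling file, sketched on the `rotate_half` helper referenced by the hunk above (the source path in the comment follows the conventional Llama one; exact paths vary per model):

```python
import torch

# Copied from transformers.models.llama.modeling_llama.rotate_half
def rotate_half(x):
    """Rotates half the hidden dims of the input."""
    x1 = x[..., : x.shape[-1] // 2]
    x2 = x[..., x.shape[-1] // 2 :]
    return torch.cat((-x2, x1), dim=-1)
```

A consistency check in CI could then verify that each such copy stays in sync with its source, which is how hundreds of duplicated lines remained reviewable before a better answer existed.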
@@ -486,7 +489,7 @@ Parallelization is specified in the configuration (<code>tp_plan</code>), not the
 
 ### <a id="layers-attentions-caches"></a> Layers, attentions and caches
 
-Following the same logic, the _nature_ of attention and per…
+Following the same logic, the _nature_ of attention and caching per layer of a model should not be hardcoded. We should be able to specify, in a configuration-based fashion, how each layer is implemented. Thus we define a mapping that can then be used:
 
 
 ```python
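
To illustrate the direction (a sketch with illustrative names, not the library's exact schema): the configuration carries one entry per layer, and the model resolves each entry through a mapping rather than hardcoding a single attention or cache type.

```python
# Illustrative sketch of configuration-driven per-layer behavior.
def full_attention_forward(query, key, value):
    ...  # global attention over the whole sequence

def sliding_window_attention_forward(query, key, value):
    ...  # local attention over a fixed window

ATTENTION_FUNCTIONS = {
    "full_attention": full_attention_forward,
    "sliding_attention": sliding_window_attention_forward,
}

# A config could carry one entry per layer, e.g.:
layer_types = ["sliding_attention", "sliding_attention", "full_attention"]
attention_per_layer = [ATTENTION_FUNCTIONS[t] for t in layer_types]
```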
@@ -594,11 +597,9 @@ Llama-lineage is a hub; several VLMs remain islands — engineering opportunity
 
 ### Many models, but not enough yet, are alike
 
-I…
+Next, I looked into Jaccard similarity, which we use to measure set differences. I know that code is more than a set of characters strung together; I also used code embedding models to check code similarities, and they yielded better results, but for the needs of this blog post I will stick to the Jaccard index.
 
-It is interesting, for…
-…
-Yet, we still have a lot of gaps to fill.
+It is interesting, for that, to look at _when_ we deployed this modular logic and what its rippling effect on the library was. You can check the [larger space](https://huggingface.co/spaces/Molbap/transformers-modular-refactor) to play around, but the gist is: adding modular allowed us to connect more and more models to solid reference points. We still have a lot of gaps to fill.
 
 Zoom out below - it's full of models. You can click on a node to see its connections better, or use the text box to search for a model. You can use the [full viewer](https://huggingface.co/spaces/Molbap/transformers-modular-refactor) (tab "timeline", hit "build timeline") for better exploration.
 
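
For reference, the measure itself is simple: a minimal sketch of the Jaccard index over two files' token sets, using naive whitespace tokenization (the analysis in the post may tokenize differently):

```python
def jaccard_similarity(code_a: str, code_b: str) -> float:
    """Jaccard index |A ∩ B| / |A ∪ B| over the two files' token sets."""
    tokens_a, tokens_b = set(code_a.split()), set(code_b.split())
    union = tokens_a | tokens_b
    return len(tokens_a & tokens_b) / len(union) if union else 1.0

# Identical token sets score 1.0; fully disjoint ones score 0.0.
print(jaccard_similarity("def f(x): return x", "def g(y): return y"))
```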
@@ -722,11 +723,9 @@ This is an overall objective: there's no `transformers` without its community.
 
 Having a framework means forcing users into it. It restrains flexibility and creativity, which are the fertile soil for new ideas to grow.
 
-Among the most valuable contributions to `transformers` is of course the addition of new models.
-…
-These additions are immediately available for other models to use.
+Among the most valuable contributions to `transformers` is, of course, the addition of new models. Recently, [OpenAI added GPT-OSS](https://huggingface.co/blog/welcome-openai-gpt-oss), which prompted the addition of many new features to the library in order to support [their model](https://huggingface.co/openai/gpt-oss-120b).
 
-…
+A second one is the ability to fine-tune these models and pipeline them into many other software stacks. Check on the Hub how many fine-tunes are registered for [gpt-oss 120b](https://huggingface.co/models?other=base_model:finetune:openai/gpt-oss-120b), despite its size!
 
 
 <Note variant="info">