Molbap HF Staff committed on
Commit 530e759 · 1 Parent(s): 165284f
app/dist/index.html CHANGED
The diff for this file is too large to render. See raw diff
 
app/dist/index.html.gz CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
- oid sha256:f073b768a6cf51ffeccf975a1448aa1ecf36a3fd42daf0626356f5bdb01597f4
- size 64292
+ oid sha256:e6ba9616dba9dfa5cc9ef3815f320ccb912d7e70851a5a4e7888917e1eebce23
+ size 64010
app/src/content/article.mdx CHANGED
@@ -47,13 +47,15 @@ import modelDebugger from "./assets/image/model_debugger.png";

## Preface

- One million lines of `python` code. Through them, the `transformers` library supports more than 400 model architectures, from state-of-the-art LLMs and VLMs to specialized models for audio, video, and tables.
+ One million lines of `Python` code. Through them, the [`transformers`](https://github.com/huggingface/transformers) library supports more than 400 model architectures, from state-of-the-art LLMs and VLMs to specialized models for audio, video, and tables.

- Built on `PyTorch`, transformers is a foundational tool for modern LLM usage, research, education, and tens of thousands of other open-source projects. Each AI model is added by the community, harmonized into a consistent interface, and tested daily on a CI to ensure reproducibility.
+ Built on `PyTorch`, it's a foundational tool for modern LLM usage, research, education, and tens of thousands of other open-source projects. Each AI model is added by the community, harmonized into a consistent interface, and tested daily on a CI to ensure reproducibility.

This scale presents a monumental engineering challenge.

- How do you keep such a ship afloat, made of so many moving, unrelated parts, contributed to by a buzzing hivemind? Especially as the pace of ML research accelerates? We receive constant feedback on everything from function signatures with hundreds of arguments to duplicated code and optimization concerns, and we listen to all of it, or try to. The library's usage keeps on growing, and we are a small team of maintainers and contributors, backed by hundreds of open-source community members.
+ How do you keep such a ship afloat, made of so many moving, unrelated parts, contributed to by a buzzing hivemind? Especially as the pace of ML research accelerates?
+
+ We receive constant feedback on everything from function signatures with hundreds of arguments to duplicated code and optimization concerns, and we listen to all of it, or try to. The library's usage keeps on growing, and we are a small team of maintainers and contributors, backed by hundreds of open-source community members.
We continue to support all new models and expect to do so for the foreseeable future.

This post dissects the design philosophy that makes this possible. It's the result of an evolution from our older principles, detailed on our previous [philosophy](https://huggingface.co/docs/transformers/en/philosophy) page, as well as its accompanying [blog post from 2022](https://huggingface.co/blog/transformers-design-philosophy). More recently (and we strongly recommend the read) we published a blog post about [recent upgrades to transformers](https://huggingface.co/blog/faster-transformers), focusing on what makes the library faster today. All of these developments are only made possible thanks to these principles.
@@ -88,26 +90,27 @@ These principles were not decided in a vacuum. The library _evolved_ towards the
<li class="tenet">
<a id="source-of-truth"></a>
<strong>Source of Truth</strong>
- <p>We aim to be a [source of truth for all model definitions](https://huggingface.co/blog/transformers-model-definition). This is more of a goal than a tenet, but it strongly guides our decisions. Model implementations should be reliable, reproducible, and faithful to the original implementations. If we are successful, they should become reference baselines for the ecosystem, so they'll be easily adopted by downstream libraries and projects. It's much easier for a project to always refer to the transformers implementation, than to learn a different research codebase every time a new architecture is released.</p>
- <em>This overarching guideline ensures quality and reproducibility across all models in the library, and aspires to make the community work easier.</em>
+ <p>We aim to be the [source of truth for all model definitions](https://huggingface.co/blog/transformers-model-definition). This is not a tenet, but something that guides our decisions. Model implementations should be reliable, reproducible, and faithful to the original performance.</p>
+ <em>This overarching guideline ensures quality and reproducibility across all models in the library.</em>
</li>

<li class="tenet">
<a id="one-model-one-file"></a>
<strong>One Model, One File</strong>
<p>All inference and training core logic has to be visible, top‑to‑bottom, to maximize each model's hackability.</p>
- <em>Every model should be completely understandable and hackable by reading a single file from top to bottom.</em>
+ <em>Every model should be understandable and hackable by reading a single file from top to bottom.</em>
</li>
+
<li class="tenet">
<a id="code-is-product"></a>
- <strong>Code is Product</strong>
- <p>Optimize for reading, diffing, and tweaking, our users are power users. Variables should be explicit, full words, even several words, readability is primordial.</p>
+ <strong>Code is the Product</strong>
+ <p>Optimize for reading, diffing, and tweaking: our users are power users. Variables can be explicit, full words, even several words; readability is primordial.</p>
<em>Code quality matters as much as functionality - optimize for human readers, not just computers.</em>
</li>
<li class="tenet">
<a id="standardize-dont-abstract"></a>
<strong>Standardize, Don't Abstract</strong>
- <p>If it's model behavior, keep it in the file; use abstractions only for generic infra.</p>
+ <p>If it's model behavior, keep it in the file; abstractions are only for generic infra.</p>
<em>Model-specific logic belongs in the model file, not hidden behind abstractions.</em>
</li>
<li class="tenet">
@@ -121,14 +124,14 @@ These principles were not decided in a vacuum. The library _evolved_ towards the
<li class="tenet">
<a id="minimal-user-api"></a>
<strong>Minimal User API</strong>
- <p>Config, model, preprocessing; from_pretrained, save_pretrained, push_to_hub. We want the least amount of codepaths. Reading should be obvious, configurations should be obvious.</p>
+ <p>Config, model, pre-processing; from_pretrained, save_pretrained, push_to_hub. We want the fewest codepaths possible. Reading should be obvious, configurations should be obvious.</p>
<em>Keep the public interface simple and predictable, users should know what to expect.</em>
</li>
<li class="tenet">
<a id="backwards-compatibility"></a>
<strong>Backwards Compatibility</strong>
<p>Evolve by additive standardization, never break public APIs.</p>
- <p>Any artifact that was once on the hub and loadable with transformers should be usable indefinitely with the same interface. Further, public methods should not change to avoid breaking dependencies. If we do deprecate something, it's with very long cycles beforehand.</p>
+ <p>Any artifact that was once on the hub and worked with transformers should be usable indefinitely with the same interface. Further, public methods should not change, to avoid breaking dependencies.</p>
<em>Once something is public, it stays public, evolution through addition, not breaking changes.</em>
</li>
<li class="tenet">
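To make the Minimal User API tenet above concrete, here is a minimal sketch of the surface it describes: the same `from_pretrained`, `save_pretrained`, and `push_to_hub` entry points for config, preprocessing, and model. The `gpt2` checkpoint and local paths are only examples.

```python
from transformers import AutoConfig, AutoModelForCausalLM, AutoTokenizer

checkpoint = "gpt2"  # any Hub checkpoint follows the same flow

# The whole public surface in a few lines: config, preprocessing, model.
config = AutoConfig.from_pretrained(checkpoint)
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint)

inputs = tokenizer("One model, one file.", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=10)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

# Saving and sharing mirror loading.
model.save_pretrained("./my-gpt2")
tokenizer.save_pretrained("./my-gpt2")
# model.push_to_hub("my-username/my-gpt2")  # requires a Hub login; shown for completeness
```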
@@ -158,15 +161,15 @@ def rotate_half(x):
```


- We want all models to have self-contained modeling code. Every core functionality _must_ be in the modeling code, every non-core functionality _can_ be outside of it.
-
- This comes at a great cost. For years, we have used what we call the `#Copied from...` mechanism: we added comments of a specific format documenting that some code was copied from another model, saving time both for the reviewers and for the CI: we had tooling to ensure that the copied blocks remained in sync.
-
- But the LOC count kept creeping up. Each new model copied over hundreds of lines that we considered largely boilerplate, yet, we could not remove them.
-
- We needed to separate two principles that were so far intertwined, <Tenet term="do-repeat-yourself" display="repetition" position="top" /> and <Tenet term="one-model-one-file" display="hackability" position="top" />.
-
- What is the solution to this? Let's talk about modular transformers.
+ We want all models to have self-contained modeling code.
+
+ Each core functionality _must_ be in the modeling code; every non-core functionality _can_ be outside of it.
+
+ This comes at a great cost. Enter the `#Copied from...` mechanism: for a long time, these comments indicated that some code was copied from another model, saving time both for the reviewers and for the CI. But the LOC count kept creeping up. Each new model copied over hundreds of lines that we considered largely boilerplate, yet we could not remove them.
+
+ We need to separate two principles that were so far intertwined, <Tenet term="do-repeat-yourself" display="repetition" position="top" /> and <Tenet term="one-model-one-file" display="hackability" position="top" />.
+
+ What's the solution to this?

<Note variant="info">
<strong>TL;DR:</strong> Read the code in one place, <Tenet term="one-model-one-file" display="one model, one file." position="top" />. Keep semantics local (<a href="#standardize-dont-abstract">Standardize, Don't Abstract</a>). Allow strategic duplication for end users (<a href="#do-repeat-yourself">DRY*</a>). Keep the public surface minimal and stable (<a href="#minimal-user-api">Minimal API</a>, <a href="#backwards-compatibility">Backwards Compatibility</a>, <a href="#consistent-public-surface">Consistent Surface</a>).
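Both mechanisms discussed above can be illustrated with a hedged sketch (the `MyModel` name is a placeholder, not a real architecture): the `#Copied from...` marker is a structured comment that tooling keeps in sync with its source, while a `modular_*.py` file declares only its differences from a reference model and is expanded into a full, self-contained `modeling_*.py`.

```python
import torch
import torch.nn as nn

from transformers.models.llama.modeling_llama import LlamaMLP

# The pre-modular mechanism: a structured comment, checked by CI, stating that the
# body below must stay identical to its source (modulo the name swap).
# Copied from transformers.models.llama.modeling_llama.LlamaRMSNorm with Llama->MyModel
class MyModelRMSNorm(nn.Module):
    def __init__(self, hidden_size, eps=1e-6):
        super().__init__()
        self.weight = nn.Parameter(torch.ones(hidden_size))
        self.variance_epsilon = eps

    def forward(self, hidden_states):
        input_dtype = hidden_states.dtype
        hidden_states = hidden_states.to(torch.float32)
        variance = hidden_states.pow(2).mean(-1, keepdim=True)
        hidden_states = hidden_states * torch.rsqrt(variance + self.variance_epsilon)
        return self.weight * hidden_states.to(input_dtype)


# The modular mechanism: a modular_mymodel.py only states what differs from the
# reference model; the generated modeling_mymodel.py contains the expanded code.
class MyModelMLP(LlamaMLP):
    pass  # inherited unchanged, re-expanded in full in the generated modeling file
```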
@@ -486,7 +489,7 @@ Parallelization is specified in the configuration (<code>tp_plan</code>), not the

### <a id="layers-attentions-caches"></a> Layers, attentions and caches

- Following the same logic, the _nature_ of attention and per-layer caching should not be hardcoded. We should be able to specify in the configuration how each layer is implemented. Thus, we define a mapping like:
+ Following the same logic, the _nature_ of attention and caching per layer of a model should not be hardcoded. We should be able to specify, in a configuration-based fashion, how each layer is implemented. Thus we define a mapping that can then be referenced from the configuration:


```python
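# A hedged sketch of the idea with illustrative names, not the library's actual
# mapping: the configuration declares *what* each layer does, and the model
# resolves each label to a concrete implementation.
config_layer_types = ["sliding_attention", "full_attention",
                      "sliding_attention", "full_attention"]

def full_attention(q, k, v):
    ...  # global attention over the whole sequence

def sliding_window_attention(q, k, v, window=4096):
    ...  # attention restricted to a local window

ATTENTION_IMPLEMENTATIONS = {  # illustrative name for the label -> implementation map
    "full_attention": full_attention,
    "sliding_attention": sliding_window_attention,
}

per_layer_attention = [ATTENTION_IMPLEMENTATIONS[t] for t in config_layer_types]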
@@ -594,11 +597,9 @@ Llama-lineage is a hub; several VLMs remain islands — engineering opportunity

### Many models, but not enough yet, are alike

- I look into Jaccard similarity, which we use to measure set differences, to find similarities across models. I know that code is more than a set of characters stringed together. We also try code-embedding models that rank candidates better in practice, but for this post we stick to the deterministic Jaccard index.
-
- It is interesting, for our comparison, to look at _when_ we deployed the modular logic and what was its rippling effect on the library. Looking at the timeline makes it obvious: adding modular allowed to connect more and more models to solid reference points.
-
- Yet, we still have a lot of gaps to fill.
+ Next, I looked into Jaccard similarity, which we use to measure set differences. I know that code is more than a set of characters strung together. I also used code-embedding models to compare implementations, and they yielded better results, but for the needs of this blog post I will stick to the Jaccard index.
+
+ It is interesting, for that comparison, to look at _when_ we deployed this modular logic and what its ripple effect on the library was. You can check the [larger space](https://huggingface.co/spaces/Molbap/transformers-modular-refactor) to play around, but the gist is: adding modular allowed us to connect more and more models to solid reference points. We still have a lot of gaps to fill.

Zoom out below - it's full of models. You can click on a node to see its connections better, or use the text box to search for a model. You can use the [full viewer](https://huggingface.co/spaces/Molbap/transformers-modular-refactor) (tab "timeline", hit "build timeline") for better exploration.
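For concreteness, the Jaccard index mentioned above is simply the size of the intersection over the size of the union of two sets; a minimal sketch over identifiers extracted from two modeling files (the file names are only an example):

```python
import re

def jaccard(source_a: str, source_b: str) -> float:
    """Jaccard index between two source files, viewed as sets of identifiers."""
    tokens_a = set(re.findall(r"[A-Za-z_]\w*", source_a))
    tokens_b = set(re.findall(r"[A-Za-z_]\w*", source_b))
    if not tokens_a and not tokens_b:
        return 1.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)

# Example usage with two modeling files (paths are illustrative):
# with open("modeling_llama.py") as a, open("modeling_mistral.py") as b:
#     print(f"Jaccard similarity: {jaccard(a.read(), b.read()):.2f}")
```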
 
@@ -722,11 +723,9 @@ This is an overall objective: there's no `transformers` without its community.

Having a framework means forcing users into it. It restrains flexibility and creativity, which are the fertile soil for new ideas to grow.

- Among the most valuable contributions to `transformers` is of course the addition of new models. Very recently, [OpenAI added GPT-OSS](https://huggingface.co/blog/welcome-openai-gpt-oss), which prompted the addition of many new features to the library in order to support [their model](https://huggingface.co/openai/gpt-oss-120b).
-
- These additions are immediately available for other models to use.
-
- Another important advantage is the ability to fine-tune and pipeline these models into many other libraries and tools. Check here on the hub how many finetunes are registered for [gpt-oss 120b](https://huggingface.co/models?other=base_model:finetune:openai/gpt-oss-120b), despite its size!
+ Among the most valuable contributions to `transformers` is of course the addition of new models. Recently, [OpenAI added GPT-OSS](https://huggingface.co/blog/welcome-openai-gpt-oss), which prompted the addition of many new features to the library in order to support [their model](https://huggingface.co/openai/gpt-oss-120b).
+
+ A second one is the ability to fine-tune and pipeline these models with many other libraries and tools. Check here on the hub how many finetunes are registered for [gpt-oss 120b](https://huggingface.co/models?other=base_model:finetune:openai/gpt-oss-120b), despite its size!


<Note variant="info">
 