updating titles and intro

Files changed:
- content/article.md  (+11 −3)
- webpack.config.js   (+6 −6)

content/article.md
@@ -1,11 +1,19 @@
 
 ## Introduction
 
-The `transformers` library, built with `PyTorch`, supports all state-of-the-art LLMs, many VLMs, task-specific vision language models, video models, audio models, table models, classical encoders, to a global count of almost 400 models.
+The `transformers` library, built with `PyTorch`, supports all state-of-the-art LLMs, many VLMs, task-specific vision-language models, video models, audio models, table models, and classical encoders, for a global count of almost 400 models.
 
-The
+The name of the library itself is mostly majority-driven, as many models are not even transformer architectures: Mamba, Zamba, RWKV, and convolution-based models, for instance.
 
-
+Regardless, each of these models is wrought by the research and engineering team that created it, then harmonized into a now-famous interface and made callable with a simple `.from_pretrained` call.
+
+Inference works for all models, and training is functional for most. The library is a foundation for many machine learning courses and cookbooks, and several thousand other open-source libraries depend on it. All models are tested as part of a daily CI run, ensuring their preservation and reproducibility. Most importantly, it is _open-source_ and has, for a large part, been written by the community.
+
+This isn't really to brag but to set the stakes: what does it take to keep such a ship afloat, made of so many moving, unrelated parts?
+
+The ML wave has not stopped: more and more models are being added, at a steadily growing rate. `Transformers` is widely used, and we read the feedback that users post online, whether it's about a function that had 300+ keyword arguments, duplicated code and helpers, mentions of `Copied from ...` everywhere, or optimisation concerns. Text-only models are relatively tame by now, but multimodal models remain to be harmonized.
+
+Here we will dissect the new design philosophy of transformers, as a continuation of the existing [philosophy](https://huggingface.co/docs/transformers/en/philosophy) page and the accompanying [blog post from 2022](https://huggingface.co/blog/transformers-design-philosophy). Some time ago (I dare not say how long), we discussed the state of things with the transformers maintainers. A lot of recent developments were satisfactory, but if we only talked about those, self-congratulation would be the only goalpost. Reflecting on this philosophy now, as models pile up, is essential and will drive new developments.
 
 ### What you will learn
 
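The intro above leans on `.from_pretrained` as the single harmonized entry point across all those architectures. As a reminder of what that looks like in practice, here is a minimal sketch; the checkpoint name is illustrative and not taken from the diff:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Any of the ~400 supported architectures loads through the same interface;
# "gpt2" is just an illustrative checkpoint name.
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tokenizer("Hello", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=10)
print(tokenizer.decode(outputs[0]))
```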
webpack.config.js
@@ -139,7 +139,7 @@ module.exports = {
 
   // Extract tenet text for tooltips
   const tenetTooltips = {
-    'source-of-truth': 'We
+    'source-of-truth': 'We aim to be a source of truth for all model definitions. Model implementations should be reliable, reproducible, and faithful to the original performances.',
     'one-model-one-file': 'All inference (and most of training; loss is separate, not part of the model) logic visible, top‑to‑bottom.',
     'code-is-product': 'Optimize for reading, diffing, and tweaking; our users are power users. Variables can be explicit, full words, even several words; readability is primordial.',
     'standardize-dont-abstract': 'If it\'s model behavior, keep it in the file; abstractions only for generic infra.',

@@ -225,22 +225,22 @@ module.exports = {
   <script src="https://d3js.org/d3.v7.min.js"></script>
   <meta name="viewport" content="width=device-width, initial-scale=1">
   <meta charset="utf8">
-  <title>
+  <title>Scaling insanity: maintaining hundreds of model definitions</title>
   <link rel="stylesheet" href="style.css">
   <link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/prism/1.29.0/themes/prism.min.css">
 </head>
 <body>
 <d-front-matter>
   <script id='distill-front-matter' type="text/json">{
-    "title": "
-    "description": "
+    "title": "Scaling insanity: maintaining hundreds of model definitions",
+    "description": "A peek into software engineering for the transformers library",
     "published": "Aug 21, 2025",
     "authors": [{"author": "Pablo Montalvo", "authorURL": "https://huggingface.co/Molbap"}]
   }</script>
 </d-front-matter>
 <d-title>
-    <h1>
-    <p>
+    <h1>Scaling insanity: maintaining hundreds of model definitions</h1>
+    <p>A peek into software engineering for the transformers library</p>
 </d-title>
 <d-byline></d-byline>
 <d-article>