<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1">
<title>Top Open-Source Small Language Models</title>
<link rel="stylesheet" href="styles.css"/>
</head>
<body>
<h1>Top Open-Source Small Language Models for Generative AI Applications</h1>
<p>
Small Language Models (SLMs) are language models that contain, at most, a few billion parameters, significantly fewer
than Large Language Models (LLMs), which can have tens or hundreds of billions, or even trillions, of parameters. SLMs
are well suited for resource-constrained environments, as well as on-device and real-time generative AI
applications. Many of them can run locally on a laptop using tools like LM Studio or Ollama. These models are
typically derived from larger models using techniques such as quantization and distillation. The following sections
introduce some well-established SLMs.
</p>
<p>
Note: All the models mentioned here are open source. However, for details regarding experimental use, commercial
use, redistribution, and other terms, please refer to the license documentation.
</p>
<h2>Phi 4 Collection by Microsoft</h2>
<p>
This collection features a range of small language models, including reasoning models, ONNX- and GGUF-compatible
formats, and multimodal models. The base model in the collection has 14 billion parameters, while the smallest
models have 3.84 billion. Strategic use of synthetic data during training has led to improved performance compared
to the teacher model (primarily GPT-4). Currently, the collection includes three versions of reasoning-focused SLMs,
making it one of the strongest options for reasoning tasks.
</p>
<p>
License: <a href="https://choosealicense.com/licenses/mit/" target="_blank">MIT</a><br>
<a href="https://huggingface.co/collections/microsoft/phi-4-677e9380e514feb5577a40e4" target="_blank">Collection on Hugging Face</a><br>
<a href="https://arxiv.org/abs/2412.08905" target="_blank">Technical Report</a>
</p>
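<p>
As an illustrative sketch (not an official quickstart), the snippet below loads one of the smaller
instruction-tuned checkpoints from the collection with the Hugging Face <code>transformers</code> library. The
model id <code>microsoft/Phi-4-mini-instruct</code> is an assumption here; check the collection page for the exact
checkpoints available.
</p>
<pre><code># Illustrative sketch: run a small instruction-tuned Phi-4 model locally
# with Hugging Face transformers. The model id below is an assumption;
# verify the exact checkpoint name on the collection page.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="microsoft/Phi-4-mini-instruct",  # assumed ~3.8B-parameter variant
    device_map="auto",
)

prompt = "Give three use cases for on-device language models."
print(generator(prompt, max_new_tokens=200)[0]["generated_text"])
</code></pre>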
<h2>Gemma 3 Collection by Google</h2>
<p>
This collection features multiple versions, including Image-to-Text, Text-to-Text, and Image-and-Text-to-Text
models, available in both quantized and GGUF formats. The models vary in size, with 1, 4.3, 12.2, and 27.4 billion
parameters. Two specialized variants have been developed for specific applications: TxGemma, optimized for
therapeutic development, and ShieldGemma, designed for moderating text and image content.
</p>
<p>
License: <a href="https://ai.google.dev/gemma/terms" target="_blank">Gemma</a><br>
<a href="https://huggingface.co/collections/google/gemma-3-release-67c6c6f89c4f76621268bb6d" target="_blank">Collection on Hugging Face</a><br>
<a href="https://storage.googleapis.com/deepmind-media/gemma/Gemma3Report.pdf" target="_blank">Technical Report</a><br>
<a href="https://huggingface.co/collections/google/shieldgemma-67d130ef8da6af884072a789" target="_blank">ShieldGemma on Hugging Face</a><br>
<a href="https://huggingface.co/collections/google/txgemma-release-67dd92e931c857d15e4d1e87" target="_blank">TxGemma on Hugging Face</a>
</p>
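<p>
As a rough sketch of the Image-and-Text-to-Text usage, the snippet below asks an instruction-tuned Gemma 3 model a
question about an image via the <code>transformers</code> image-text-to-text pipeline. The model id
<code>google/gemma-3-4b-it</code> and the image URL are assumptions; the checkpoints are gated, so the Gemma terms
must be accepted on Hugging Face and you must be logged in before downloading.
</p>
<pre><code># Rough sketch: image + text question answering with a Gemma 3 model.
# "google/gemma-3-4b-it" is assumed here; access requires accepting the
# Gemma license on Hugging Face and authenticating (huggingface-cli login).
from transformers import pipeline

pipe = pipeline("image-text-to-text", model="google/gemma-3-4b-it", device_map="auto")

messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://example.com/cat.jpg"},  # placeholder URL
            {"type": "text", "text": "Describe this image in one sentence."},
        ],
    }
]

out = pipe(text=messages, max_new_tokens=60)
# With chat-style input, the reply is the last message in "generated_text".
print(out[0]["generated_text"][-1]["content"])
</code></pre>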
<h2>Mistral Models</h2>
<p>
Mistral AI is a France-based AI startup and one of the pioneers in releasing open-source language models. Its
current lineup includes three compact models: Mistral Small 3.1, Pixtral 12B, and Mistral NeMo. All of them
are released under the <a href="https://www.apache.org/licenses/LICENSE-2.0" target="_blank">Apache 2.0 license</a>.
</p>
<p>
<b>Mistral Small 3.1</b> is a multimodal and multilingual SLM with 24 billion parameters and a 128K context window.
Currently there are two versions: Base and Instruct.<br>
<a href="https://huggingface.co/mistralai/Mistral-Small-3.1-24B-Base-2503" target="_blank">Base Version on Hugging Face</a><br>
<a href="https://huggingface.co/mistralai/Mistral-Small-3.1-24B-Instruct-2503" target="_blank">Instruct Version on Hugging Face</a><br>
<a href="https://mistral.ai/news/mistral-small-3-1" target="_blank">Technical Report</a>
</p>
<p>
<b>Pixtral 12B</b> is a natively multimodal model trained on interleaved image and text data, delivering strong
performance on multimodal tasks and instruction following while maintaining state-of-the-art results on text-only
benchmarks. It features a newly developed 400M-parameter vision encoder and a 12B-parameter multimodal decoder based
on Mistral NeMo. The model supports variable image sizes, aspect ratios, and multiple images within a long context
window of up to 128K tokens.<br>
<a href="https://huggingface.co/mistralai/Pixtral-12B-Base-2409" target="_blank">Pixtral-12B-Base-2409 on Hugging Face</a><br>
<a href="https://huggingface.co/mistralai/Pixtral-12B-2409" target="_blank">Pixtral-12B-2409 on Hugging Face</a><br>
<a href="https://mistral.ai/news/pixtral-12b" target="_blank">Technical Report</a>
</p>
<p>
<b>Mistral NeMo</b> is a 12B model developed in collaboration with NVIDIA, featuring a large 128K-token context
window and state-of-the-art reasoning, knowledge, and coding accuracy for its size.<br>
<a href="https://huggingface.co/mistralai/Mistral-Nemo-Instruct-FP8-2407" target="_blank">Model on Hugging Face</a><br>
<a href="https://mistral.ai/news/mistral-nemo" target="_blank">Technical Report</a>
</p>
<h2>Llama Models by Meta</h2>
<p>
Meta is one of the leading contributors to open-source AI. In recent years, it has released several versions of its
Llama models. The latest series is Llama 4, although all models in this collection are currently quite large.
Smaller models may be introduced in future sub-versions of Llama 4, but for now that has not happened. The most
recent collection that includes smaller models is Llama 3.2. It features models with 1.24 billion and 3.21 billion
parameters and 128K context windows. Additionally, there is a 10.6 billion-parameter multimodal version designed for
Image-and-Text-to-Text tasks.
This collection also includes small variants of Llama Guard: fine-tuned language models designed for prompt and
response classification. They can detect unsafe prompts and responses, making them useful for implementing safety
measures in LLM-based applications (a minimal usage sketch follows the links below).
</p>
<p>
License: <a href="https://www.llama.com/llama3_2/license/" target="_blank">LLAMA 3.2 COMMUNITY LICENSE AGREEMENT</a><br>
<a href="https://huggingface.co/collections/meta-llama/llama-32-66f448ffc8c32f949b04c8cf" target="_blank">Collection on Hugging Face</a><br>
<a href="https://ai.meta.com/blog/llama-3-2-connect-2024-vision-edge-mobile-devices/" target="_blank">Technical Paper</a>
</p>
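<p>
As a hedged sketch of prompt classification with a small Llama Guard model, the snippet below feeds a single user
message to the model and prints its safety verdict. The model id <code>meta-llama/Llama-Guard-3-1B</code> is an
assumption; the checkpoint is gated behind Meta's license on Hugging Face, and the exact output format ("safe" or
"unsafe" plus hazard category codes) should be verified against the model card.
</p>
<pre><code># Hedged sketch: classify a user prompt with a small Llama Guard model.
# The model id is an assumption; the checkpoint is gated and requires
# accepting Meta's license on Hugging Face before downloading.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-Guard-3-1B"  # assumed small Llama Guard variant
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Llama Guard models consume a chat transcript and emit a safety verdict.
conversation = [
    {
        "role": "user",
        "content": [
            {"type": "text", "text": "How do I make my application reject toxic prompts?"}
        ],
    }
]
inputs = tokenizer.apply_chat_template(conversation, return_tensors="pt").to(model.device)

output = model.generate(inputs, max_new_tokens=30)
# Decode only the newly generated tokens (the verdict), not the prompt.
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
</code></pre>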
<h2>Qwen 3 Collection by Alibaba</h2>
<p>
The Chinese tech giant Alibaba is another major player in open-source AI. It releases its language models under the
Qwen name. The latest version is Qwen 3, which includes both small and large models. The smaller models range in
size, with parameter counts of 14.8 billion, 8.19 billion, 4.02 billion, 2.03 billion, and even 752 million. This
collection also includes quantized and GGUF formats.
</p>
<p>
License: <a href="https://www.apache.org/licenses/LICENSE-2.0" target="_blank">Apache 2.0</a><br>
<a href="https://huggingface.co/collections/Qwen/qwen3-67dd247413f0e2e4f653967f" target="_blank">Collection on Hugging Face</a><br>
<a href="https://github.com/QwenLM/Qwen3/blob/main/Qwen3_Technical_Report.pdf" target="_blank">Technical Report</a>
</p>
<hr style="border: none; height: 1px; background-color: #ccc;">
<p>Open-source SLMs are not limited to the five collections above. You can explore more open-source models at:</p>
<ul>
<li><a href="https://huggingface.co/databricks" target="_blank">Databricks</a></li>
<li><a href="https://huggingface.co/Cohere" target="_blank">Cohere</a></li>
<li><a href="https://huggingface.co/deepseek-ai" target="_blank">DeepSeek</a></li>
<li><a href="https://huggingface.co/collections/HuggingFaceTB/smollm-6695016cad7167254ce15966" target="_blank">SmolLM</a></li>
<li><a href="https://huggingface.co/stabilityai" target="_blank">Stability AI</a></li>
<li><a href="https://huggingface.co/ibm-granite" target="_blank">IBM Granite</a></li>
</ul>
</body>
</html>