EmbeddingGemma: Powerful and Lightweight Text Representations Paper • 2509.20354 • Published Sep 24 • 39
Gemma 2: Improving Open Language Models at a Practical Size Paper • 2408.00118 • Published Jul 31, 2024 • 79
Post 6234 We're kick-starting the process of Transformers v5, with @ArthurZ and @cyrilvallez! v5 should be significant: we're using it as a milestone for performance optimizations, saner defaults, and a much cleaner code base worthy of 2025. Fun fact: v4.0.0-rc-1 came out on Nov 19, 2020, nearly five years ago!
Post 904 New post is live! This time we cover some major updates to transformers.
SEA-HELM: Southeast Asian Holistic Evaluation of Language Models Paper • 2502.14301 • Published Feb 20 • 2
Crowdsource, Crawl, or Generate? Creating SEA-VL, a Multicultural Vision-Language Dataset for Southeast Asia Paper • 2503.07920 • Published Mar 10 • 101
Language Surgery in Multilingual Large Language Models Paper • 2506.12450 • Published Jun 14 • 16
Mangosteen: An Open Thai Corpus for Language Model Pretraining Paper • 2507.14664 • Published Jul 19 • 7
WangchanThaiInstruct: An instruction-following Dataset for Culture-Aware, Multitask, and Multi-domain Evaluation in Thai Paper • 2508.15239 • Published Aug 21
Post 5810 Run DeepSeek-V3.1 locally on 170GB RAM with Dynamic 1-bit GGUFs! GGUFs: unsloth/DeepSeek-V3.1-GGUF. The 715GB model gets reduced to 170GB (-80% size) by smartly quantizing layers. The 1-bit GGUF passes all our code tests, and we fixed the chat template for llama.cpp-supported backends. Guide: https://docs.unsloth.ai/basics/deepseek-v3.1
Post 5101 Run OpenAI's new gpt-oss models locally with Unsloth GGUFs! 20b GGUF: unsloth/gpt-oss-20b-GGUF. 120b GGUF: unsloth/gpt-oss-120b-GGUF. The models will run on 14GB RAM for 20b and 66GB for 120b.
Post 3316 It's Qwen3 week! We uploaded Dynamic 2-bit GGUFs for: Qwen3-Coder: unsloth/Qwen3-Coder-480B-A35B-Instruct-GGUF and Qwen3-2507: unsloth/Qwen3-235B-A22B-Instruct-2507-GGUF. So you can run them both locally! Guides are in the model cards.
Post 866 I have always advocated for writing technical stories without using LLMs. The following one-page editorial really drives the point home. https://www.nature.com/articles/s44222-025-00323-4
Gemini 2.5: Pushing the Frontier with Advanced Reasoning, Multimodality, Long Context, and Next Generation Agentic Capabilities Paper • 2507.06261 • Published Jul 7 • 63