gg-hf-g

community

Activity Feed

AI & ML interests

None defined yet.

Recent Activity

ariG23498 authored a paper 7 days ago

FineVision: Open Data Is All You Need

gusthema authored a paper about 1 month ago

EmbeddingGemma: Powerful and Lightweight Text Representations

gusthema authored a paper about 1 month ago

Gemma 2: Improving Open Language Models at a Practical Size

ariG23498

authored a paper 7 days ago

FineVision: Open Data Is All You Need

Paper • 2510.17269 • Published 8 days ago • 57

michellecasbon

authored a paper 25 days ago

EmbeddingGemma: Powerful and Lightweight Text Representations

Paper • 2509.20354 • Published Sep 24 • 39

gusthema

authored 3 papers about 1 month ago

bebechien

authored a paper about 1 month ago

EmbeddingGemma: Powerful and Lightweight Text Representations

Paper • 2509.20354 • Published Sep 24 • 39

osanseviero

authored a paper about 1 month ago

EmbeddingGemma: Powerful and Lightweight Text Representations

Paper • 2509.20354 • Published Sep 24 • 39

lysandre

posted an update about 1 month ago

Post

6234

We're kick-starting the process of Transformers v5, with @ArthurZ and @cyrilvallez!

v5 should be significant: we're using it as a milestone for performance optimizations, saner defaults, and a much cleaner code base worthy of 2025.

Fun fact: v4.0.0-rc-1 came out on Nov 19, 2020, nearly five years ago!

6 replies

ariG23498

posted an update about 2 months ago

Post

904

New post is live!

This time we cover some major updates to transformers.

🤗

1 reply

mrpeerat

authored 6 papers 2 months ago

SEA-HELM: Southeast Asian Holistic Evaluation of Language Models

Paper • 2502.14301 • Published Feb 20 • 2

Crowdsource, Crawl, or Generate? Creating SEA-VL, a Multicultural Vision-Language Dataset for Southeast Asia

Paper • 2503.07920 • Published Mar 10 • 101

SEA-LION: Southeast Asian Languages in One Network

Paper • 2504.05747 • Published Apr 8

Language Surgery in Multilingual Large Language Models

Paper • 2506.12450 • Published Jun 14 • 16

Mangosteen: An Open Thai Corpus for Language Model Pretraining

Paper • 2507.14664 • Published Jul 19 • 7

WangchanThaiInstruct: An instruction-following Dataset for Culture-Aware, Multitask, and Multi-domain Evaluation in Thai

Paper • 2508.15239 • Published Aug 21

danielhanchen

posted an update 2 months ago

Post

5810

Run DeepSeek-V3.1 locally on 170GB RAM with Dynamic 1-bit GGUFs!🐋
GGUFs: unsloth/DeepSeek-V3.1-GGUF

The 715GB model gets reduced to 170GB (-80% size) by smartly quantizing layers.

The 1-bit GGUF passes all our code tests, and we fixed the chat template for llama.cpp-supported backends.

Guide: https://docs.unsloth.ai/basics/deepseek-v3.1
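
The size figures quoted in the post above can be checked with a little arithmetic. The sketch below is only an illustration: the helper name `size_reduction_percent` is hypothetical and not part of any Unsloth or llama.cpp tooling, and the two sizes are taken directly from the post. With those numbers the reduction works out to roughly 76%, which the post appears to round to 80%.

```python
def size_reduction_percent(original_gb: float, quantized_gb: float) -> float:
    """Return how much smaller the quantized file is, as a percentage.

    Hypothetical helper for illustration only.
    """
    if original_gb <= 0:
        raise ValueError("original_gb must be positive")
    return (original_gb - quantized_gb) / original_gb * 100.0


# Sizes quoted in the post: 715GB before quantization, 170GB after.
reduction = size_reduction_percent(715, 170)
print(f"{reduction:.1f}% smaller")  # prints "76.2% smaller"
```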

danielhanchen

posted an update 3 months ago

Post

5101

Run OpenAI's new gpt-oss models locally with Unsloth GGUFs! 🔥🦥
20b GGUF: unsloth/gpt-oss-20b-GGUF
120b GGUF: unsloth/gpt-oss-120b-GGUF

The 20b model runs on 14GB of RAM and the 120b model on 66GB.

2 replies

danielhanchen

posted an update 3 months ago

Post

3316

It's Qwen3 week! 💜 We uploaded Dynamic 2-bit GGUFs for:

Qwen3-Coder: unsloth/Qwen3-Coder-480B-A35B-Instruct-GGUF
Qwen3-2507: unsloth/Qwen3-235B-A22B-Instruct-2507-GGUF

So you can run them both locally!
Guides are in the model cards.

1 reply

ariG23498

posted an update 3 months ago

Post

866

I have always advocated for writing technical stories without using LLMs.

The following one-page editorial really drives the point home.
https://www.nature.com/articles/s44222-025-00323-4

eyvinec

authored a paper 3 months ago

Gemini 2.5: Pushing the Frontier with Advanced Reasoning, Multimodality, Long Context, and Next Generation Agentic Capabilities

Paper • 2507.06261 • Published Jul 7 • 63

Team members 138

gg-hf-g's activity