Glad to hear it, thanks for reading! :)
Daniel (Unsloth) PRO
danielhanchen
AI & ML interests
None yet
Recent Activity
updated
a model
23 minutes ago
unsloth/GLM-4.6-GGUF
new activity
about 5 hours ago
unsloth/GLM-4.6V-Flash-GGUF: Vision now supported! 👁️
liked
a model
about 5 hours ago
unsloth/Nemotron-3-Nano-30B-A3B-GGUF
replied to
their
post
about 9 hours ago
posted
an
update
1 day ago
Post
1072
You can now run GLM-4.7, the new 355B-parameter SOTA model, on your local device (128GB RAM). ✨
The model achieves SOTA performance on coding, agentic and chat benchmarks.
GGUF: unsloth/GLM-4.7-GGUF
Guide: https://docs.unsloth.ai/models/glm-4.7
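To grab just one quant instead of the whole repo, here's a minimal sketch with huggingface_hub; the "*UD-TQ1_0*" pattern is an assumption, so browse the repo's file list for whichever quant fits your 128GB RAM budget.
```python
# Minimal sketch: download a single dynamic quant of GLM-4.7 rather than
# every file in the repo. The quant pattern below is an assumption.
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="unsloth/GLM-4.7-GGUF",
    local_dir="GLM-4.7-GGUF",
    allow_patterns=["*UD-TQ1_0*"],  # assumed quant name; adjust to taste
)
```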
posted
an
update
6 days ago
Post
2093
Google releases FunctionGemma, a new 270M-parameter model that runs on just 0.5GB RAM. ✨
Built for tool-calling: run it locally on your phone at 50+ tokens/s, or fine-tune it with Unsloth and deploy to your phone.
GGUF: unsloth/functiongemma-270m-it-GGUF
Docs + Notebook: https://docs.unsloth.ai/models/functiongemma
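A sketch of a tool call through transformers' chat template API. The repo id (the non-GGUF sibling of the upload above) and the toy get_weather tool are assumptions; the linked docs and notebook have the tested flow.
```python
# Hypothetical tool-calling example for FunctionGemma via transformers.
from transformers import AutoModelForCausalLM, AutoTokenizer

def get_weather(city: str) -> str:
    """Get the current weather for a city.

    Args:
        city: Name of the city to look up.
    """
    return "sunny"

model_id = "unsloth/functiongemma-270m-it"  # assumed non-GGUF repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer.apply_chat_template(
    [{"role": "user", "content": "What's the weather in Paris?"}],
    tools=[get_weather],  # schema is read from the signature + docstring
    add_generation_prompt=True,
    return_tensors="pt",
)
print(tokenizer.decode(model.generate(inputs, max_new_tokens=64)[0]))
```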
posted
an
update
9 days ago
Post
5157
NVIDIA releases Nemotron 3 Nano, a new 30B hybrid reasoning model! 🔥
It has a 1M context window & best-in-class performance on SWE-Bench, reasoning & chat. Run the MoE model locally with 24GB RAM.
GGUF: unsloth/Nemotron-3-Nano-30B-A3B-GGUF
Step-by-step Guide: https://docs.unsloth.ai/models/nemotron-3
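A minimal llama-cpp-python sketch for running the GGUF locally; the filename and quant are assumptions, so use whichever file you actually downloaded.
```python
# Minimal local-inference sketch; filename below is an assumed quant name.
from llama_cpp import Llama

llm = Llama(
    model_path="Nemotron-3-Nano-30B-A3B-UD-Q4_K_XL.gguf",  # assumed file
    n_ctx=32768,      # the model supports up to 1M, but KV cache eats RAM
    n_gpu_layers=-1,  # offload all layers that fit; use 0 for CPU-only
)
out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "One-line summary of SWE-Bench?"}],
)
print(out["choices"][0]["message"]["content"])
```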
posted
an
update
13 days ago
Post
1936
Mistral's new SOTA coding models, Devstral 2, can now be run locally! (25GB RAM)
We fixed the chat template, so performance should be much better now!
24B: unsloth/Devstral-Small-2-24B-Instruct-2512-GGUF
123B: unsloth/Devstral-2-123B-Instruct-2512-GGUF
🧡 Step-by-step Guide: https://docs.unsloth.ai/models/devstral-2
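Since the chat template is baked into the GGUF, re-downloading is how you pick up the fix. A sketch with hf_hub_download; the filename is an assumption, so check the repo for the exact quant you want.
```python
# Re-download to get the fixed chat template; filename is an assumption.
from huggingface_hub import hf_hub_download

path = hf_hub_download(
    repo_id="unsloth/Devstral-Small-2-24B-Instruct-2512-GGUF",
    filename="Devstral-Small-2-24B-Instruct-2512-UD-Q4_K_XL.gguf",  # assumed
    force_download=True,  # bypass the cache so you get the fixed upload
)
print(path)
```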
replied to
their
post
19 days ago
You need to update to the latest llama.cpp version.
posted
an
update
21 days ago
Post
3591
Mistral's new Ministral 3 models can now be run & fine-tuned locally! (16GB RAM)
The Ministral 3 models have vision support and best-in-class performance for their size.
14B Instruct GGUF: unsloth/Ministral-3-14B-Instruct-2512-GGUF
14B Reasoning GGUF: unsloth/Ministral-3-14B-Reasoning-2512-GGUF
Step-by-step Guide: https://docs.unsloth.ai/new/ministral-3
All GGUF, BnB, FP8, etc. variant uploads: https://huggingface.co/collections/unsloth/ministral-3
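A minimal Unsloth fine-tuning sketch that fits in ~16GB via 4-bit loading. The repo id and LoRA hyperparameters are illustrative assumptions; the step-by-step guide above has the tested configuration.
```python
# Hypothetical QLoRA-style fine-tuning setup with Unsloth.
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Ministral-3-14B-Instruct-2512",  # assumed repo id
    max_seq_length=4096,
    load_in_4bit=True,  # 4-bit loading keeps VRAM low
)
model = FastLanguageModel.get_peft_model(
    model,
    r=16,  # LoRA rank: higher learns more, costs more memory
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)
# From here, train with TRL's SFTTrainer as in the Unsloth notebooks.
```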
posted
an
update
26 days ago
Post
8419
Qwen3-Next can now be run locally! (30GB RAM)
The models come in Thinking and Instruct versions and use a new architecture that delivers ~10x faster inference than Qwen3-32B.
Instruct GGUF: unsloth/Qwen3-Next-80B-A3B-Instruct-GGUF
Thinking GGUF: unsloth/Qwen3-Next-80B-A3B-Thinking-GGUF
Step-by-step Guide: https://docs.unsloth.ai/models/qwen3-next
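Sketch: stream tokens from the Instruct GGUF so you can see the speedup interactively. The filename and quant are assumptions.
```python
# Streaming chat with llama-cpp-python; filename is an assumed quant.
from llama_cpp import Llama

llm = Llama(model_path="Qwen3-Next-80B-A3B-Instruct-UD-Q2_K_XL.gguf", n_ctx=8192)
for chunk in llm.create_chat_completion(
    messages=[{"role": "user", "content": "Hello!"}],
    stream=True,
):
    delta = chunk["choices"][0]["delta"]
    print(delta.get("content", ""), end="", flush=True)
```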
posted
an
update
about 2 months ago
Post
4335
You can now run Kimi K2 Thinking locally with our Dynamic 1-bit GGUFs:
unsloth/Kimi-K2-Thinking-GGUF
We shrank the 1T model to 245GB (-62%) & retained ~85% of accuracy on Aider Polyglot. Run it with >247GB RAM for fast inference.
We also collaborated with the Moonshot AI Kimi team on a system prompt fix! 🥰
Guide + fix details: https://docs.unsloth.ai/models/kimi-k2-thinking-how-to-run-locally
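The system prompt fix lives in the GGUF's embedded chat template, so you can confirm which version you have by printing it. A sketch with the gguf package; the local filename (llama.cpp's split naming) is an assumption.
```python
# Inspect the embedded chat template; filename is an assumed split part.
from gguf import GGUFReader

reader = GGUFReader("Kimi-K2-Thinking-UD-TQ1_0-00001-of-00006.gguf")
field = reader.fields["tokenizer.chat_template"]
print(field.parts[field.data[0]].tobytes().decode("utf-8")[:300])
```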
posted
an
update
4 months ago
Post
6551
Run DeepSeek-V3.1 locally on 170GB RAM with Dynamic 1-bit GGUFs!
GGUFs: unsloth/DeepSeek-V3.1-GGUF
The 715GB model is reduced to 170GB (-76% size) by smartly quantizing layers.
The 1-bit GGUF passes all our code tests & we fixed the chat template for llama.cpp-supported backends.
Guide: https://docs.unsloth.ai/basics/deepseek-v3.1
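With files this big it's worth checking quant sizes on the Hub before downloading. A sketch that sums the parts of one quant; the "TQ1_0" substring is an assumption about the quant naming.
```python
# Sum the sizes of one quant's shards before committing to a download.
from huggingface_hub import HfApi

api = HfApi()
total = 0
for f in api.list_repo_tree("unsloth/DeepSeek-V3.1-GGUF", recursive=True):
    size = getattr(f, "size", None)  # folders have no size attribute
    if size and "TQ1_0" in f.path:   # assumed 1-bit quant name
        total += size
print(f"{total / 1e9:.0f} GB")
```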
posted
an
update
5 months ago
Post
5619
Run OpenAI's new gpt-oss models locally with Unsloth GGUFs! 🔥🦥
20b GGUF: unsloth/gpt-oss-20b-GGUF
120b GGUF: unsloth/gpt-oss-120b-GGUF
The models run on 14GB RAM for 20b and 66GB RAM for 120b.
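If you serve a gpt-oss GGUF with llama.cpp's llama-server (it speaks the OpenAI API), the stock openai client works against it. A sketch assuming llama-server's default port; the model field is largely cosmetic locally.
```python
# Query a locally served gpt-oss GGUF through the OpenAI-compatible API.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8080/v1", api_key="sk-local")
resp = client.chat.completions.create(
    model="gpt-oss-20b",  # name is whatever you served locally
    messages=[{"role": "user", "content": "Hello from a local gpt-oss!"}],
)
print(resp.choices[0].message.content)
```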
posted
an
update
5 months ago
Post
3725
It's Qwen3 week! We uploaded Dynamic 2-bit GGUFs for:
Qwen3-Coder: unsloth/Qwen3-Coder-480B-A35B-Instruct-GGUF
Qwen3-2507: unsloth/Qwen3-235B-A22B-Instruct-2507-GGUF
So you can run them both locally!
Guides are in the model cards.
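With repos this large it pays to list the files first and pick a quant. A sketch that prints the Dynamic 2-bit shards; the "Q2_K_XL" substring is an assumption about the quant naming, so adjust after browsing the repo.
```python
# List a repo's files to find the dynamic 2-bit quant shards.
from huggingface_hub import list_repo_files

for name in list_repo_files("unsloth/Qwen3-Coder-480B-A35B-Instruct-GGUF"):
    if "Q2_K_XL" in name:  # assumed quant substring
        print(name)
```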
posted
an
update
6 months ago
Post
4009
We fixed more issues! Use --jinja for all!
* Fixed Nanonets OCR-s unsloth/Nanonets-OCR-s-GGUF
* Fixed THUDM GLM-4 unsloth/GLM-4-32B-0414-GGUF
* DeepSeek Chimera v2 is uploading! unsloth/DeepSeek-TNG-R1T2-Chimera-GGUF
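--jinja tells llama.cpp to render the GGUF's embedded Jinja chat template instead of a built-in guess, which is why it picks up these fixes. A sketch that launches llama-server with it; the model path is an assumption.
```python
# Launch llama-server with --jinja so the GGUF's own (fixed) template is used.
import subprocess

subprocess.run([
    "llama-server",
    "--model", "GLM-4-32B-0414-UD-Q4_K_XL.gguf",  # assumed local file
    "--jinja",       # use the chat template embedded in the GGUF
    "--port", "8080",
])
```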
replied to
their
post
6 months ago
Thank you!
posted
an
update
6 months ago
Post
3208
Gemma 3n finetuning is now 1.5x faster and uses 50% less VRAM in Unsloth!
Click "Use this model" and click "Google Colab"!
unsloth/gemma-3n-E4B-it
unsloth/gemma-3n-E2B-it
https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Gemma3N_(4B)-Conversational.ipynb
Click "Use this model" and click "Google Colab"!
unsloth/gemma-3n-E4B-it
unsloth/gemma-3n-E2B-it
https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Gemma3N_(4B)-Conversational.ipynb
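For running the same flow outside Colab, here's a sketch: load in 4-bit, fine-tune (the notebook's SFTTrainer section), then export a GGUF for local inference. The output directory and details are assumptions; the linked Colab has the tested steps.
```python
# Hypothetical local version of the Gemma 3n notebook flow.
from unsloth import FastModel

model, tokenizer = FastModel.from_pretrained(
    "unsloth/gemma-3n-E4B-it",
    load_in_4bit=True,  # pairs with the 1.5x speed / 50% VRAM savings
)
# ... add LoRA adapters and run SFTTrainer as in the linked notebook ...
model.save_pretrained_gguf("gemma-3n-finetune", tokenizer)
```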
posted
an
update
6 months ago
Post
1335
We updated lots of our GGUFs and uploaded many new ones!
* unsloth/dots.llm1.inst-GGUF
* unsloth/Jan-nano-GGUF
* unsloth/Nanonets-OCR-s-GGUF
* Updated and fixed Q8_0 upload for unsloth/DeepSeek-R1-0528-Qwen3-8B-GGUF
* Added Q2_K_XL for unsloth/DeepSeek-R1-0528-GGUF
* Updated and fixed Vision support for unsloth/Llama-4-Scout-17B-16E-Instruct-GGUF
posted
an
update
7 months ago
Post
2535
Mistral releases Magistral, their new reasoning models! 🔥
GGUFs to run: unsloth/Magistral-Small-2506-GGUF
Magistral-Small-2506 excels at mathematics and coding.
You can run the 24B model locally with just 32GB RAM by using our Dynamic GGUFs.
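Reasoning models are sensitive to sampling settings; Mistral's model card suggests temperature 0.7 and top_p 0.95 for Magistral Small (worth verifying there). A sketch with llama-cpp-python; the GGUF filename is an assumption.
```python
# Run the 24B Magistral GGUF with the card's suggested sampling settings.
from llama_cpp import Llama

llm = Llama(model_path="Magistral-Small-2506-UD-Q4_K_XL.gguf", n_ctx=32768)
out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Why is sqrt(2) irrational?"}],
    temperature=0.7,  # recommended for Magistral reasoning (verify on card)
    top_p=0.95,
)
print(out["choices"][0]["message"]["content"])
```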