Part of the "Load 4bit models 4x faster" collection: native bitsandbytes 4bit pre-quantized models.
Directly quantized 4bit model with bitsandbytes.
Original source: https://huggingface.co/alpindale/Mistral-7B-v0.2-hf/tree/main was used to create the 4bit quantized version.
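If you just want to pull the 4bit weights directly (outside the notebooks), here is a minimal loading sketch with plain transformers. The repo id is an assumption for illustration, and bitsandbytes plus a CUDA GPU are required:

```python
# Minimal loading sketch. The repo id below is an assumption based on this card;
# swap in the exact model you want. The quantization config ships with the
# checkpoint, so the weights load in 4bit automatically (bitsandbytes required).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "unsloth/mistral-7b-v0.2-bnb-4bit"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
```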
We have a Google Colab Tesla T4 notebook for Mistral 7b v2 (32K context length) here: https://colab.research.google.com/drive/1Fa8QVleamfNELceNM9n7SeAGr_hT5XIn?usp=sharing
All notebooks are beginner friendly! Add your dataset, click "Run All", and you'll get a 2x faster finetuned model which you can export to GGUF or vLLM, or upload to Hugging Face. A rough sketch of what the notebooks do is shown below.
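Roughly, the notebooks load a 4bit base model, attach LoRA adapters, and run supervised finetuning with trl. The sketch below is a hedged outline of that flow, not the notebooks' exact code: the repo id, dataset file, and hyperparameters are placeholders.

```python
# Hedged outline of the notebook flow: 4bit base model + LoRA + SFT via trl.
from unsloth import FastLanguageModel
from trl import SFTTrainer
from transformers import TrainingArguments
from datasets import load_dataset

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/mistral-7b-v0.2-bnb-4bit",  # assumed repo id
    max_seq_length=2048,
    load_in_4bit=True,
)
# Attach LoRA adapters so only a small set of extra weights is trained.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    lora_alpha=16,
)

# Placeholder dataset: any dataset with a "text" column works here.
dataset = load_dataset("json", data_files="your_data.json", split="train")

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",
    max_seq_length=2048,
    args=TrainingArguments(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        max_steps=60,               # illustrative values, not the notebook defaults
        learning_rate=2e-4,
        fp16=True,
        output_dir="outputs",
    ),
)
trainer.train()
```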
| Unsloth supports | Free Notebooks | Performance | Memory use | 
|---|---|---|---|
| Gemma 7b | ▶️ Start on Colab | 2.4x faster | 58% less |
| Mistral 7b | ▶️ Start on Colab | 2.2x faster | 62% less |
| Llama-2 7b | ▶️ Start on Colab | 2.2x faster | 43% less |
| TinyLlama | ▶️ Start on Colab | 3.9x faster | 74% less |
| CodeLlama 34b A100 | ▶️ Start on Colab | 1.9x faster | 27% less |
| Mistral 7b 1xT4 | ▶️ Start on Kaggle | 5x faster* | 62% less |
| DPO - Zephyr | ▶️ Start on Colab | 1.9x faster | 19% less |