Mistral Small 3 (All Versions) Collection A collection of Mistral's new Small 3.2 and 3 models including GGUF, 4-bit and more! • 20 items • Updated 27 days ago • 17
Brainstorm Adapter Models - Augmented/Expanded Reasoning Collection Adapters by DavidAU: Splits apart the reasoning center(s) and multiplies them 3x, 4x, 8x, 10x, 20x, 40x+. Creativity+ / Logic+ / Detail+ / Prose+ ... • 188 items • Updated about 19 hours ago • 21
Tiny Language Model Datasets Collection Collection of synthetic datasets that can be used in pretraining any Tiny Language Model • 14 items • Updated Sep 21 • 29
ERNIE 4.5 Collection Collection of ERNIE 4.5 models. "-Paddle" models use PaddlePaddle weights, while "-PT" models use Transformer-style PyTorch weights. • 26 items • Updated Sep 24 • 174
Falcon Edge series Collection A series of powerful, universal, and fine-tunable small language models • 7 items • Updated Jul 23 • 24
Describe Anything Collection Multimodal Large Language Models for Detailed Localized Image and Video Captioning • 7 items • Updated 7 days ago • 58
Unsloth 4-bit Dynamic Quants Collection Unsloth's Dynamic 4-bit Quants selectively skip quantizing certain parameters, greatly improving accuracy while using only <10% more VRAM than BnB 4-bit • 28 items • Updated 27 days ago • 87
Breeze 2 Family Collection Llama-Breeze2 is a multimodal language model family intended specifically for Traditional Chinese use. BreezyVoice is a Taiwanese Mandarin TTS model. • 6 items • Updated Feb 26 • 19
The Breeze 2 Herd of Models: Traditional Chinese LLMs Based on Llama with Vision-Aware and Function-Calling Capabilities Paper • 2501.13921 • Published Jan 23 • 3
R1 Reproduction Works 🤔 Collection Open-source works to reproduce DeepSeek R1 • 52 items • Updated May 15 • 8
YandAI Series Collection May you not be burned away by the flames of love of Ἀφροδίτη. This is the latest model of yandere fine-tuning. • 0 items • Updated Dec 18, 2024 • 1
Cosmos-Tokenizer Collection A suite of image and video tokenizers • 13 items • Updated 7 days ago • 41
Spectra: A Comprehensive Study of Ternary, Quantized, and FP16 Language Models Paper • 2407.12327 • Published Jul 17, 2024 • 79