DeepSeek R1 (All Versions) Collection DeepSeek-R1-0528 is here! The most powerful reasoning open LLM, available in GGUF, original & 4-bit formats. Includes Llama & Qwen distilled models. • 37 items • Updated 4 days ago • 260
RecurrentGPT: Interactive Generation of (Arbitrarily) Long Text Paper • 2305.13304 • Published May 22, 2023 • 2
cyberagent/Llama-3.1-70B-Japanese-Instruct-2407 Text Generation • 71B • Updated Jul 26, 2024 • 404 • 77
Article Estimating Memory Consumption of LLMs for Inference and Fine-Tuning for Cohere Command-R+ Apr 26, 2024 • 13