In a Training Loop 🔄

Arthur EDMOND

Shumatsurontek

AI & ML interests

LLM & Computer Vision

Recent Activity

updated a model 9 days ago

Shumatsurontek/Qwen3-8B-SFT

liked a Space 9 days ago

Shumatsurontek/Qwen3-8B-SFT

updated a Space 9 days ago

Shumatsurontek/Qwen3-8B-SFT

upvoted a paper 19 days ago

Kling-Omni Technical Report

Paper • 2512.16776 • Published 20 days ago • 164

upvoted an article 23 days ago

Gotchas in Tokenizer Behavior Every Developer Should Know

Article • Apr 18, 2025

upvoted an article about 1 month ago

Welcome EmbeddingGemma, Google's new efficient embedding model

Article • Sep 4, 2025 • 267

upvoted a paper about 1 month ago

From Code Foundation Models to Agents and Applications: A Practical Guide to Code Intelligence

Paper • 2511.18538 • Published Nov 23, 2025 • 282

upvoted 2 papers about 2 months ago

MiroThinker: Pushing the Performance Boundaries of Open-Source Research Agents via Model, Context, and Interactive Scaling

Paper • 2511.11793 • Published Nov 14, 2025 • 165

Tiny Model, Big Logic: Diversity-Driven Optimization Elicits Large-Model Reasoning Ability in VibeThinker-1.5B

Paper • 2511.06221 • Published Nov 9, 2025 • 132

upvoted 2 papers 2 months ago

Every Activation Boosted: Scaling General Reasoner to 1 Trillion Open Language Foundation

Paper • 2510.22115 • Published Oct 25, 2025 • 83

DeepAgent: A General Reasoning Agent with Scalable Toolsets

Paper • 2510.21618 • Published Oct 24, 2025 • 99

upvoted 2 papers 3 months ago

Paper2Video: Automatic Video Generation from Scientific Papers

Paper • 2510.05096 • Published Oct 6, 2025 • 118

Apriel-1.5-15b-Thinker

Paper • 2510.01141 • Published Oct 1, 2025 • 119

upvoted a collection 3 months ago

Granite 4.0 Language Models

Collection

13 items • Updated Nov 17, 2025 • 199

upvoted 3 papers 3 months ago

DeepSearch: Overcome the Bottleneck of Reinforcement Learning with Verifiable Rewards via Monte Carlo Tree Search

Paper • 2509.25454 • Published Sep 29, 2025 • 141

TruthRL: Incentivizing Truthful LLMs via Reinforcement Learning

Paper • 2509.25760 • Published Sep 30, 2025 • 55

EPO: Entropy-regularized Policy Optimization for LLM Agents Reinforcement Learning

Paper • 2509.22576 • Published Sep 26, 2025 • 134

upvoted an article 4 months ago

SmolLM3: smol, multilingual, long-context reasoner

Article • Jul 8, 2025 • 743

upvoted 2 papers 4 months ago

Why Language Models Hallucinate

Paper • 2509.04664 • Published Sep 4, 2025 • 195

LLaVA-Critic-R1: Your Critic Model is Secretly a Strong Policy Model

Paper • 2509.00676 • Published Aug 31, 2025 • 84
