r PRO

oceansweep

AI & ML interests

None yet

Recent Activity

upvoted a paper 1 day ago

Are We on the Right Way to Assessing LLM-as-a-Judge?

liked a model 4 days ago

XiaomiMiMo/MiMo-V2-Flash

liked a model 6 days ago

YatharthS/MiraTTS

Organizations

None yet

upvoted a paper 1 day ago

Are We on the Right Way to Assessing LLM-as-a-Judge?

Paper • 2512.16041 • Published 7 days ago • 29

upvoted 2 papers 7 days ago

QwenLong-L1.5: Post-Training Recipe for Long-Context Reasoning and Memory Management

Paper • 2512.12967 • Published 9 days ago • 98

Memory in the Age of AI Agents

Paper • 2512.13564 • Published 9 days ago • 107

upvoted 2 papers 20 days ago

In-Context Representation Hijacking

Paper • 2512.03771 • Published 21 days ago • 3

Qwen3-VL Technical Report

Paper • 2511.21631 • Published 28 days ago • 134

upvoted a paper 21 days ago

DeepSeek-V3.2: Pushing the Frontier of Open Large Language Models

Paper • 2512.02556 • Published 22 days ago • 228

upvoted an article 22 days ago

Transformers v5: Simple model definitions powering the AI ecosystem

Article • 24 days ago • 252

upvoted a paper about 2 months ago

Diffusion Language Models are Super Data Learners

Paper • 2511.03276 • Published Nov 5 • 125

upvoted a collection about 2 months ago

MDGA

Collection

Make Diffusion Great Again. The resource list for Super Data Learners, Quokka, and OpenMoE 2. • 16 items • Updated Nov 4 • 8

upvoted 4 papers 2 months ago

When Models Lie, We Learn: Multilingual Span-Level Hallucination Detection with PsiloQA

Paper • 2510.04849 • Published Oct 6 • 114

upvoted 7 papers 3 months ago

UNIDOC-BENCH: A Unified Benchmark for Document-Centric Multimodal RAG

Paper • 2510.03663 • Published Oct 4 • 15

Fine-Grained Detection of Context-Grounded Hallucinations Using LLMs

Paper • 2509.22582 • Published Sep 26 • 10

Learning to Reason for Hallucination Span Detection

Paper • 2510.02173 • Published Oct 2 • 18

F2LLM Technical Report: Matching SOTA Embedding Performance with 6 Million Open-Source Data

Paper • 2510.02294 • Published Oct 2 • 44

Tree-based Dialogue Reinforced Policy Optimization for Red-Teaming Attacks

Paper • 2510.02286 • Published Oct 2 • 28

The Rogue Scalpel: Activation Steering Compromises LLM Safety

Paper • 2509.22067 • Published Sep 26 • 27

CLUE: Non-parametric Verification from Experience via Hidden-State Clustering

Paper • 2510.01591 • Published Oct 2 • 27
