Haoyu Guo

ghy0324

AI & ML interests

None yet

Recent Activity

upvoted a paper about 5 hours ago

Tongyi DeepResearch Technical Report

liked a model about 23 hours ago

deepseek-ai/DeepSeek-OCR

upvoted a paper 2 days ago

Reasoning with Sampling: Your Base Model is Smarter Than You Think

upvoted a paper about 5 hours ago

Tongyi DeepResearch Technical Report

Paper • 2510.24701 • Published about 14 hours ago • 41

upvoted 4 papers 2 days ago

Reasoning with Sampling: Your Base Model is Smarter Than You Think

Paper • 2510.14901 • Published 13 days ago • 41

Sample By Step, Optimize By Chunk: Chunk-Level GRPO For Text-to-Image Generation

Paper • 2510.21583 • Published 5 days ago • 30

Video-As-Prompt: Unified Semantic Control for Video Generation

Paper • 2510.20888 • Published 6 days ago • 41

DeepAgent: A General Reasoning Agent with Scalable Toolsets

Paper • 2510.21618 • Published 5 days ago • 80

upvoted 2 papers 7 days ago

RL makes MLLMs see better than SFT

Paper • 2510.16333 • Published 11 days ago • 46

DeepSeek-OCR: Contexts Optical Compression

Paper • 2510.18234 • Published 8 days ago • 63

upvoted a paper 13 days ago

Generative Universal Verifier as Multimodal Meta-Reasoner

Paper • 2510.13804 • Published 14 days ago • 24

upvoted a paper 14 days ago

A Survey of Vibe Coding with Large Language Models

Paper • 2510.12399 • Published 15 days ago • 46

upvoted 4 papers 16 days ago

SpaceVista: All-Scale Visual Spatial Reasoning from mm to km

Paper • 2510.09606 • Published 19 days ago • 17

Webscale-RL: Automated Data Pipeline for Scaling RL Data to Pretraining Levels

Paper • 2510.06499 • Published 21 days ago • 31

StreamingVLM: Real-Time Understanding for Infinite Video Streams

Paper • 2510.09608 • Published 19 days ago • 49

Thinking with Camera: A Unified Multimodal Model for Camera-Centric Understanding and Generation

Paper • 2510.08673 • Published 20 days ago • 120

upvoted a paper 19 days ago

UniVideo: Unified Understanding, Generation, and Editing for Videos

Paper • 2510.08377 • Published 20 days ago • 67

upvoted 3 papers 20 days ago

Bridging Text and Video Generation: A Survey

Paper • 2510.04999 • Published 23 days ago • 3

SHANKS: Simultaneous Hearing and Thinking for Spoken Language Models

Paper • 2510.06917 • Published 21 days ago • 34

Lumina-DiMOO: An Omni Diffusion Large Language Model for Multi-Modal Generation and Understanding

Paper • 2510.06308 • Published 22 days ago • 52

upvoted 3 papers 22 days ago

It Takes Two: Your GRPO Is Secretly DPO

Paper • 2510.00977 • Published 28 days ago • 31

VChain: Chain-of-Visual-Thought for Reasoning in Video Generation

Paper • 2510.05094 • Published 23 days ago • 36

Paper2Video: Automatic Video Generation from Scientific Papers

Paper • 2510.05096 • Published 23 days ago • 107