MolmoAct: Action Reasoning Models that can Reason in Space Paper • 2508.07917 • Published Aug 11 • 43
Scaling Text-Rich Image Understanding via Code-Guided Synthetic Multimodal Data Generation Paper • 2502.14846 • Published Feb 20 • 14
Articulate-Anything: Automatic Modeling of Articulated Objects via a Vision-Language Foundation Model Paper • 2410.13882 • Published Oct 3, 2024
MiRAGeNews: Multimodal Realistic AI-Generated News Detection Paper • 2410.09045 • Published Oct 11, 2024 • 4
Scaling Text-Rich Image Understanding via Code-Guided Synthetic Multimodal Data Generation Paper • 2502.14846 • Published Feb 20 • 14