Are We Ready for RL in Text-to-3D Generation? A Progressive Investigation Paper • 2512.10949 • Published 17 days ago • 45
EmbodiedOneVision: Interleaved Vision-Text-Action Pretraining for General Robot Control Paper • 2508.21112 • Published Aug 28 • 77
ViStoryBench: Comprehensive Benchmark Suite for Story Visualization Paper • 2505.24862 • Published May 30 • 30
Exploring the Potential of Encoder-free Architectures in 3D LMMs Paper • 2502.09620 • Published Feb 13 • 26
SpatialVLA: Exploring Spatial Representations for Visual-Language-Action Model Paper • 2501.15830 • Published Jan 27 • 13