InSight-o3: Empowering Multimodal Foundation Models with Generalized Visual Search • Paper • 2512.18745 • Published 10 days ago • 10
LongVideoAgent: Multi-Agent Reasoning with Long Videos • Paper • 2512.20618 • Published 8 days ago • 52
3DGS-DET: Empower 3D Gaussian Splatting with Boundary Guidance and Box-Focused Sampling for 3D Object Detection • Paper • 2410.01647 • Published Oct 2, 2024 • 31