LoGoPlanner: Localization Grounded Navigation Policy with Metric-aware Visual Geometry • Paper • 2512.19629 • Published Dec 22, 2025 • 26
G^2VLM: Geometry Grounded Vision Language Model with Unified 3D Reconstruction and Spatial Reasoning • Paper • 2511.21688 • Published Nov 26, 2025 • 8
OST-Bench: Evaluating the Capabilities of MLLMs in Online Spatio-temporal Scene Understanding • Paper • 2507.07984 • Published Jul 10, 2025 • 43
StreamVLN: Streaming Vision-and-Language Navigation via SlowFast Context Modeling • Paper • 2507.05240 • Published Jul 7, 2025 • 48