spatial training

community

array

authored 2 papers about 2 months ago

Sunny and Dark Outside?! Improving Answer Consistency in VQA through Entailed Question Generation

Paper • 1909.04696 • Published Sep 10, 2019

GraspMolmo: Generalizable Task-Oriented Grasping via Large-Scale Synthetic Data Generation

Paper • 2505.13441 • Published May 19, 2025 • 1

ellisbrown

authored 3 papers about 2 months ago

Cambrian-S: Towards Spatial Supersensing in Video

Paper • 2511.04670 • Published Nov 6, 2025 • 37

Benchmark Designers Should "Train on the Test Set" to Expose Exploitable Non-Visual Shortcuts

Paper • 2511.04655 • Published Nov 6, 2025 • 7

SIMS-V: Simulated Instruction-Tuning for Spatial Video Understanding

Paper • 2511.04668 • Published Nov 6, 2025 • 4

array

authored a paper about 2 months ago

SIMS-V: Simulated Instruction-Tuning for Spatial Video Understanding

Paper • 2511.04668 • Published Nov 6, 2025 • 4

ellisbrown

authored a paper 5 months ago

SAT: Dynamic Spatial Aptitude Training for Multimodal Language Models

Paper • 2412.07755 • Published Dec 10, 2024 • 2

array

authored a paper 8 months ago

SAT: Dynamic Spatial Aptitude Training for Multimodal Language Models

Paper • 2412.07755 • Published Dec 10, 2024 • 2

ellisbrown

authored a paper over 1 year ago

Cambrian-1: A Fully Open, Vision-Centric Exploration of Multimodal LLMs

Paper • 2406.16860 • Published Jun 24, 2024 • 63

ellisbrown

authored 2 papers almost 2 years ago

Your Diffusion Model is Secretly a Zero-Shot Classifier

Paper • 2303.16203 • Published Mar 28, 2023

V-IRL: Grounding Virtual Intelligence in Real Life

Paper • 2402.03310 • Published Feb 5, 2024 • 16

ellisbrown

authored a paper over 2 years ago

Internet Explorer: Targeted Representation Learning on the Open Web

Paper • 2302.14051 • Published Feb 27, 2023 • 1

array

authored a paper over 2 years ago

COLA: How to adapt vision-language models to Compose Objects Localized with Attributes?

Paper • 2305.03689 • Published May 5, 2023 • 3