Learning Video LLM with Streaming Speech Transcription at Scale (CVPR 2025)
Joya Chen
chenjoya
AI & ML interests
Video LLM
Recent Activity
Upvoted a paper (about 11 hours ago): "FARMER: Flow AutoRegressive Transformer over Pixels"
Liked a dataset (6 days ago): MikhailT/lj-speech
Liked a dataset (7 days ago): zeyun-zhong/LLaVA-Video-216KQA