MultiVerse: A Multi-Turn Conversation Benchmark for Evaluating Large Vision and Language Models Paper • 2510.16641 • Published 11 days ago • 4
When to Ensemble: Identifying Token-Level Points for Stable and Fast LLM Ensembling Paper • 2510.15346 • Published 12 days ago • 32
Map the Flow: Revealing Hidden Pathways of Information in VideoLLMs Paper • 2510.13251 • Published 14 days ago • 12
Unified Reinforcement and Imitation Learning for Vision-Language Models Paper • 2510.19307 • Published 7 days ago • 25
Are they lovers or friends? Evaluating LLMs' Social Reasoning in English and Korean Dialogues Paper • 2510.19028 • Published 8 days ago • 6
Fish-Speech: Leveraging Large Language Models for Advanced Multilingual Text-to-Speech Synthesis Paper • 2411.01156 • Published Nov 2, 2024 • 11
Latent Diffusion Model without Variational Autoencoder Paper • 2510.15301 • Published 12 days ago • 47
PIG: Physics-Informed Gaussians as Adaptive Parametric Mesh Representations Paper • 2412.05994 • Published Dec 8, 2024 • 19
Underwater SONAR Image Classification and Analysis using LIME-based Explainable Artificial Intelligence Paper • 2408.12837 • Published Aug 23, 2024 • 1
Synth-SONAR: Sonar Image Synthesis with Enhanced Diversity and Realism via Dual Diffusion Models and GPT Prompting Paper • 2410.08612 • Published Oct 11, 2024 • 2
FlashWorld: High-quality 3D Scene Generation within Seconds Paper • 2510.13678 • Published 14 days ago • 69
TokDrift: When LLM Speaks in Subwords but Code Speaks in Grammar Paper • 2510.14972 • Published 13 days ago • 29
Large Language Models Do NOT Really Know What They Don't Know Paper • 2510.09033 • Published 19 days ago • 16
Attention Is All You Need for KV Cache in Diffusion LLMs Paper • 2510.14973 • Published 13 days ago • 36
Diffusion Transformers with Representation Autoencoders Paper • 2510.11690 • Published 16 days ago • 160