AlpachinoNLP/u2Qwen3-4B-Instruct
Image-to-Text
Seeing the Forest and the Trees: Query-Aware Tokenizer for Long-Video Multimodal Language Models
$μ^2$Tokenizer: Differentiable Multi-Scale Multi-Modal Tokenizer for Radiology Report Generation