MonkeyOCR v1.5 Technical Report: Unlocking Robust Document Parsing for Complex Patterns Paper • 2511.10390 • Published Nov 13
Intern-S1: A Scientific Multimodal Foundation Model Paper • 2508.15763 • Published Aug 21
SemiETS: Integrating Spatial and Content Consistencies for Semi-Supervised End-to-end Text Spotting Paper • 2504.09966 • Published Apr 14
MonkeyOCR: Document Parsing with a Structure-Recognition-Relation Triplet Paradigm Paper • 2506.05218 • Published Jun 5
R-CoT: Reverse Chain-of-Thought Problem Generation for Geometric Reasoning in Large Multimodal Models Paper • 2410.17885 • Published Oct 23, 2024