yanpeng_sun

syp115

AI & ML interests

None yet

Recent Activity

upvoted a paper 18 days ago

Towards Better Dental AI: A Multimodal Benchmark and Instruction Dataset for Panoramic X-ray Analysis

authored a paper 28 days ago

Artemis: Structured Visual Reasoning for Perception Policy Learning

upvoted a paper 28 days ago

Visual Position Prompt for MLLM based Visual Grounding

Organizations

None yet

authored a paper 28 days ago

Artemis: Structured Visual Reasoning for Perception Policy Learning

Paper • 2512.01988 • Published 30 days ago • 1

authored 8 papers about 1 month ago

Improving Multi-modal Large Language Model through Boosting Vision Capabilities

Paper • 2410.13733 • Published Oct 17, 2024

Open Eyes, Then Reason: Fine-grained Visual Mathematical Understanding in MLLMs

Paper • 2501.06430 • Published Jan 11

MATHGLANCE: Multimodal Large Language Models Do Not Know Where to Look in Mathematical Diagrams

Paper • 2503.20745 • Published Mar 26 • 1

Visual Position Prompt for MLLM based Visual Grounding

Paper • 2503.15426 • Published Mar 19 • 2

DeRIS: Decoupling Perception and Cognition for Enhanced Referring Image Segmentation through Loopback Synergy

Paper • 2507.01738 • Published Jul 2

Towards Better Dental AI: A Multimodal Benchmark and Instruction Dataset for Panoramic X-ray Analysis

Paper • 2509.09254 • Published Sep 11 • 6

FullAnno: A Data Engine for Enhancing Image Comprehension of MLLMs

Paper • 2409.13540 • Published Sep 20, 2024

Agentic Learner with Grow-and-Refine Multimodal Semantic Memory

Paper • 2511.21678 • Published Nov 26 • 12

authored a paper about 1 year ago

Descriptive Caption Enhancement with Visual Specialists for Multimodal Perception

Paper • 2412.14233 • Published Dec 18, 2024 • 6

authored a paper over 1 year ago

CSGO: Content-Style Composition in Text-to-Image Generation

Paper • 2408.16766 • Published Aug 29, 2024 • 18