Charlie

Charliezyl

AI & ML interests

None yet

Organizations

None yet

Recent Activity

upvoted a paper about 21 hours ago

Rethinking Composed Image Retrieval Evaluation: A Fine-Grained Benchmark from Image Editing

Paper • 2601.16125 • Published 1 day ago • 13

upvoted 2 papers 6 months ago

CellForge: Agentic Design of Virtual Cell Models

Paper • 2508.02276 • Published Aug 4, 2025 • 39

Can Multimodal Foundation Models Understand Schematic Diagrams? An Empirical Study on Information-Seeking QA over Scientific Papers

Paper • 2507.10787 • Published Jul 14, 2025 • 13

upvoted 5 papers 7 months ago

Efficiency-Effectiveness Reranking FLOPs for LLM-based Rerankers

Paper • 2507.06223 • Published Jul 8, 2025 • 14

Agent KB: Leveraging Cross-Domain Experience for Agentic Problem Solving

Paper • 2507.06229 • Published Jul 8, 2025 • 76

SciArena: An Open Evaluation Platform for Foundation Models in Scientific Literature Tasks

Paper • 2507.01001 • Published Jul 1, 2025 • 46

Can LLMs Generate High-Quality Test Cases for Algorithm Problems? TestCase-Eval: A Systematic Evaluation of Fault Coverage and Exposure

Paper • 2506.12278 • Published Jun 13, 2025 • 16

MultiFinBen: A Multilingual, Multimodal, and Difficulty-Aware Benchmark for Financial LLM Evaluation

Paper • 2506.14028 • Published Jun 16, 2025 • 93

upvoted 5 papers 8 months ago

ZeroGUI: Automating Online GUI Learning at Zero Human Cost

Paper • 2505.23762 • Published May 29, 2025 • 45

The Climb Carves Wisdom Deeper Than the Summit: On the Noisy Rewards in Learning to Reason

Paper • 2505.22653 • Published May 28, 2025 • 43

FAMA: The First Large-Scale Open-Science Speech Foundation Model for English and Italian

Paper • 2505.22759 • Published May 28, 2025 • 19

Spatial-MLLM: Boosting MLLM Capabilities in Visual-based Spatial Intelligence

Paper • 2505.23747 • Published May 29, 2025 • 69

Diffusion vs. Autoregressive Language Models: A Text Embedding Perspective

Paper • 2505.15045 • Published May 21, 2025 • 54

upvoted 4 papers 10 months ago

Any2Caption: Interpreting Any Condition to Caption for Controllable Video Generation

Paper • 2503.24379 • Published Mar 31, 2025 • 76

Z1: Efficient Test-time Scaling with Code

Paper • 2504.00810 • Published Apr 1, 2025 • 26

AdaptiVocab: Enhancing LLM Efficiency in Focused Domains through Lightweight Vocabulary Adaptation

Paper • 2503.19693 • Published Mar 25, 2025 • 76

PHYSICS: Benchmarking Foundation Models on University-Level Physics Problem Solving

Paper • 2503.21821 • Published Mar 26, 2025 • 21

upvoted a paper 11 months ago

IFIR: A Comprehensive Benchmark for Evaluating Instruction-Following in Expert-Domain Information Retrieval

Paper • 2503.04644 • Published Mar 6, 2025 • 21

upvoted a paper about 1 year ago

MMVU: Measuring Expert-Level Multi-Discipline Video Understanding

Paper • 2501.12380 • Published Jan 21, 2025 • 84