Next-Embedding Prediction Makes Strong Vision Learners Paper • 2512.16922 • Published 9 days ago • 79
GoRL: An Algorithm-Agnostic Framework for Online Reinforcement Learning with Generative Policies Paper • 2512.02581 • Published 26 days ago • 14
Does Understanding Inform Generation in Unified Multimodal Models? From Analysis to Path Forward Paper • 2511.20561 • Published Nov 25 • 31
RoboMemory: A Brain-inspired Multi-memory Agentic Framework for Lifelong Learning in Physical Embodied Systems Paper • 2508.01415 • Published Aug 2 • 7
ThinkAct: Vision-Language-Action Reasoning via Reinforced Visual Latent Planning Paper • 2507.16815 • Published Jul 22 • 40
SmolVLA: Efficient Vision-Language-Action Model trained on Lerobot Community Data Article • Published Jun 3 • 299
RoboBrain: A Unified Brain Model for Robotic Manipulation from Abstract to Concrete Paper • 2502.21257 • Published Feb 28 • 2
InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models Paper • 2504.10479 • Published Apr 14 • 306
Advances and Challenges in Foundation Agents: From Brain-Inspired Intelligence to Evolutionary, Collaborative, and Safe Systems Paper • 2504.01990 • Published Mar 31 • 301
CLEA: Closed-Loop Embodied Agent for Enhancing Task Execution in Dynamic Environments Paper • 2503.00729 • Published Mar 2 • 3
Phi-4-Mini Technical Report: Compact yet Powerful Multimodal Language Models via Mixture-of-LoRAs Paper • 2503.01743 • Published Mar 3 • 89
DexTrack: Towards Generalizable Neural Tracking Control for Dexterous Manipulation from Human References Paper • 2502.09614 • Published Feb 13 • 9
STMA: A Spatio-Temporal Memory Agent for Long-Horizon Embodied Task Planning Paper • 2502.10177 • Published Feb 14 • 6
Qwen2.5-VL Collection Vision-language model series based on Qwen2.5 • 11 items • Updated Jul 21 • 550
MSI-Agent: Incorporating Multi-Scale Insight into Embodied Agents for Superior Planning and Decision-Making Paper • 2409.16686 • Published Sep 25, 2024 • 10