Too Good to be Bad: On the Failure of LLMs to Role-Play Villains Paper • 2511.04962 • Published Nov 7, 2025 • 54
The Dawn of GUI Agent: A Preliminary Case Study with Claude 3.5 Computer Use Paper • 2411.10323 • Published Nov 15, 2024 • 34
Aguvis: Unified Pure Vision Agents for Autonomous GUI Interaction Paper • 2412.04454 • Published Dec 5, 2024 • 71
Ponder & Press: Advancing Visual GUI Agent towards General Computer Control Paper • 2412.01268 • Published Dec 2, 2024 • 1
UI Agent Collection A collection of algorithmic agents for user interfaces/interactions, program synthesis, and robotics • 438 items • Updated 23 days ago • 66
SmolLM2 Collection State-of-the-art compact LLMs for on-device applications: 1.7B, 360M, 135M • 16 items • Updated May 5, 2025 • 297
Models Used in HackerNoon Publishing System Collection HackerNoon.com’s content management system empowers a small team to manage tens of thousands of writers, advertisers, & millions of readers 🙏 🤖 🙏🤖 • 16 items • Updated Jan 23, 2025 • 21
Article Train custom AI models with the trainer API and adapt them to 🤗 Jun 29, 2024 • 32
Imp: Highly Capable Large Multimodal Models for Mobile Devices Paper • 2405.12107 • Published May 20, 2024 • 29
OpenCulture Collection A multilingual dataset of public domain books and newspapers. • 27 items • Updated Nov 6, 2024 • 132
MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training Paper • 2403.09611 • Published Mar 14, 2024 • 129
Design2Code: How Far Are We From Automating Front-End Engineering? Paper • 2403.03163 • Published Mar 5, 2024 • 98
Finetuned Multimodal Language Models Are High-Quality Image-Text Data Filters Paper • 2403.02677 • Published Mar 5, 2024 • 18