AI & ML interests

The AI community building the future.

Recent Activity

sayakpaul  updated a dataset about 9 hours ago
huggingface/diffusers-metadata
CharlieBoyer  updated a Space about 16 hours ago
huggingface/okta-saml-integration-guide

Articles

evalstate 
posted an update about 9 hours ago
Hugging Face MCP Server v0.2.35
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

$HF_TOKEN is expanded in Jobs Secrets environment variables.
AdinaY 
posted an update about 10 hours ago
Ming-flash-omni Preview 🚀 Multimodal foundation model from Ant Group

inclusionAI/Ming-flash-omni-Preview

✨ Built on Ling-Flash-2.0: 100B total/6B active
✨ Generative segmentation-as-editing
✨ SOTA contextual & dialect ASR
✨ High-fidelity image generation
AdinaY 
posted an update about 12 hours ago

Glyph 🔥 a framework that scales context length by compressing text into images and processing them with vision–language models, released by Z.ai.

Paper: https://huggingface.co/papers/2510.17800
Model: https://huggingface.co/zai-org/Glyph

✨ Compresses long sequences visually to bypass token limits
✨ Reduces computational and memory costs
✨ Preserves meaning through multimodal encoding
✨ Built on GLM-4.1V-9B-Base
AdinaY 
posted an update 5 days ago
HunyuanWorld Mirror 🔥 a versatile feed-forward model for universal 3D world reconstruction by Tencent

tencent/HunyuanWorld-Mirror

✨ Any prior in → 3D world out
✨ Mix camera, intrinsics, depth as priors
✨ Predict point clouds, normals, Gaussians & more in one pass
✨ Unified architecture for all 3D tasks
evalstate 
posted an update 6 days ago
Hugging Face MCP Server v0.2.33
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Allow discovery of Product Documentation Library via the Search tool.
andito 
posted an update 6 days ago
Finally, our new paper is out! "𝗙𝗶𝗻𝗲𝗩𝗶𝘀𝗶𝗼𝗻: 𝗢𝗽𝗲𝗻 𝗗𝗮𝘁𝗮 𝗜𝘀 𝗔𝗹𝗹 𝗬𝗼𝘂 𝗡𝗲𝗲𝗱"! 🥳
FineVision: Open Data Is All You Need (2510.17269)

If you've ever trained a VLM, you know this problem: nobody shares their data mixtures. They are a black box, which makes replicating SOTA work impossible.
We wanted to change that.

FineVision unifies 200 sources into 24 million samples. With 17.3 million images and 9.5 billion answer tokens, it's the largest open resource of its kind.

In the paper, we share how we built it:
🔍 finding and cleaning data at scale
🧹 removing excessive duplicates across sources
🤗 decontaminating against 66 public benchmarks

My favorite part is Figure 6 (in the video!). It's our visual diversity analysis. It shows that FineVision isn't just bigger; it's more balanced and conceptually richer than other open datasets.
NVIDIA's Eagle 2 paper highlighted just how critical this visual diversity is, and our results confirm it: models trained on FineVision consistently outperform those trained on any other open dataset on 11 benchmarks!

🎉 To celebrate the paper, I’m also releasing a concatenated and shuffled version of the full dataset! 👉 HuggingFaceM4/FineVision_full_shuffled

It’s ready to stream, so you can start training your own models right away:

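from datasets import load_dataset

# streaming=True reads samples lazily instead of downloading the full dataset first
d = load_dataset("HuggingFaceM4/FineVision_full_shuffled", split="train", streaming=True)
# Fetch and print the first sample from the stream
print(next(iter(d)))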
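
If you want to look at more than one sample without walking the whole stream, a bounded peek is enough. Here is a minimal sketch, assuming only that the stream behaves like a regular Python iterable. The helper `peek` and the stand-in generator `fake_stream` (with its made-up `id` field) are hypothetical names used so the snippet runs offline; they are not part of the dataset or of the `datasets` library.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def peek(stream: Iterable[T], n: int) -> List[T]:
    """Return at most the first n items of an iterable, reading nothing beyond them."""
    return list(islice(stream, n))


def fake_stream() -> Iterator[dict]:
    # Hypothetical stand-in for a streaming dataset: an endless generator of dicts.
    i = 0
    while True:
        yield {"id": i}
        i += 1


first_three = peek(fake_stream(), 3)
print(first_three)
```

With the real dataset, the same helper could be called as `peek(d, 3)`. Streaming datasets in the `datasets` library also offer a `take` method that serves a similar purpose.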

A big shoutout to the first authors: Luis Wiedmann and Orr Zohar. They are rockstars!
AdinaY 
posted an update 10 days ago
PaddleOCR VL🔥 0.9B Multilingual VLM by Baidu

PaddlePaddle/PaddleOCR-VL

✨ Ultra-efficient NaViT + ERNIE-4.5 architecture
✨ Supports 109 languages 🤯
✨ Accurately recognizes text, tables, formulas & charts
✨ Fast inference and lightweight for deployment
multimodalart 
posted an update 12 days ago
Want to iterate on a Hugging Face Space with an LLM?

Now you can easily convert any entire HF repo (Model, Dataset, or Space) into a text file and feed it to a language model!

multimodalart/repo2txt
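
For a rough idea of what "repo to text" can look like, here is a minimal local sketch. It is not how the repo2txt Space is implemented; it assumes the repo has already been downloaded to a folder, and the function name `repo_to_text`, the suffix allow-list, the size cap, and the `=== path ===` separator are all illustrative choices.

```python
import tempfile
from pathlib import Path

# Hypothetical allow-list of file types treated as text.
TEXT_SUFFIXES = {".py", ".md", ".txt", ".json", ".yaml", ".yml", ".toml"}


def repo_to_text(root: Path, max_bytes: int = 100_000) -> str:
    """Concatenate the text files under root into one string with path headers."""
    parts = []
    for path in sorted(root.rglob("*")):
        # Skip directories and anything that is not on the allow-list.
        if not path.is_file() or path.suffix.lower() not in TEXT_SUFFIXES:
            continue
        # Skip very large files so the result stays small enough to paste into a prompt.
        if path.stat().st_size > max_bytes:
            continue
        rel = path.relative_to(root).as_posix()
        body = path.read_text(encoding="utf-8", errors="replace")
        parts.append(f"=== {rel} ===\n{body}\n")
    return "\n".join(parts)


# Demo on a throwaway folder that stands in for a downloaded repo.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp)
    (demo / "app.py").write_text("print('hi')\n", encoding="utf-8")
    (demo / "README.md").write_text("# Demo\n", encoding="utf-8")
    (demo / "weights.bin").write_bytes(b"\x00\x01")
    dump = repo_to_text(demo)
    print(dump)
```

To try this on an actual repo, one option is to fetch it first with `snapshot_download` from `huggingface_hub` and pass the resulting folder as `root`.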
evalstate 
posted an update 13 days ago
Hugging Face MCP Server v0.2.31
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

- OpenAI Apps SDK Support for Gradio Content Generation spaces
AdinaY 
posted an update 14 days ago
Ring-1T 🔥 the trillion-parameter thinking model released by Ant Group, the company behind Alipay

inclusionAI/Ring-1T

✨ 1T params (50B active) - MIT license
✨ 128K context (YaRN)
✨ RLVR, Icepop, and ASystem make trillion-scale RL stable
AdinaY 
posted an update 18 days ago
KAT-Dev-72B-Exp 🔥 Kuaishou's (the company behind Kling AI) new open model for software engineering

Kwaipilot/KAT-Dev-72B-Exp

✨ 72B - Apache 2.0
✨ Redesigned attention kernel & training engine for efficient context-aware RL
✨ 74.6% accuracy on SWE-Bench Verified
giadap 
posted an update 18 days ago
🌎 AI ethics and sustainability are two sides of the same coin.

In our new blog post with Dr. Sasha Luccioni, we argue that separating them (as is too often the case) means missing the bigger picture of how AI systems impact both people and the planet.

Ethical and sustainable AI development can’t be pursued in isolation. The same choices that affect who benefits or is harmed by AI systems also determine how much energy and resources they consume.

We explore how two key concepts, evaluation and transparency, can serve as bridges between these domains:

📊 Evaluation, by moving beyond accuracy or performance metrics to include environmental and social costs, as we’ve done with tools like the AI Energy Score.

🔍 Transparency, by enabling reproducibility, accountability, and environmental reporting through open tools like the Environmental Transparency Space.

AI systems mirror our priorities. If we separate ethics from sustainability, we risk building technologies that are efficient but unjust, or fair but unsustainable.

Read our blog post here: https://huggingface.co/blog/sasha/ethics-sustainability

AIEnergyScore/Leaderboard
sasha/environmental-transparency