0.7 TFLOPS
Matricardi Fabio
FM-1976
18 followers · 99 following
https://medium.com/@fabio.matricardi
ThePoorGpuGuy
fabiomatricardi
AI & ML interests
Control system engineering, AI, and LLMs with Python. ThePoorGPUguy on Substack.
Recent Activity
liked a model, 5 days ago
Tiiny/SmallThinker-3B-Preview
liked a model, 9 days ago
LiquidAI/LFM2-2.6B-Exp
reacted to codelion's post with 🚀, 9 days ago
Introducing Dhara-70M: A diffusion language model that achieves 3.8x higher throughput than autoregressive models!
Key findings from our research on optimal architectures for small language models:
→ Depth beats width: 32 layers outperforms 12 layers at the same parameter count
→ Best-in-class factuality: 47.5% on TruthfulQA
→ 10x training efficiency using WSD (Warmup-Stable-Decay) conversion
→ Canon layers add only 0.13% parameters but improve reasoning
We trained on 1B tokens using the optimal 50-30-20 dataset mix (PDFs + filtered web + educational content), then converted to diffusion with just 100M additional tokens.
Blog: https://huggingface.co/blog/codelion/optimal-model-architecture
Model: https://huggingface.co/codelion/dhara-70m
Organizations
None yet
FM-1976's activity
liked a model, 5 days ago
Tiiny/SmallThinker-3B-Preview • Text Generation • 3B • Updated Jan 16, 2025 • 21.5k • 416
liked 2 models, 9 days ago
LiquidAI/LFM2-2.6B-Exp • Text Generation • 3B • Updated 9 days ago • 10.1k • 303
codelion/dhara-70m • Text Generation • 71.3M • Updated 5 days ago • 3.37k • 30
liked 2 models, 12 days ago
google/t5gemma-2-1b-1b • Image-Text-to-Text • 2B • Updated 17 days ago • 10.4k • 61
facebook/sam-audio-small • Updated 5 days ago • 8.63k • 62
liked a model, 22 days ago
hitonet/hito-1.7b • Text Generation • 2B • Updated 24 days ago • 425 • 7
liked 4 models, 23 days ago
ByteDance-Seed/Seed-X-PPO-7B • Translation • Updated Jul 28, 2025 • 14.8k • 286
NeuML/bert-hash-pico • Updated Oct 9, 2025 • 20 • 3
liu-nlp/hyperllama-180m-multilingual-1x • Text Generation • 0.2B • Updated 23 days ago • 51 • 1
TitleOS/Lightning-1.7B • Text Generation • 2B • Updated 25 days ago • 56 • 3
liked 2 models, 26 days ago
jhu-clsp/ettin-decoder-68m • Fill-Mask • Updated Jul 16, 2025 • 68 • 1
jhu-clsp/ettin-encoder-17m • Fill-Mask • Updated Jul 16, 2025 • 1.25k • 11
liked 5 models, 28 days ago
nvidia/parakeet-tdt-0.6b-v3 • Automatic Speech Recognition • Updated Nov 27, 2025 • 52k • 506
UsefulSensors/moonshine • Automatic Speech Recognition • Updated Nov 30, 2025 • 1 • 86
shoumenchougou/RWKV7-G1a-0.1B-GGUF • 0.2B • Updated Oct 16, 2025 • 232 • 3
shoumenchougou/RWKV7-G1b-1.5B-GGUF • 2B • Updated Dec 4, 2025 • 112 • 1
onnx-community/ettin-encoder-32m-ONNX • Fill-Mask • Updated 28 days ago • 21 • 1
liked 3 models, 29 days ago
LucidityAI/Astral-0.6B-Flash-Coder • 0.6B • Updated Oct 5, 2025 • 13 • 1
keras/moonshine_tiny_en • Updated Jun 17, 2025 • 11 • 1
mradermacher/aquif-3.5-Nano-1B-GGUF • 2B • Updated Dec 2, 2025 • 332 • 1