Matricardi Fabio

FM-1976

https://medium.com/@fabio.matricardi

AI & ML interests

Control system engineering, AI, and LLMs with Python. ThePoorGPUguy on Substack.

Recent Activity

reacted to codelion's post with 🚀 9 days ago

Introducing Dhara-70M: A diffusion language model that achieves 3.8x higher throughput than autoregressive models!

Key findings from our research on optimal architectures for small language models:

→ Depth beats width: 32 layers outperforms 12 layers at the same parameter count
→ Best-in-class factuality: 47.5% on TruthfulQA
→ 10x training efficiency using WSD (Warmup-Stable-Decay) conversion
→ Canon layers add only 0.13% parameters but improve reasoning

We trained on 1B tokens using the optimal 50-30-20 dataset mix (PDFs + filtered web + educational content), then converted to diffusion with just 100M additional tokens.

Blog: https://huggingface.co/blog/codelion/optimal-model-architecture
Model: https://huggingface.co/codelion/dhara-70m

Organizations

None yet

liked a model 5 days ago

Tiiny/SmallThinker-3B-Preview

Text Generation • 3B • Updated Jan 16, 2025 • 21.5k • 416

liked 2 models 9 days ago

LiquidAI/LFM2-2.6B-Exp

Text Generation • 3B • Updated 9 days ago • 10.1k • 303

codelion/dhara-70m

Text Generation • 71.3M • Updated 5 days ago • 3.37k • 30

liked 2 models 12 days ago

google/t5gemma-2-1b-1b

Image-Text-to-Text • 2B • Updated 17 days ago • 10.4k • 61

facebook/sam-audio-small

Updated 5 days ago • 8.63k • 62

liked a model 22 days ago

hitonet/hito-1.7b

Text Generation • 2B • Updated 24 days ago • 425 • 7

liked 4 models 23 days ago

ByteDance-Seed/Seed-X-PPO-7B

Translation • Updated Jul 28, 2025 • 14.8k • 286

NeuML/bert-hash-pico

Updated Oct 9, 2025 • 20 • 3

liu-nlp/hyperllama-180m-multilingual-1x

Text Generation • 0.2B • Updated 23 days ago • 51 • 1

TitleOS/Lightning-1.7B

Text Generation • 2B • Updated 25 days ago • 56 • 3

liked 2 models 26 days ago

jhu-clsp/ettin-decoder-68m

Fill-Mask • Updated Jul 16, 2025 • 68 • 1

jhu-clsp/ettin-encoder-17m

Fill-Mask • Updated Jul 16, 2025 • 1.25k • 11

liked 5 models 28 days ago

nvidia/parakeet-tdt-0.6b-v3

Automatic Speech Recognition • Updated Nov 27, 2025 • 52k • 506

UsefulSensors/moonshine

Automatic Speech Recognition • Updated Nov 30, 2025 • 1 • 86

shoumenchougou/RWKV7-G1a-0.1B-GGUF

0.2B • Updated Oct 16, 2025 • 232 • 3

shoumenchougou/RWKV7-G1b-1.5B-GGUF

2B • Updated Dec 4, 2025 • 112 • 1

onnx-community/ettin-encoder-32m-ONNX

Fill-Mask • Updated 28 days ago • 21 • 1

liked 3 models 29 days ago

LucidityAI/Astral-0.6B-Flash-Coder

0.6B • Updated Oct 5, 2025 • 13 • 1

keras/moonshine_tiny_en

Updated Jun 17, 2025 • 11 • 1

mradermacher/aquif-3.5-Nano-1B-GGUF

2B • Updated Dec 2, 2025 • 332 • 1