Garreth Lee

garrethlee

AI & ML interests

None yet

Recent Activity

liked a Space 3 days ago

974

The Smol Training Playbook: The Secrets to Building World-Class LLMs

📝

liked a dataset about 2 months ago

HuggingFaceM4/FineVision

Viewer • Updated 13 days ago • 24.2M • 242k • 426

liked a model about 2 months ago

google/embeddinggemma-300m

liked a dataset 3 months ago

nvidia/Granary

Viewer • Updated Aug 14 • 116M • 4.27k • 153

liked a Space 6 months ago

1.7k

Dia 1.6B

👯

Generate realistic dialogue from a script, using Dia!

liked a Space 9 months ago

3.4k

The Ultra-Scale Playbook

🌌

The ultimate guide to training LLM on large GPU Clusters

liked a model 10 months ago

deepseek-ai/DeepSeek-R1

Text Generation • 685B • Updated Mar 27 • 414k • 12.8k

liked a dataset 11 months ago

HuggingFaceFW/fineweb-2

Viewer • Updated 6 days ago • 4.48B • 99k • 677

liked a Space 11 months ago

102

Number Tokenization Blog

📈

Explore how tokenization affects arithmetic in LLMs

liked a Space 12 months ago

Hub LFS Analysis

📈

An analysis of LFS files on the Hub.

liked a model 12 months ago

GoToCompany/gemma2-9b-cpt-sahabatai-v1-instruct

9B • Updated Nov 6, 2024 • 1.18k • 45

liked a Space 12 months ago

Sahabat-AI Chatbot (Gemma2 9b)

😻

Chatbot

liked 2 datasets 12 months ago

indolem/IndoMMLU

Updated Oct 11, 2023 • 118 • 18

PleIAs/common_corpus

Viewer • Updated Jun 10 • 470M • 15.1k • 314

liked 4 Spaces about 1 year ago

Scaling FineWeb to 1000+ languages: Step 1: finding signal in 100s of evaluation tasks

📝

Evaluate multilingual models using FineTasks

125

TxT360: Trillion Extracted Text

📖

Explore and utilize a large, deduplicated text dataset for LLM training

986

Model Memory Utility

🚀

Calculate vRAM needed for model training and inference

1.14k

FineWeb: decanting the web for the finest text data at scale

🍷

Generate high-quality text data for LLMs using FineWeb

liked a model over 1 year ago

mistralai/Mistral-7B-Instruct-v0.2

Text Generation • 7B • Updated Jul 24 • 3.04M • 3k