This looks super useful, having fresh Wikipedia data every month will make a big difference. Thanks for building and sharing this!
Marc Lammers PRO
MarcusLammers
				AI & ML interests
The future of compute isn't linear, it is intelligent.
		Recent Activity
						
							
							
							
commented on an article · 21 days ago
Introducing Wikipedia Monthly: Fresh, Clean Wikipedia Dumps for NLP & AI Research
						
							
							
replied to omarkamali's post · 21 days ago
						
					
Another month, another Wikipedia Monthly release!
Highlights of October's edition:
· 341 languages
· 64.7M articles (+2.5%)
· 89.4GB of data (+3.3%)
We now sample a random subset of each language with reservoir sampling to produce `1000`, `5000`, and `10000` splits, in addition to the existing `train` split that contains all the data.
Now you can load the English (or your favorite language) subset in seconds:
`dataset = load_dataset("omarkamali/wikipedia-monthly", "latest.en", split="10000")`
Happy data engineering!
https://huggingface.co/datasets/omarkamali/wikipedia-monthly
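The post mentions reservoir sampling but doesn't show how it works. A minimal sketch of the classic Algorithm R, which keeps a uniform random sample of k items from a stream without knowing its length in advance — the function name and seed handling here are assumptions for illustration, not the dataset's actual code:

```python
import random

def reservoir_sample(stream, k, seed=0):
    """Uniformly sample k items from an iterable of unknown length (Algorithm R)."""
    rng = random.Random(seed)
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Replace a random slot with probability k / (i + 1),
            # which keeps every item seen so far equally likely to survive.
            j = rng.randint(0, i)
            if j < k:
                reservoir[j] = item
    return reservoir

# Example: draw a 1000-article sample from a 64.7M-article stream
# without ever holding the full stream in memory.
sample = reservoir_sample(range(100_000), 1000)
```

This single-pass property is what makes it a natural fit for producing fixed-size splits from Wikipedia dumps of very different sizes per language.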
						
						
							
							
							
commented on an article · 21 days ago
The Next Frontier: Large Language Models In Biology