The Dragon Hatchling: The Missing Link between the Transformer and Models of the Brain • Paper • 2509.26507 • Published Sep 30, 2025
Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism • Paper • 1909.08053 • Published Sep 17, 2019
EvalYaks: Instruction Tuning Datasets and LoRA Fine-tuned Models for Automated Scoring of CEFR B2 Speaking Assessment Transcripts • Paper • 2408.12226 • Published Aug 22, 2024
Can Small Language Models Learn, Unlearn, and Retain Noise Patterns? • Paper • 2407.00996 • Published Jul 1, 2024
anjulRajendraSharma/wav2vec2-indian-english • Automatic Speech Recognition • Updated Jan 12, 2024