FOCUS: Effective Embedding Initialization for Specializing Pretrained Multilingual Models on a Single Language • Paper • 2305.14481 • Published May 23, 2023
AweDist: Attention-aware Embedding Distillation for New Input Token Embeddings • Paper • 2505.20133 • Published May 26, 2025
I Don't Know: Explicit Modeling of Uncertainty with an [IDK] Token • Paper • 2412.06676 • Published Dec 9, 2024