Teaching Old Tokenizers New Words: Efficient Tokenizer Adaptation for Pre-trained Models Paper • 2512.03989 • Published Dec 3, 2025
EstLLM: Enhancing Estonian Capabilities in Multilingual LLMs via Continued Pretraining and Post-Training Paper • 2603.02041 • Published 11 days ago
Teaching Llama a New Language Through Cross-Lingual Knowledge Transfer Paper • 2404.04042 • Published Apr 5, 2024
LLMs for Extremely Low-Resource Finno-Ugric Languages Paper • 2410.18902 • Published Oct 24, 2024