Lost in Backpropagation: The LM Head is a Gradient Bottleneck
Paper • 2603.10145 • Published
NLP, Digital Humanities
Gaperon: A Peppered English-French Generative Language Model Suite
LLM Reasoning for Machine Translation: Synthetic Data Generation over Thinking Tokens