Mistral Small 4 • Collection • A state-of-the-art open-weight model with a granular Mixture-of-Experts architecture that fuses instruct, reasoning, and agentic skills. • 3 items • Updated 21 days ago
Pipette: Automatic Fine-grained Large Language Model Training Configurator for Real-World Clusters • Paper • 2405.18093 • Published May 28, 2024