This is a fine-tuned version of Qwen 3 4B, optimized using the agentlans/multilingual-sft dataset to improve performance across 100+ languages and dialects.
Compared to the original Qwen 3 4B, this model focuses on clear, concise outputs, minimizing verbose reasoning. It's designed as a compact, multilingual alternative similar in behaviour to the Aya models.
Dataset: agentlans/multilingual-sft
LoRA: rank=32, alpha=64, dropout=0.3
Learning rate: 5e-5
Batch size: 1 (with gradient accumulation for an effective batch size of 8)
Optimizer: betas=(0.9, 0.999), epsilon=1e-8
Framework versions: peft==0.15.1, transformers==4.51.3, torch==2.6.0+cu124, datasets==3.5.0, tokenizers==0.21.1

This model is released under the Apache 2.0 License.
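As a rough illustration of what the training values above imply, the sketch below derives two quantities from them: the LoRA scaling factor (alpha divided by rank, as in standard LoRA) and the number of gradient accumulation steps needed to reach the effective batch size. The `TrainingSetup` class and its field names are hypothetical, introduced only for this example, and the single-device default is an assumption, since the card does not state how many devices were used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingSetup:
    """Hypothetical container for values listed in this card."""

    lora_rank: int = 32
    lora_alpha: int = 64
    per_device_batch_size: int = 1
    effective_batch_size: int = 8
    num_devices: int = 1  # assumption: device count is not stated in the card

    @property
    def lora_scaling(self) -> float:
        # Standard LoRA multiplies the low-rank update by alpha / rank.
        return self.lora_alpha / self.lora_rank

    @property
    def gradient_accumulation_steps(self) -> int:
        # Samples processed per optimizer micro-step across all devices.
        per_step = self.per_device_batch_size * self.num_devices
        if self.effective_batch_size % per_step != 0:
            raise ValueError("effective batch size must be a multiple of the per-step batch")
        return self.effective_batch_size // per_step


setup = TrainingSetup()
print("LoRA scaling (alpha / rank):", setup.lora_scaling)
print("Gradient accumulation steps:", setup.gradient_accumulation_steps)
```

Under the single-device assumption, a per-device batch size of 1 needs 8 accumulation steps to reach the stated effective batch size of 8. With more devices the step count would shrink proportionally.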