TACLer-1.5B

We release TACLer-1.5B (🤗 HF Model), a hybrid reasoning model that supports both Thinking and NoThinking modes! We propose a model-tailored curriculum reinforcement learning framework that gradually increases data complexity based on the model's proficiency during multi-stage RL training.
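As a minimal usage sketch of the two modes: the helper below pre-fills an empty `<think></think>` block to suppress chain-of-thought. Note this prefill trick and the `<|User|>`/`<|Assistant|>` template are assumptions carried over from the R1-distilled base model, not behaviour confirmed by this card; check the repository for the exact chat template.

```python
# Hypothetical prompt builder for Thinking vs. NoThinking mode.
# ASSUMPTION: R1-style chat markers and an empty-<think> prefill for
# NoThinking; these are illustrative, not documented TACLer behaviour.

def build_prompt(question: str, thinking: bool = True) -> str:
    """Format a single-turn prompt; pre-filling an empty think block
    skips the reasoning phase when thinking=False."""
    prompt = f"<|User|>{question}<|Assistant|>"
    if not thinking:
        prompt += "<think>\n\n</think>"  # suppress chain-of-thought
    return prompt

print(build_prompt("What is 2 + 2?", thinking=False))
```

The returned string can then be tokenized and passed to any standard `generate` call.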

Our experiments show that: (i) TACLer reduces computational cost, cutting training compute by over 50% compared to long-thinking models and reducing inference token usage by over 42% relative to the base model (DeepSeek-R1-Distill-Qwen-1.5B, R1-Qwen); and (ii) TACLer improves accuracy by over 9% over the base model, consistently outperforming state-of-the-art NoThinking and Thinking baselines across four math datasets (MATH500, AMC, AIME 2024, and AIME 2025).

Code: https://github.com/laihuiyuan/tacler

Paper: https://arxiv.org/pdf/2601.21711

Citation

@article{lai-etal-2026-tacler,
    title   = {TACLer: Tailored Curriculum Reinforcement Learning for Efficient Reasoning},
    author  = {Lai, Huiyuan and Nissim, Malvina},
    journal = {arXiv preprint arXiv:2601.21711},
    year    = {2026},
    url     = {https://arxiv.org/pdf/2601.21711}
}
Model size: 2B params · Tensor type: BF16 (Safetensors)