---
license: apache-2.0
---

# 🧠Smoller-reason2.1🧠

While making Andy-4-micro, I found that a 1.5B model can learn a lot, and learn it well, if you give it the right environment. So I decided to take Qwen2.5 1.5B and turn it into a reasoning model using GRPO, along with material from DeepSeek-R1 and QwQ in PPO training.
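
For readers unfamiliar with GRPO, the core idea is that each prompt gets a group of sampled completions, and every completion is scored relative to the others in its group rather than against a learned value model. The snippet below is a minimal sketch of that group-relative advantage step only. It is not the training code used for this model; the function name `group_relative_advantages`, the `eps` value, and the use of the population standard deviation are assumptions made for illustration.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize rewards within one group of completions for a single prompt.

    Hypothetical helper for illustration: each advantage is the reward minus
    the group mean, divided by the group standard deviation (plus eps to avoid
    division by zero when all rewards are equal).
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # assumption: population std; implementations vary
    return [(r - mu) / (sigma + eps) for r in rewards]


# Example: four completions for one prompt, scored 1.0 if correct, else 0.0.
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)

# Completions that beat the group average get a positive advantage,
# the rest get a negative one.
for reward, advantage in zip(rewards, advantages):
    print(f"reward={reward:.1f} advantage={advantage:+.3f}")
```

Because advantages are computed per group, a completion is only rewarded for being better than its siblings, which is part of why this setup can work without a separate critic.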