Fine-tuning Qwen/Qwen3-4B-Instruct-2507 to Surpass the Existing Reasoning Model

#1438
by ll028987 - opened

Can you fine-tune the model https://huggingface.co/Qwen/Qwen3-4B-Instruct-2507
using the dataset https://huggingface.co/datasets/ll028987/human-reasoning-Advanced-V1.0?
I know a separate reasoning version already exists, but the goal is for this non-reasoning version, once fine-tuned, to reason better than that existing reasoning version.
Thanks 🙏
