**GanitLLM-4B_SFT** is a Bengali mathematical reasoning model trained with Supervised Fine-Tuning on the GANIT dataset. This model serves as the foundation for further RL training (GRPO/CGRPO). Key improvements over the base Qwen3-4B model:

- **+4.80 accuracy** on the Bn-MGSM benchmark (69.20 → 74.00)
- **+4.10 accuracy** on the Bn-MSVAMP benchmark (70.50 → 74.60)
- **86.65% Bengali reasoning** (vs. 14.79% for the base model)
- **80.5% fewer words** in generated solutions (943 → 184 words)

> **Note**: This is the SFT-only checkpoint. For best results, use the RL-enhanced versions: [GanitLLM-4B_SFT_CGRPO](https://huggingface.co/dipta007/GanitLLM-4B_SFT_CGRPO) or [GanitLLM-4B_SFT_GRPO](https://huggingface.co/dipta007/GanitLLM-4B_SFT_GRPO).
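The headline numbers above follow directly from the raw figures in each bullet; a quick arithmetic check (all values taken from this README, nothing else assumed):

```python
# Reproduce the deltas reported in the bullet list above.
bn_mgsm_base, bn_mgsm_sft = 69.20, 74.00
bn_msvamp_base, bn_msvamp_sft = 70.50, 74.60
words_base, words_sft = 943, 184

mgsm_gain = round(bn_mgsm_sft - bn_mgsm_base, 2)        # accuracy gain on Bn-MGSM
msvamp_gain = round(bn_msvamp_sft - bn_msvamp_base, 2)  # accuracy gain on Bn-MSVAMP
word_reduction = round((words_base - words_sft) / words_base * 100, 1)  # % fewer words

print(mgsm_gain, msvamp_gain, word_reduction)  # → 4.8 4.1 80.5
```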