dipta007 committed on Commit 09d43a3 · verified · 1 Parent(s): b5707cc

Update README.md

Files changed (1): README.md (+3 −3)
**GanitLLM-4B_SFT** is a Bengali mathematical reasoning model trained with Supervised Fine-Tuning on the GANIT dataset. This model serves as the foundation for further RL training (GRPO/CGRPO). Key improvements over the base Qwen3-4B model:

- **+4.80 accuracy** on the Bn-MGSM benchmark (69.20 → 74.00)
- **+4.10 accuracy** on the Bn-MSVAMP benchmark (70.50 → 74.60)
- **86.65% Bengali reasoning** (vs. 14.79% for the base model)
- **80.5% fewer words** in generated solutions (943 → 184 words)

> **Note**: This is the SFT-only checkpoint. For best results, use the RL-enhanced versions: [GanitLLM-4B_SFT_CGRPO](https://huggingface.co/dipta007/GanitLLM-4B_SFT_CGRPO) or [GanitLLM-4B_SFT_GRPO](https://huggingface.co/dipta007/GanitLLM-4B_SFT_GRPO).