Update README.md
---

## 📦 Dataset: GRPO-LEAD-SFTData

We release [**GRPO-LEAD-SFTData**](https://huggingface.co/datasets/PlanePaper/GRPO-LEAD-SFTData), a curated collection of **12,153** high-quality mathematical reasoning samples for supervised fine-tuning, generated with [**QwQ-32B**](https://huggingface.co/Qwen/QwQ-32B).

Derived primarily from the [**DeepScaler**](https://github.com/agentica-project/rllm) dataset, it retains only examples with **difficulty > 1**, targeting challenging problem-solving scenarios. All entries are structured for seamless integration with [**LLaMA Factory**](https://github.com/hiyouga/LLaMA-Factory) and follow a standardized SFT-ready format.
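
The difficulty filter and SFT reshaping described above can be sketched as follows. This is a minimal illustration, not the released pipeline: the record field names (`problem`, `solution`, `difficulty`) and the Alpaca-style target fields are assumptions for demonstration, not the dataset's documented schema.

```python
import json

def to_sft_format(records, min_difficulty=1.0):
    """Keep records with difficulty > min_difficulty and reshape them into
    an Alpaca-style instruction/input/output layout (hypothetical fields)."""
    return [
        {
            "instruction": r["problem"],
            "input": "",
            "output": r["solution"],
        }
        for r in records
        if r["difficulty"] > min_difficulty
    ]

# Toy records standing in for DeepScaler-style entries.
records = [
    {"problem": "Compute 2 + 2.", "solution": "4", "difficulty": 0.5},
    {"problem": "Prove that sqrt(2) is irrational.", "solution": "...", "difficulty": 2.3},
]

sft_data = to_sft_format(records)
print(json.dumps(sft_data, indent=2))  # only the difficulty-2.3 example survives the filter
```

In the actual dataset, each retained example is already in this SFT-ready shape, so it can be pointed at from a LLaMA Factory dataset config without further conversion.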

Used as the training data for GRPO-LEAD's supervised fine-tuning stage, this dataset strengthens the base model's ability to solve mathematical problems.

---

## 📖 Citation

If you find our work useful, please cite it as: