This is EDU-Qwen2.5-7B, an educational model distilled from Qwen2.5-7B-Instruct using the EduBench multi-source distillation pipeline.
- [paper](https://arxiv.org/abs/2505.16160)
- [github](https://github.com/DIRECT-BIT/EduBench)
## Model Details

- **Model Name**: EDU-Qwen2.5-7B
- **Model Type**: Distilled instruction-tuned language model (7B parameters)
- **Base Model**: [Qwen2.5-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-7B-Instruct)
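
Below is a minimal loading sketch with 🤗 Transformers. The repo id, system prompt, and generation settings are illustrative assumptions, not values taken from this card; substitute the repo id of this repository and your own prompts.

```python
# Minimal usage sketch (assumptions: repo id, prompts, and generation settings are placeholders).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "DIRECT-BIT/EDU-Qwen2.5-7B"  # hypothetical repo id; replace with this model card's repo

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

# Qwen2.5-style chat prompt built with the tokenizer's chat template
messages = [
    {"role": "system", "content": "You are a helpful educational assistant."},
    {"role": "user", "content": "Explain the Pythagorean theorem to a middle-school student."},
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=512)
# Decode only the newly generated tokens, skipping the prompt
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```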
## Training Data

To fully leverage the strengths of different response generation models across various scenarios, we adopt a multi-source distillation pipeline.
For each task, we select the best-performing model on the test set as the response generator, using it to answer educational-domain questions and construct the training dataset for the distilled model.
Through this distillation pipeline, we obtain a training set of 17,000 samples covering various subtasks across all 9 educational scenarios.
More details are provided in Appendix K of our [paper](https://arxiv.org/abs/2505.16160).
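
The per-task teacher selection can be pictured with the short sketch below. The function names, score lookup, and data layout are hypothetical assumptions for illustration; they are not the released pipeline code.

```python
# Illustrative sketch of multi-source distillation: per task, pick the teacher model
# that scores best on that task's test set, then let it generate training responses.
# (Assumptions: function names, data structures, and the generate() call are hypothetical.)

def build_distillation_set(tasks, candidate_models, test_scores, questions_by_task):
    """Return a list of {task, prompt, response} records for distillation training."""
    training_set = []
    for task in tasks:
        # test_scores[(model_name, task)] -> evaluation score of that model on the task's test set
        best_teacher = max(candidate_models, key=lambda m: test_scores[(m.name, task)])
        for question in questions_by_task[task]:
            response = best_teacher.generate(question)  # hypothetical generation call
            training_set.append({"task": task, "prompt": question, "response": response})
    return training_set
```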
## Performance

<div align="center">
  <img src="performance.png" alt="Performance comparison" width="1200"/>
  <br>
</div>
## 🫣 Citation

If you find our benchmark, evaluation pipeline, or models useful or interesting, please cite our paper.
```
@misc{xu2025edubenchcomprehensivebenchmarkingdataset,
      title={EduBench: A Comprehensive Benchmarking Dataset for Evaluating Large Language Models in Diverse Educational Scenarios},
      author={Bin Xu and Yu Bai and Huashan Sun and Yiguan Lin and Siming Liu and Xinyue Liang and Yaolin Li and Yang Gao and Heyan Huang},
      year={2025},
      eprint={2505.16160},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2505.16160},
}
```