Yuqian-Fu committed on
Commit 30d6824 · verified · 1 Parent(s): cb40792

Update README.md

Files changed (1)
  1. README.md +17 -3
README.md CHANGED
@@ -1,3 +1,17 @@
- ---
- license: mit
- ---
+ ---
+ license: mit
+ datasets:
+ - Elliott/Openr1-Math-46k-8192
+ base_model:
+ - open-r1/Qwen2.5-Math-7B-RoPE-300k
+ - Qwen/Qwen2.5-Math-7B
+ pipeline_tag: reinforcement-learning
+ ---
+
+ # 📄 Introduction
+
+ Supervised Reinforcement Fine-Tuning (SRFT) is a single-stage method that unifies supervised fine-tuning (SFT) and reinforcement learning (RL) through entropy-aware weighting mechanisms.
+
+ Paper: [arXiv](https://arxiv.org/abs/2506.19767)
+
+ Project Website: [SRFT](https://anonymous.4open.science/w/SRFT2025)
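
Reviewer note (not part of the commit): the introduction mentions entropy-aware weighting without saying what it looks like. The sketch below is one plausible reading, not the formulation from the paper: it assumes the supervised term is scaled by `exp(-H)`, where `H` is the entropy of the policy's next-token distribution, so that the supervised signal is damped when the policy is uncertain. The function names, the `scale` parameter, and the way the two losses are combined are all hypothetical and chosen only for illustration.

```python
import math


def token_entropy(probs):
    """Shannon entropy (in nats) of a single next-token distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def entropy_aware_weight(probs, scale=1.0):
    """Hypothetical weight exp(-H): close to `scale` for a confident policy,
    smaller as the distribution spreads out. This is an assumption, not the
    weighting defined in the SRFT paper."""
    return scale * math.exp(-token_entropy(probs))


def mixed_loss(sft_loss, rl_loss, probs):
    """Illustrative single-stage objective: an entropy-weighted supervised
    term plus an unweighted reinforcement term (assumed combination)."""
    return entropy_aware_weight(probs) * sft_loss + rl_loss


if __name__ == "__main__":
    confident = [0.97, 0.01, 0.01, 0.01]
    uncertain = [0.25, 0.25, 0.25, 0.25]
    print(entropy_aware_weight(confident))
    print(entropy_aware_weight(uncertain))
```

With a uniform distribution over four tokens the entropy is `ln 4`, so the assumed weight is exactly one quarter of `scale`. A deterministic distribution has zero entropy and leaves the supervised term at full strength.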