ubowang committed on
Commit
85f5fbc
·
verified ·
1 Parent(s): ef1062b

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -44,7 +44,7 @@ Instead of learning from reference answers (as in supervised fine-tuning) or rew
  One-shot CFT consistently improves mathematical and logical reasoning.
  <strong>Left:</strong> Average accuracy on six mathematical reasoning benchmarks for Qwen and LLaMA models, comparing base, SFT, RLVR, and CFT with only one training example.
  <strong>Right:</strong> In-domain accuracy on three logic reasoning benchmarks (BBEH subtasks) for Qwen2.5-Math-7B.
- Across both domains, CFT with a single problem significantly outperforms standard supervised fine-tuning and matches or exceeds reinforcement learning with much lower compute.
+ Across both domains, CFT with a single problem significantly outperforms standard SFT and matches or exceeds reinforcement learning with much lower compute.
  </em></p>
 
 