luguoshan committed on
Commit
3a5e99b
1 Parent(s): 8a40c50

update README.md

Files changed (1)
  1. README.md +6 -4
README.md CHANGED
@@ -36,13 +36,15 @@ _Evaluated on 12 diverse benchmarks covering knowledge, reasoning, coding, and m
  ### Technical Overview
  The training objective combines two complementary losses:
 
- $ \mathcal{L}(\theta) = \mathcal{L}_{\text{SFT}}(\theta) + \lambda \mathcal{L}_{\text{conf}}(\theta) $
+ ```math
+ L(θ) = L_SFT(θ) + λL_conf(θ)
+ ```
 
  Where:
 
- + $ \mathcal{L}_{\text{SFT}} $: Supervised fine-tuning loss ensuring prediction correctness
- + $ \mathcal{L}_{\text{conf}} $: Confidence loss that minimizes entropy only for correctly predicted tokens
- + $ \lambda $: Hyperparameter balancing the two objectives
+ + **L_SFT**: Supervised fine-tuning loss ensuring prediction correctness
+ + **L_conf**: Confidence loss that minimizes entropy only for correctly predicted tokens
+ + **λ**: Hyperparameter balancing the two objectives
 
  ### Why CAP Works
  1. **Sharpens Correct Predictions**: While standard training ensures correctness, it provides diminishing incentive to increase confidence on already-correct tokens. CAP explicitly optimizes for high-confidence predictions.
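
The combined objective described in the Technical Overview can be illustrated with a small pure-Python sketch. It is not taken from the CAP codebase. The function name `cap_loss`, the default `lam=0.1`, the use of argmax to decide whether a token is "correctly predicted", and the averaging scheme are all assumptions made for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cap_loss(logits_per_token, targets, lam=0.1):
    """Return (total, sft, conf) for L = L_SFT + lam * L_conf.

    Assumptions (not confirmed by the README):
      - L_SFT is the mean cross-entropy over all tokens.
      - A token counts as correct when argmax(probs) equals the target.
      - L_conf is the mean entropy over the correct tokens only.
    """
    sft_sum = 0.0
    conf_sum = 0.0
    n_correct = 0
    for logits, target in zip(logits_per_token, targets):
        probs = softmax(logits)
        # Cross-entropy term: pushes probability mass toward the target.
        sft_sum += -math.log(probs[target])
        predicted = max(range(len(probs)), key=probs.__getitem__)
        if predicted == target:
            # Entropy term: only tokens that are already correct are sharpened.
            conf_sum += -sum(p * math.log(p) for p in probs if p > 0.0)
            n_correct += 1
    sft = sft_sum / len(targets)
    conf = conf_sum / n_correct if n_correct else 0.0
    return sft + lam * conf, sft, conf


# First token is predicted correctly, second is not, so only the first
# contributes to the confidence term.
total, sft, conf = cap_loss([[2.0, 0.0], [0.0, 0.0]], [0, 1], lam=0.5)
```

Masking the entropy term to correct tokens means the model is never rewarded for becoming more confident in a wrong prediction. With `lam=0` the sketch reduces to plain supervised fine-tuning.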