yuzhe committed on
Commit 9bf70cb (verified) · 1 Parent(s): b8ae9a2

Update README.md

Files changed (1)
  1. README.md +2 -2
README.md CHANGED
@@ -52,8 +52,8 @@ $$
 
  Where:
  - $\theta_s$ and $\theta_t$ represent student (trainable) and teacher (frozen) model parameters
- - $$P_{\theta}^{(i)}$$ denotes the probability distribution at reasoning step $$i$$
- - $$ \lambda(t) = \lambda_0 \cdot (1 + \gamma \cdot \text{complexity}(x_t)) $$ is the dynamic weight function
+ - $P_{\theta}^{(i)}$ denotes the probability distribution at reasoning step $i$
+ - $\lambda(t) = \lambda_0 \cdot (1 + \gamma \cdot \text{complexity}(x_t))$ is the dynamic weight function
  - $\alpha_i = \exp(-\delta \cdot i/T)$ implements exponential decay for later reasoning steps
  - $\mathcal{L}_{\text{QS}}$ is the quality scoring loss ensuring reasoning coherence
 
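
As a rough illustration of the two closed-form definitions in this hunk, the sketch below evaluates $\alpha_i = \exp(-\delta \cdot i/T)$ and $\lambda(t) = \lambda_0 \cdot (1 + \gamma \cdot \text{complexity}(x_t))$ in Python. The function names `step_weights` and `dynamic_weight`, the choice to index reasoning steps from 1 to $T$, and passing the complexity score as a precomputed float are all assumptions made for illustration; this excerpt of the README does not define $\text{complexity}(x_t)$ or the step indexing.

```python
import math


def step_weights(num_steps: int, delta: float) -> list[float]:
    """Return alpha_i = exp(-delta * i / T) for i = 1..T.

    Assumption: steps are indexed from 1 to T (num_steps).
    With delta > 0, later steps receive smaller weights.
    """
    return [math.exp(-delta * i / num_steps) for i in range(1, num_steps + 1)]


def dynamic_weight(lambda_0: float, gamma: float, complexity: float) -> float:
    """Return lambda(t) = lambda_0 * (1 + gamma * complexity(x_t)).

    Assumption: `complexity` is a precomputed score for the input x_t;
    how it is computed is not specified in this excerpt.
    """
    return lambda_0 * (1.0 + gamma * complexity)


if __name__ == "__main__":
    # Hypothetical values, chosen only to show the shape of the outputs.
    print(step_weights(num_steps=4, delta=1.0))
    print(dynamic_weight(lambda_0=0.5, gamma=2.0, complexity=1.0))
```

With a positive $\delta$, the weights decrease monotonically across steps, which matches the "exponential decay for later reasoning steps" wording, and a complexity of zero reduces $\lambda(t)$ to the base weight $\lambda_0$.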