NeuroSenko committed · Commit ec0e458 · verified · 1 Parent(s): 75c05d7

Update README.md

Files changed (1)
  1. README.md +2 -2
README.md CHANGED
@@ -28,9 +28,9 @@ PPL and KL divergence metrics are non-computable for this model due to inf/NaN v
  | original | 214.36 | 8.00 | — | — | — | — | — |
 
  <details>
- <summary>Quantization Notes — inf/NaN during inference</summary>
+ <summary>Quantization Notes</summary>
 
- ### Inf/NaN values in calibration data
+ ### Inf/NaN values during calibration
 
  Some experts in the model produce `inf` values during calibration (e.g. experts 61 and 74 in the last layer had inf values in their down-projection calibration state). The `lm_head` layer also exhibited NaN values in its calibration state (445K NaN out of 1.5B elements).
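
As a rough illustration of the kind of check the quantization notes describe, the sketch below counts `inf` and NaN entries in a flat list of floats and works out the NaN fraction from the rounded `lm_head` figures quoted above. The function name `count_nonfinite` and the sample values are hypothetical and are not taken from the model's calibration tooling; a real calibration state would normally be a tensor rather than a Python list.

```python
import math
from typing import Dict, Iterable


def count_nonfinite(values: Iterable[float]) -> Dict[str, int]:
    """Count inf and NaN entries in a flat sequence of floats."""
    counts = {"inf": 0, "nan": 0, "total": 0}
    for v in values:
        counts["total"] += 1
        if math.isnan(v):
            counts["nan"] += 1
        elif math.isinf(v):  # matches both +inf and -inf
            counts["inf"] += 1
    return counts


# Hypothetical values, for illustration only (not real calibration data).
sample = [0.5, float("inf"), -1.25, float("nan"), float("-inf"), 2.0]
report = count_nonfinite(sample)
print(report)

# NaN fraction for lm_head, using the rounded figures from the notes
# (445K NaN out of 1.5B elements).
nan_fraction = 445_000 / 1_500_000_000
print(f"{nan_fraction:.4%}")
```

With the rounded figures, the NaN entries amount to roughly 0.03% of the `lm_head` calibration state, which is a small fraction but still enough to make any metric that sums over those values non-finite.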