nenad1002 committed on
Commit 7230dcd · verified · 1 Parent(s): dcc76e2

Update README.md

Files changed (1)
  1. README.md +2 -2
README.md CHANGED
@@ -53,13 +53,13 @@ The table above provides a high-level summary of observed accuracy deltas across
 
 ### Generation Stability (EOS Behavior)
 
-| Model | Premature EOS Rate |
+| Model | EOS Non-Emission Rate |
 |------|--------------------|
 | Torch baseline | 6% |
 | Previous INT4 GPU ONNX model | 52% |
 | Updated QAT INT4 GPU ONNX model | **11%** |
 
-The updated model reduces premature EOS generation by approximately **5×** compared to the previous INT4 GPU ONNX release, resulting in more stable and complete generations while remaining close to Torch baseline behavior.
+The updated model reduces EOS non-emission by approximately 5× compared to the previous INT4 GPU ONNX release, as observed across a large set of randomly generated prompts, resulting in more reliable sequence termination and generation behavior closer to the Torch baseline.
 
 ## Hardware Supported
 The ONNX models are tested on: