Update README.md
README.md CHANGED
@@ -26,7 +26,7 @@ FP8 quantization of [allenai/SERA-32B-GA](https://huggingface.co/allenai/SERA-32
 | Method | FP8 (W8A8) via `llmcompressor` `oneshot` |
 | Targets | All `Linear` layers except `lm_head` |
 | Calibration dataset | `allenai/Sera-4.5A-Lite-T2` |
-| Calibration samples |
+| Calibration samples | 512 |
 | Calibration sequence length | 2048 tokens |
 | llmcompressor version | 0.9.0.2 |
 | Hardware | AWS g6e.4xlarge (NVIDIA L40S, 48 GB VRAM) |
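
For context, the settings in this table could map onto an `llmcompressor` `oneshot` call roughly as sketched below. This is a minimal illustration, not the script used for this model. The README only names `llmcompressor` `oneshot`, so the `"FP8"` scheme string, the use of `QuantizationModifier`, and the helper names `CalibrationSettings`, `oneshot_kwargs`, and `run_quantization` are all assumptions. The caller is assumed to pass in an already prepared calibration dataset, since the README does not describe how `allenai/Sera-4.5A-Lite-T2` was loaded or preprocessed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CalibrationSettings:
    """Values copied from the README table; the field names are hypothetical."""
    model_id: str = "allenai/SERA-32B-GA"
    num_calibration_samples: int = 512
    max_seq_length: int = 2048          # tokens per calibration sequence
    targets: str = "Linear"             # quantize all Linear layers...
    ignore: tuple = ("lm_head",)        # ...except lm_head
    scheme: str = "FP8"                 # assumption: preset name for FP8 W8A8


def oneshot_kwargs(settings):
    """Collect the calibration-size arguments that oneshot would receive."""
    return {
        "num_calibration_samples": settings.num_calibration_samples,
        "max_seq_length": settings.max_seq_length,
    }


def run_quantization(settings, calibration_dataset):
    """Run one-shot quantization; needs llmcompressor, the model, and a GPU.

    `calibration_dataset` is assumed to be prepared by the caller.
    """
    # Imported lazily so the helpers above work without llmcompressor installed.
    from llmcompressor import oneshot
    from llmcompressor.modifiers.quantization import QuantizationModifier

    recipe = QuantizationModifier(
        targets=settings.targets,
        scheme=settings.scheme,
        ignore=list(settings.ignore),
    )
    return oneshot(
        model=settings.model_id,
        dataset=calibration_dataset,
        recipe=recipe,
        **oneshot_kwargs(settings),
    )
```

With 512 samples of up to 2048 tokens each, calibration sees at most 512 × 2048 = 1,048,576 tokens. Samples shorter than the sequence length reduce that figure.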