Update README.md
README.md CHANGED

@@ -4,7 +4,7 @@ tags:
 ---
 
 
-Meta-Llama-3-8B-Instruct quantized to FP8 weights and activations using per-tensor quantization, ready for inference with vLLM >= 0.4.
+Meta-Llama-3-8B-Instruct quantized to FP8 weights and activations using per-tensor quantization, ready for inference with vLLM >= 0.4.3.
 
 Produced using https://github.com/neuralmagic/AutoFP8/blob/b0c1f789c51659bb023c06521ecbd04cea4a26f6/quantize.py
 
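A minimal sketch of what "ready for inference with vLLM >= 0.4.3" looks like in practice, using vLLM's offline `LLM`/`SamplingParams` API. The model id below is an assumption (the repository this README belongs to); substitute the actual Hugging Face path. Running it requires `pip install "vllm>=0.4.3"` and a CUDA GPU whose hardware supports FP8 (e.g. Ada or Hopper), so this is an illustrative sketch rather than something runnable everywhere.

```python
from vllm import LLM, SamplingParams

# Assumed repo id for this FP8 checkpoint -- replace with the real path.
llm = LLM(model="neuralmagic/Meta-Llama-3-8B-Instruct-FP8")

params = SamplingParams(temperature=0.7, max_tokens=128)

# generate() takes a list of prompts and returns one RequestOutput per prompt.
outputs = llm.generate(["What is FP8 quantization?"], params)
print(outputs[0].outputs[0].text)
```

vLLM reads the quantization config baked into the checkpoint by the AutoFP8 `quantize.py` script linked above, so no extra quantization flags are needed at load time.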