ISTA-DASLab/Meta-Llama-3-70B-Instruct-AQLM-2Bit-1x16

SpiridonSunRotator committed on May 3, 2024

Commit 9df2164 (verified) · 1 Parent(s): 75d8ed3

Update README.md

Files changed (1)

README.md +3 -3

@@ -8,13 +8,13 @@ tags:
 - conversational
 - text-generation-inference
 ---
-Official [AQLM](https://arxiv.org/abs/2401.06118) quantization of [meta-llama/Meta-Llama-3-70B
-](https://huggingface.co/meta-llama/Meta-Llama-3-70B).
+Official [AQLM](https://arxiv.org/abs/2401.06118) quantization of [meta-llama/Meta-Llama-3-70B-Instruct
+](https://huggingface.co/meta-llama/Meta-Llama-3-70B-Instruct).
 For this quantization, we used 1 codebook of 16 bits.
 Results (in progress):
 | Model      | Quantization | Model size, Gb |
 |------|------|------|
-|meta-llama/Meta-Llama-3-70B | - | 141.2 |
+|meta-llama/Meta-Llama-3-70B-Instruct | - | 141.2 |
 |  | 1x16 |  21.9 |
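
The two sizes in the results table imply a compression ratio, which the sketch below works out. It is illustrative arithmetic only, not part of the commit. The helper names are hypothetical, and the 16-bit baseline is an assumption: the README does not state the precision of the unquantized checkpoint.

```python
# Sizes as reported in the README results table.
ORIGINAL_SIZE = 141.2   # meta-llama/Meta-Llama-3-70B-Instruct, unquantized
QUANTIZED_SIZE = 21.9   # AQLM 1x16 checkpoint

# Assumption (not stated in the README): the unquantized checkpoint
# stores each weight in 16 bits.
BASELINE_BITS = 16


def compression_ratio(original: float, quantized: float) -> float:
    """How many times smaller the quantized checkpoint is on disk."""
    return original / quantized


def average_bits_per_weight(original: float, quantized: float,
                            baseline_bits: int = BASELINE_BITS) -> float:
    """Average storage cost per weight over the whole checkpoint.

    This counts everything in the file, including any tensors that are
    kept unquantized, so it sits above the nominal bit width.
    """
    return baseline_bits * quantized / original


if __name__ == "__main__":
    ratio = compression_ratio(ORIGINAL_SIZE, QUANTIZED_SIZE)
    bits = average_bits_per_weight(ORIGINAL_SIZE, QUANTIZED_SIZE)
    print(f"compression ratio: {ratio:.2f}x")
    print(f"average bits per weight: {bits:.2f}")
```

Under these assumptions the checkpoint is roughly 6.4 times smaller, which averages out to about 2.5 bits per weight across the whole file. That is somewhat above the nominal 2 bits in the repository name, plausibly because some tensors are stored at higher precision.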