mgoin/Meta-Llama-3-8B-Instruct-pruned50-quant-ds

Text Generation

mgoin committed on Jun 28, 2024

Commit 5370352 · verified · 1 Parent(s): 667be31

Update README.md

Files changed (1)

README.md +7 -0

@@ -1,3 +1,10 @@
+---
+base_model: meta-llama/Meta-Llama-3-8B-Instruct
+inference: false
+tags:
+- deepsparse
+---
 Llama 3 8B Instruct that has been compressed in one-shot to 50% sparsity and INT8 weights+activations using SparseGPT, SmoothQuant, and GPTQ.
 Made with SparseML+DeepSparse=1.7. Install with `pip install deepsparse~=1.7 "sparseml[transformers]"~=1.7 "numpy<2"`.