Update README.md
print(my_list)  # Output: ['a', 'b', 'c']
```

Note that in Python, lists are mutable, meaning you can add, remove, or modify elements after creating the list.
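As a quick illustration of that mutability (the variable names here are just for illustration):

```python
colors = ['a', 'b', 'c']
colors.append('d')   # add an element to the end
colors[0] = 'z'      # modify an element in place
colors.remove('b')   # remove an element by value
print(colors)        # ['z', 'c', 'd']
```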
## Size Comparison

The table below compares the VRAM required to load and train the FP16 base model against the 4-bit GPTQ-quantized model with PEFT. The value for the base model is taken from HuggingFace's [Model Memory Calculator](https://huggingface.co/docs/accelerate/main/en/usage_guides/model_size_estimator).
| Model                  | Total Size |
| ---------------------- | ---------- |
| Base Model             | 28 GB      |
| 4-bit Quantized + PEFT | 5.21 GB    |
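As a back-of-the-envelope check on figures like these, weight memory scales with bits per parameter. The sketch below is a weight-only estimate; the 7B parameter count is an assumption for illustration, and the formula ignores activations, optimizer state, quantization overhead, and adapter weights, which is why real totals such as those reported by the Model Memory Calculator differ:

```python
def weight_memory_gb(n_params: float, bits_per_param: float) -> float:
    """Weight-only memory estimate: parameters * bits / 8 bytes, in GB."""
    return n_params * bits_per_param / 8 / 1e9

# Hypothetical 7B-parameter model, for illustration only.
print(weight_memory_gb(7e9, 16))  # FP16 weights  -> 14.0 GB
print(weight_memory_gb(7e9, 4))   # 4-bit weights -> 3.5 GB
```

Going from 16-bit to 4-bit weights cuts the weight footprint by roughly 4x, which is the main driver behind the gap in the table above.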
## Acknowledgment

Thanks to [@Merve Noyan](https://huggingface.co/blog/merve/quantization) for the concise introduction to quantization.
Thanks to the [@HuggingFace Team](https://huggingface.co/blog/4bit-transformers-bitsandbytes) for the blog post.
Thanks to [@Meta](https://huggingface.co/meta-llama) for the open-sourced model.