Commit 5ca68b5 · verified · committed by INC4AI · 1 parent: f9f1370

Update README.md

  1. README.md +47 -3
README.md CHANGED
@@ -1,3 +1,47 @@
- ---
- license: llama3.1
- ---
+ ---
+ license: llama3.1
+ base_model:
+ - meta-llama/Llama-3.1-70B-Instruct
+ ---
+
+ ## Model Details
+
+ This model card covers MXFP8 and NVFP4 quantization of [meta-llama/Llama-3.1-70B-Instruct](https://huggingface.co/meta-llama/Llama-3.1-70B-Instruct) with [intel/auto-round](https://github.com/intel/auto-round).
+ The quantized models cannot be published because of license restrictions. Please follow the INC example README to generate and evaluate the low-precision models.
+
+ ## How to Use
+
+ Step-by-step instructions for quantization and evaluation are in the [Intel Neural Compressor examples README](https://github.com/intel/neural-compressor/blob/master/examples/pytorch/nlp/huggingface_models/language-modeling/quantization/auto_round/llama3/README.md).
+
+ ## Evaluation Results
+
+ | Task | Backend | BF16 | MXFP8 | NVFP4 |
+ |:-------------------:|:-------:|:------:|:------:|:------:|
+ | hellaswag | vllm | 0.6609 | 0.6612 | 0.6547 |
+ | piqa | vllm | 0.8357 | 0.8379 | 0.8303 |
+ | mmlu_llama | vllm | 0.8388 | 0.8367 | 0.8311 |
+ | gsm8k_llama(strict) | vllm | 0.9522 | 0.9500 | 0.9401 |
+ | average | vllm | 0.8219 | 0.8215 | 0.8141 |
+
+
+ ## Ethical Considerations and Limitations
+
+ The model can produce factually incorrect output, and should not be relied on to produce factually accurate information.
+ Because of the limitations of the pretrained model and the finetuning datasets, it is possible that this model could generate lewd, biased or otherwise offensive outputs.
+
+ Therefore, before deploying any applications of the model, developers should perform safety testing.
+
+ ## Caveats and Recommendations
+
+ Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model.
+
+ Here are a couple of useful links to learn more about Intel's AI software:
+
+ - [Intel Neural Compressor](https://github.com/intel/neural-compressor)
+ - [AutoRound](https://github.com/intel/auto-round)
+
+ ## Disclaimer
+
+ The license on this model does not constitute legal advice.
+ We are not responsible for the actions of third parties who use this model.
+ Please consult an attorney before using this model for commercial purposes.
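
Reviewer note: the `average` row of the evaluation table in the README can be cross-checked with a few lines of Python. The sketch below is not part of the commit. It assumes the row is an unweighted mean over the four listed tasks, which the README does not state explicitly, and the names `scores`, `column_averages`, and `retention` are illustrative only.

```python
# Scores copied from the evaluation table (columns: BF16, MXFP8, NVFP4).
scores = {
    "hellaswag": (0.6609, 0.6612, 0.6547),
    "piqa": (0.8357, 0.8379, 0.8303),
    "mmlu_llama": (0.8388, 0.8367, 0.8311),
    "gsm8k_llama(strict)": (0.9522, 0.9500, 0.9401),
}
formats = ("BF16", "MXFP8", "NVFP4")


def column_averages(table):
    """Return the unweighted mean of each column across all tasks.

    Assumption: the README's "average" row is a plain mean over tasks.
    """
    columns = list(zip(*table.values()))
    return [sum(col) / len(col) for col in columns]


averages = column_averages(scores)

# Fraction of the BF16 average that each format retains.
retention = [avg / averages[0] for avg in averages]

for name, avg, kept in zip(formats, averages, retention):
    print(f"{name}: average={avg:.4f}, relative to BF16={kept:.2%}")
```

Under that assumption the recomputed means agree with the table to within rounding, and both quantized formats retain roughly 99% or more of the BF16 average.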