inference-optimization/Meta-Llama-3-8B-Instruct-NVFP4-GPTQ-Quant

8-bit precision

compressed-tensors


Repository size: 6.05 GB

1 contributor

History: 2 commits

Latest commit by dsikka: Upload folder using huggingface_hub (d274493, verified, 4 months ago)

Every file below was last changed 4 months ago by the commit "Upload folder using huggingface_hub".

File                              Size       Storage
.gitattributes                    1.57 kB
chat_template.jinja               389 Bytes
config.json                       2.07 kB
generation_config.json            194 Bytes
model-00001-of-00002.safetensors  4.98 GB    xet
model-00002-of-00002.safetensors  1.05 GB    xet
model.safetensors.index.json      87 kB
recipe.yaml                       1.11 kB
special_tokens_map.json           296 Bytes
tokenizer.json                    17.2 MB    xet
tokenizer_config.json             50.6 kB