roshniramesh's Collections: int4 llm
Text Generation
• Updated • 34
• 1
nvidia/Gemma-2b-it-ONNX-INT4
nvidia/Meta-Llama-3.1-8B-Instruct-ONNX-INT4
Updated • 22
• 6
nvidia/Meta-Llama-3.2-3B-Instruct-ONNX-INT4
nvidia/Phi-3.5-mini-Instruct-ONNX-INT4
nvidia/Mistral-Nemo-12B-Instruct-ONNX-INT4
nvidia/Nemotron-Mini-4B-Instruct-ONNX-INT4
meta-llama/Llama-3.2-1B-Instruct-SpinQuant_INT4_EO8
Text Generation
• Updated • 91
• 38
hugging-quants/gemma-2-9b-it-AWQ-INT4
Text Generation
• 9B • Updated • 1.89k
• 8
Qwen/Qwen2-7B-Instruct-GPTQ-Int4
Text Generation
• 8B • Updated • 724
• 29
hugging-quants/Meta-Llama-3.1-8B-Instruct-AWQ-INT4
Text Generation
• Updated • 195k
• 89
RedHatAI/Meta-Llama-3.1-8B-Instruct-quantized.w4a16
Text Generation
• 8B • Updated • 45.6k
• 30
ModelCloud/Meta-Llama-3.1-8B-gptq-4bit
Text Generation
• 8B • Updated • 106
hugging-quants/Llama-3.2-3B-Instruct-Q4_K_M-GGUF
Text Generation
• 3B • Updated • 24.7k
• 27
hugging-quants/Meta-Llama-3.1-70B-Instruct-AWQ-INT4
Text Generation
• Updated • 83.8k
• 108
hugging-quants/Llama-3.2-1B-Instruct-Q4_K_M-GGUF
Text Generation
• 1B • Updated • 42.3k
• 19
hugging-quants/Meta-Llama-3.1-70B-Instruct-GPTQ-INT4
Text Generation
• 71B • Updated • 895
• 23
hugging-quants/Meta-Llama-3.1-8B-Instruct-GPTQ-INT4
Text Generation
• 8B • Updated • 8.74k
• 42
meta-llama/Llama-Guard-3-1B-INT4
Text Generation
• Updated • 44
• 27
meta-llama/Llama-3.2-3B-Instruct-QLORA_INT4_EO8
Text Generation
• Updated • 55
• 71
meta-llama/Llama-3.2-3B-Instruct-SpinQuant_INT4_EO8
Text Generation
• Updated • 44
• 39
meta-llama/Llama-3.2-1B-Instruct-QLORA_INT4_EO8
Text Generation
• Updated • 183
• 47
RedHatAI/Mistral-7B-Instruct-v0.3-GPTQ-4bit
Text Generation
• 7B • Updated • 2.04k
• 23
RedHatAI/Mistral-7B-Instruct-v0.3-quantized.w4a16
Text Generation
• 7B • Updated • 296
• 2
RedHatAI/Llama-2-7b-chat-quantized.w4a16
Text Generation
• 7B • Updated • 121
RedHatAI/Meta-Llama-3-8B-Instruct-quantized.w4a16
Text Generation
• 8B • Updated • 113
• 2
RedHatAI/Meta-Llama-3-70B-Instruct-quantized.w4a16
Text Generation
• 71B • Updated • 13
• 2
RedHatAI/gemma-2-2b-it-quantized.w4a16
Text Generation
• 1B • Updated • 29
• 1
RedHatAI/gemma-2-9b-it-quantized.w4a16
Text Generation
• 3B • Updated • 18
• 2
RedHatAI/Mistral-Nemo-Instruct-2407-quantized.w4a16
Text Generation
• 3B • Updated • 562
• 4
RedHatAI/Meta-Llama-3.1-70B-Instruct-quantized.w4a16
Text Generation
• 71B • Updated • 69.1k
• 32
nvidia/Mistral-7B-Instruct-v0.3-ONNX-INT4
OpenVINO/mistral-7b-instruct-v0.1-int4-ov
Text Generation
• Updated • 17
OpenVINO/Mistral-7B-Instruct-v0.2-int4-ov
Text Generation
• Updated • 789
• 1
Text Generation
• 72B • Updated • 168
• 47
Text Generation
• 14B • Updated • 300
• 100
Text Generation
• 8B • Updated • 1.48k
• 75
Text Generation
• Updated • 351
• 36
Qwen/Qwen1.5-110B-Chat-GPTQ-Int4
Text Generation
• 111B • Updated • 26
• 18
Qwen/Qwen1.5-1.8B-Chat-GPTQ-Int4
Text Generation
• 2B • Updated • 138
• 7
Qwen/Qwen1.5-MoE-A2.7B-Chat-GPTQ-Int4
Text Generation
• 14B • Updated • 408
• 50
Qwen/Qwen1.5-4B-Chat-GPTQ-Int4
Text Generation
• 4B • Updated • 40
• 6
Qwen/Qwen1.5-72B-Chat-GPTQ-Int4
Text Generation
• 72B • Updated • 2.64k
• 37
Qwen/Qwen1.5-4B-Chat-GGUF
Text Generation
• 4B • Updated • 670
• 16
Qwen/Qwen1.5-0.5B-Chat-GGUF
Text Generation
• 0.6B • Updated • 4.91k
• 35
Qwen/Qwen1.5-7B-Chat-GGUF
Text Generation
• 8B • Updated • 2.41k
• 70
Qwen/CodeQwen1.5-7B-Chat-GGUF
Text Generation
• 7B • Updated • 1.38k
• 110
Qwen/Qwen2.5-1.5B-Instruct-GPTQ-Int4
Text Generation
• 2B • Updated • 3.76k
• 3
Qwen/Qwen2.5-0.5B-Instruct-GPTQ-Int4
Text Generation
• 0.5B • Updated • 943
• 9
Qwen/Qwen2.5-0.5B-Instruct-GGUF
Text Generation
• 0.6B • Updated • 64.3k
• 82
Qwen/Qwen2-1.5B-Instruct-GGUF
Text Generation
• 2B • Updated • 18.1k
• 27
Qwen/Qwen2-0.5B-Instruct-GGUF
Text Generation
• 0.5B • Updated • 43.4k
• 71
Qwen/Qwen2-7B-Instruct-GGUF
Text Generation
• 8B • Updated • 11.1k
• 179
Qwen/Qwen2-0.5B-Instruct-GPTQ-Int4
Text Generation
• 0.6B • Updated • 100
• 15
Qwen/Qwen2-1.5B-Instruct-GPTQ-Int4
Text Generation
• 2B • Updated • 20.9k
• 5
Qwen/Qwen2-72B-Instruct-GPTQ-Int4
Text Generation
• 73B • Updated • 182
• 33
Qwen/Qwen2-57B-A14B-Instruct-GPTQ-Int4
Text Generation
• 57B • Updated • 88
• 23