roshniramesh's Collections: int4 llm
Text Generation
• Updated • 50
• 1
nvidia/Gemma-2b-it-ONNX-INT4
nvidia/Meta-Llama-3.1-8B-Instruct-ONNX-INT4
Updated • 27
• 7
nvidia/Meta-Llama-3.2-3B-Instruct-ONNX-INT4
nvidia/Phi-3.5-mini-Instruct-ONNX-INT4
nvidia/Mistral-Nemo-12B-Instruct-ONNX-INT4
nvidia/Nemotron-Mini-4B-Instruct-ONNX-INT4
meta-llama/Llama-3.2-1B-Instruct-SpinQuant_INT4_EO8
Text Generation
• Updated • 89
• 38
hugging-quants/gemma-2-9b-it-AWQ-INT4
Text Generation
• 9B • Updated • 2.3k
• 8
Qwen/Qwen2-7B-Instruct-GPTQ-Int4
Text Generation
• 8B • Updated • 1.53k
• 28
hugging-quants/Meta-Llama-3.1-8B-Instruct-AWQ-INT4
Text Generation
• Updated • 529k
• 89
RedHatAI/Meta-Llama-3.1-8B-Instruct-quantized.w4a16
Text Generation
• 8B • Updated • 41.7k
• 30
ModelCloud/Meta-Llama-3.1-8B-gptq-4bit
Text Generation
• 8B • Updated • 37
hugging-quants/Llama-3.2-3B-Instruct-Q4_K_M-GGUF
Text Generation
• 3B • Updated • 26.1k
• 27
hugging-quants/Meta-Llama-3.1-70B-Instruct-AWQ-INT4
Text Generation
• Updated • 212k
• 109
hugging-quants/Llama-3.2-1B-Instruct-Q4_K_M-GGUF
Text Generation
• 1B • Updated • 40.9k
• 20
hugging-quants/Meta-Llama-3.1-70B-Instruct-GPTQ-INT4
Text Generation
• 71B • Updated • 5.58k
• 23
hugging-quants/Meta-Llama-3.1-8B-Instruct-GPTQ-INT4
Text Generation
• 8B • Updated • 12.4k
• 42
meta-llama/Llama-Guard-3-1B-INT4
Text Generation
• Updated • 24
• 27
meta-llama/Llama-3.2-3B-Instruct-QLORA_INT4_EO8
Text Generation
• Updated • 125
• 72
meta-llama/Llama-3.2-3B-Instruct-SpinQuant_INT4_EO8
Text Generation
• Updated • 49
• 39
meta-llama/Llama-3.2-1B-Instruct-QLORA_INT4_EO8
Text Generation
• Updated • 66
• 47
RedHatAI/Mistral-7B-Instruct-v0.3-GPTQ-4bit
Text Generation
• 7B • Updated • 3.25k
• 25
RedHatAI/Mistral-7B-Instruct-v0.3-quantized.w4a16
Text Generation
• 7B • Updated • 134
• 2
RedHatAI/Llama-2-7b-chat-quantized.w4a16
Text Generation
• 7B • Updated • 64
RedHatAI/Meta-Llama-3-8B-Instruct-quantized.w4a16
Text Generation
• 8B • Updated • 116
• 2
RedHatAI/Meta-Llama-3-70B-Instruct-quantized.w4a16
Text Generation
• 71B • Updated • 157
• 2
RedHatAI/gemma-2-2b-it-quantized.w4a16
Text Generation
• 1B • Updated • 111
• 1
RedHatAI/gemma-2-9b-it-quantized.w4a16
Text Generation
• 3B • Updated • 258
• 2
RedHatAI/Mistral-Nemo-Instruct-2407-quantized.w4a16
Text Generation
• 3B • Updated • 1.71k
• 4
RedHatAI/Meta-Llama-3.1-70B-Instruct-quantized.w4a16
Text Generation
• 71B • Updated • 145k
• 32
nvidia/Mistral-7B-Instruct-v0.3-ONNX-INT4
OpenVINO/mistral-7b-instruct-v0.1-int4-ov
Text Generation
• Updated • 21
OpenVINO/Mistral-7B-Instruct-v0.2-int4-ov
Text Generation
• Updated • 525
• 1
Text Generation
• 72B • Updated • 1.62k
• 47
Text Generation
• 14B • Updated • 145
• 100
Text Generation
• 8B • Updated • 823
• 75
Text Generation
• Updated • 305
• 36
Qwen/Qwen1.5-110B-Chat-GPTQ-Int4
Text Generation
• 111B • Updated • 54
• 18
Qwen/Qwen1.5-1.8B-Chat-GPTQ-Int4
Text Generation
• 2B • Updated • 152
• 7
Qwen/Qwen1.5-MoE-A2.7B-Chat-GPTQ-Int4
Text Generation
• 14B • Updated • 7.62k
• 50
Qwen/Qwen1.5-4B-Chat-GPTQ-Int4
Text Generation
• 4B • Updated • 92
• 6
Qwen/Qwen1.5-72B-Chat-GPTQ-Int4
Text Generation
• 72B • Updated • 3.22k
• 37
Qwen/Qwen1.5-4B-Chat-GGUF
Text Generation
• 4B • Updated • 848
• 16
Qwen/Qwen1.5-0.5B-Chat-GGUF
Text Generation
• 0.6B • Updated • 7.4k
• 35
Qwen/Qwen1.5-7B-Chat-GGUF
Text Generation
• 8B • Updated • 945
• 70
Qwen/CodeQwen1.5-7B-Chat-GGUF
Text Generation
• 7B • Updated • 995
• 110
Qwen/Qwen2.5-1.5B-Instruct-GPTQ-Int4
Text Generation
• 2B • Updated • 1.23k
• 3
Qwen/Qwen2.5-0.5B-Instruct-GPTQ-Int4
Text Generation
• 0.5B • Updated • 2.47k
• 9
Qwen/Qwen2.5-0.5B-Instruct-GGUF
Text Generation
• 0.6B • Updated • 89.8k
• 93
Qwen/Qwen2-1.5B-Instruct-GGUF
Text Generation
• 2B • Updated • 24.1k
• 29
Qwen/Qwen2-0.5B-Instruct-GGUF
Text Generation
• 0.5B • Updated • 48.5k
• 72
Qwen/Qwen2-7B-Instruct-GGUF
Text Generation
• 8B • Updated • 9.69k
• 179
Qwen/Qwen2-0.5B-Instruct-GPTQ-Int4
Text Generation
• 0.6B • Updated • 184
• 15
Qwen/Qwen2-1.5B-Instruct-GPTQ-Int4
Text Generation
• 2B • Updated • 51k
• 5
Qwen/Qwen2-72B-Instruct-GPTQ-Int4
Text Generation
• 73B • Updated • 196
• 33
Qwen/Qwen2-57B-A14B-Instruct-GPTQ-Int4
Text Generation
• 57B • Updated • 400
• 23