roshniramesh's Collections: int4 llm
Text Generation
• Updated • 11
• 1
nvidia/Gemma-2b-it-ONNX-INT4
nvidia/Meta-Llama-3.1-8B-Instruct-ONNX-INT4
Updated • 14
• 8
nvidia/Meta-Llama-3.2-3B-Instruct-ONNX-INT4
nvidia/Phi-3.5-mini-Instruct-ONNX-INT4
nvidia/Mistral-Nemo-12B-Instruct-ONNX-INT4
nvidia/Nemotron-Mini-4B-Instruct-ONNX-INT4
meta-llama/Llama-3.2-1B-Instruct-SpinQuant_INT4_EO8
Text Generation
• Updated • 27
• 39
hugging-quants/gemma-2-9b-it-AWQ-INT4
Text Generation
• 9B • Updated • 9.11k
• 9
Qwen/Qwen2-7B-Instruct-GPTQ-Int4
Text Generation
• 8B • Updated • 687
• 28
hugging-quants/Meta-Llama-3.1-8B-Instruct-AWQ-INT4
Text Generation
• Updated • 183k
• 91
RedHatAI/Meta-Llama-3.1-8B-Instruct-quantized.w4a16
Text Generation
• 8B • Updated • 74.9k
• 30
ModelCloud/Meta-Llama-3.1-8B-gptq-4bit
Text Generation
• 8B • Updated • 206
hugging-quants/Llama-3.2-3B-Instruct-Q4_K_M-GGUF
Text Generation
• 3B • Updated • 33.6k
• 31
hugging-quants/Meta-Llama-3.1-70B-Instruct-AWQ-INT4
Text Generation
• Updated • 386k
• 109
hugging-quants/Llama-3.2-1B-Instruct-Q4_K_M-GGUF
Text Generation
• 1B • Updated • 42.5k
• 26
hugging-quants/Meta-Llama-3.1-70B-Instruct-GPTQ-INT4
Text Generation
• 71B • Updated • 1.39k
• 23
hugging-quants/Meta-Llama-3.1-8B-Instruct-GPTQ-INT4
Text Generation
• 8B • Updated • 4.19k
• 42
meta-llama/Llama-Guard-3-1B-INT4
Text Generation
• Updated • 13
• 29
meta-llama/Llama-3.2-3B-Instruct-QLORA_INT4_EO8
Text Generation
• Updated • 74
• 74
meta-llama/Llama-3.2-3B-Instruct-SpinQuant_INT4_EO8
Text Generation
• Updated • 12
• 40
meta-llama/Llama-3.2-1B-Instruct-QLORA_INT4_EO8
Text Generation
• Updated • 51
• 48
RedHatAI/Mistral-7B-Instruct-v0.3-GPTQ-4bit
Text Generation
• 7B • Updated • 1.16k
• 25
RedHatAI/Mistral-7B-Instruct-v0.3-quantized.w4a16
Text Generation
• 7B • Updated • 460
• 2
RedHatAI/Llama-2-7b-chat-quantized.w4a16
Text Generation
• 7B • Updated • 7
RedHatAI/Meta-Llama-3-8B-Instruct-quantized.w4a16
Text Generation
• 8B • Updated • 1.05k
• 2
RedHatAI/Meta-Llama-3-70B-Instruct-quantized.w4a16
Text Generation
• 71B • Updated • 16
• 2
RedHatAI/gemma-2-2b-it-quantized.w4a16
Text Generation
• 3B • Updated • 92
• 1
RedHatAI/gemma-2-9b-it-quantized.w4a16
Text Generation
• 10B • Updated • 125
• 2
RedHatAI/Mistral-Nemo-Instruct-2407-quantized.w4a16
Text Generation
• 12B • Updated • 364
• 4
RedHatAI/Meta-Llama-3.1-70B-Instruct-quantized.w4a16
Text Generation
• 71B • Updated • 77.1k
• 33
nvidia/Mistral-7B-Instruct-v0.3-ONNX-INT4
OpenVINO/mistral-7b-instruct-v0.1-int4-ov
Text Generation
• Updated • 40
OpenVINO/Mistral-7B-Instruct-v0.2-int4-ov
Text Generation
• Updated • 76
• 1
Text Generation
• 72B • Updated • 69
• 47
Text Generation
• 14B • Updated • 738
• 100
Text Generation
• 8B • Updated • 586
• 75
Text Generation
• 2B • Updated • 219
• 36
Qwen/Qwen1.5-110B-Chat-GPTQ-Int4
Text Generation
• 111B • Updated • 125
• 18
Qwen/Qwen1.5-1.8B-Chat-GPTQ-Int4
Text Generation
• 2B • Updated • 130
• 7
Qwen/Qwen1.5-MoE-A2.7B-Chat-GPTQ-Int4
Text Generation
• 14B • Updated • 3.88k
• 50
Qwen/Qwen1.5-4B-Chat-GPTQ-Int4
Text Generation
• 4B • Updated • 12
• 6
Qwen/Qwen1.5-72B-Chat-GPTQ-Int4
Text Generation
• 72B • Updated • 4.05k
• 37
Qwen/Qwen1.5-4B-Chat-GGUF
Text Generation
• 4B • Updated • 986
• 16
Qwen/Qwen1.5-0.5B-Chat-GGUF
Text Generation
• 0.6B • Updated • 9.97k
• 35
Qwen/Qwen1.5-7B-Chat-GGUF
Text Generation
• 8B • Updated • 1.01k
• 71
Qwen/CodeQwen1.5-7B-Chat-GGUF
Text Generation
• 7B • Updated • 703
• 111
Qwen/Qwen2.5-1.5B-Instruct-GPTQ-Int4
Text Generation
• 2B • Updated • 1.48k
• 3
Qwen/Qwen2.5-0.5B-Instruct-GPTQ-Int4
Text Generation
• 0.5B • Updated • 2.89k
• 9
Qwen/Qwen2.5-0.5B-Instruct-GGUF
Text Generation
• 0.6B • Updated • 199k
• 107
Qwen/Qwen2-1.5B-Instruct-GGUF
Text Generation
• 2B • Updated • 49.5k
• 29
Qwen/Qwen2-0.5B-Instruct-GGUF
Text Generation
• 0.5B • Updated • 10.4k
• 73
Qwen/Qwen2-7B-Instruct-GGUF
Text Generation
• 8B • Updated • 71.9k
• 179
Qwen/Qwen2-0.5B-Instruct-GPTQ-Int4
Text Generation
• 0.6B • Updated • 79
• 15
Qwen/Qwen2-1.5B-Instruct-GPTQ-Int4
Text Generation
• 2B • Updated • 39.5k
• 5
Qwen/Qwen2-72B-Instruct-GPTQ-Int4
Text Generation
• 73B • Updated • 68
• 33
Qwen/Qwen2-57B-A14B-Instruct-GPTQ-Int4
Text Generation
• 57B • Updated • 73
• 23