Volko (Volko76)
AI & ML interests
Quantization, Fine-tune, Agentic Frameworks
Recent Activity
- Liked a model about 3 hours ago: ssweens/deepseek-ai__DeepSeek-V4-Flash-GGUF-YMMV
- New activity about 3 hours ago in ssweens/deepseek-ai__DeepSeek-V4-Flash-GGUF-YMMV: "I would love to see the iq2xxs"
- New activity 6 days ago in inferencerlabs/MiMo-V2.5-Pro-MLX-4.3bit-INF: "128gb seems suspiciously low"

Organizations
Qwen2.5 Coder Base GGUF
A list of Qwen2.5 Coder base models quantized to GGUF.
GGUF Quantizations
A quantization format with both CPU and GPU support; it is currently the most widely used quantization method. Read more here: https://github.com/ggerganov/llama.cpp
- Volko76/Qwen2.5-Coder-0.5B-Instruct-GGUF (Text Generation • 0.5B • Updated • 60)
- Volko76/Qwen2.5-Coder-1.5B-Instruct-GGUF (Text Generation • 2B • Updated • 72)
- Volko76/Qwen2.5-Coder-3B-Instruct-GGUF (Text Generation • 3B • Updated • 61)
- Volko76/Qwen2.5-Coder-7B-Instruct-GGUF (Text Generation • 8B • Updated • 69)
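The on-disk size of a GGUF quant scales roughly with parameter count times bits-per-weight. A minimal sketch of that arithmetic (the 4.5 bits-per-weight figure is an illustrative assumption for a mid-range quant, not a value from this page):

```python
def gguf_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Rough on-disk size of a quantized model: params * bpw / 8 bits, in GB."""
    return n_params * bits_per_weight / 8 / 1e9

# e.g. the 7B instruct model at an assumed ~4.5 bpw quant:
print(round(gguf_size_gb(7e9, 4.5), 1))  # → 3.9
```

This ignores metadata and per-block scales, so real GGUF files come out slightly larger, but it is a useful first estimate when picking a quant that fits in RAM.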
Qwen2.5 Coder Instruct GGUF
A list of Qwen2.5 Coder instruct models quantized to GGUF.
- Volko76/Qwen2.5-Coder-0.5B-Instruct-GGUF (Text Generation • 0.5B • Updated • 60)
- Volko76/Qwen2.5-Coder-1.5B-Instruct-GGUF (Text Generation • 2B • Updated • 72)
- Volko76/Qwen2.5-Coder-3B-Instruct-GGUF (Text Generation • 3B • Updated • 61)
- Volko76/Qwen2.5-Coder-7B-Instruct-GGUF (Text Generation • 8B • Updated • 69)
OpenCoder GGUF
A fully open-source small coding model quantized to GGUF.
EXL2 Quantizations
A collection of models quantized to EXL2, one of the fastest quantization methods: https://github.com/turboderp/exllamav2
- Volko76/Qwen2.5-Coder-0.5B-Instruct-1.0bpw-exl2 (Text Generation • Updated)
- Volko76/Qwen2.5-Coder-0.5B-Instruct-2.0bpw-exl2 (Text Generation • Updated)
- Volko76/Qwen2.5-Coder-0.5B-Instruct-3.0bpw-exl2 (Text Generation • Updated)
- Volko76/Qwen2.5-Coder-0.5B-Instruct-4.5bpw-exl2 (Text Generation • Updated)
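EXL2 repos in this collection encode the target bits-per-weight directly in the name (1.0bpw through 4.5bpw), so the VRAM needed for the weights alone can be estimated the same way as file size. A hedged sketch; KV cache and activation memory are deliberately excluded:

```python
def exl2_weights_mb(n_params: float, bpw: float) -> float:
    """Approximate VRAM for model weights alone, in MB (no KV cache, no activations)."""
    return n_params * bpw / 8 / 1e6

# The 0.5B model across the bpw variants listed in this collection:
for bpw in (1.0, 2.0, 3.0, 4.5):
    print(f"{bpw} bpw ≈ {exl2_weights_mb(0.5e9, bpw):.1f} MB")
```

Since EXL2 allows fractional average bits-per-weight, the same formula applies to any bpw target, which is why the collection can offer a 4.5bpw variant alongside the integer ones.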
EXL3 Quantizations