# shenwen-coderV2-F16-GGUF

## Model Overview

shenwen-coderV2-F16-GGUF is an FP16 GGUF conversion of shenwen-coderV2-Instruct, for use with llama.cpp and compatible tools.
## Quantization Details

| Attribute | Value |
|---|---|
| Format | GGUF |
| Quantization | F16 (float16) |
| File Size | ~949 MB |
| Original Size | ~949 MB |
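As a rough sanity check on the file size above: an F16 GGUF stores about two bytes per weight, plus a small amount of metadata. The sketch below is illustrative only; the ~494M parameter count is an assumption based on the base model's "0.5B" name, and the exact count (and hence the exact file size) will differ slightly.

```python
# Rough estimate of an F16 GGUF file size: ~2 bytes per parameter.
# Metadata and tokenizer data add a little on top, so this is an
# order-of-magnitude check, not an exact figure.

def f16_gguf_size_mb(n_params: float) -> float:
    """Approximate F16 GGUF size in MB (2 bytes per parameter)."""
    return n_params * 2 / 1e6

size = f16_gguf_size_mb(0.494e9)  # assumed ~494M params for a "0.5B" model
print(f"{size:.0f} MB")  # 988 MB, the same order as the ~949 MB listed above
```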
## Usage with llama.cpp

### Prerequisites

```bash
# Clone and build the llama.cpp CLI tools
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
mkdir build && cd build
cmake .. && make -j
```
### Running Inference

```bash
# Download the model
wget https://huggingface.co/shenwenAI/shenwen-coderV2-GGUF/resolve/main/f16/shenwen-coderV2-F16.gguf

# Run inference (from the llama.cpp root)
./build/bin/llama-cli -m shenwen-coderV2-F16.gguf -n 512 -p "Write a Python function to calculate factorial:"
```
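The same invocation can also be scripted from Python with the standard library's `subprocess` module. This is a sketch, not part of llama.cpp; it assumes the build layout from the Prerequisites step and reuses the exact flags from the CLI command above.

```python
# Minimal Python wrapper around the llama-cli invocation shown above.
# Assumes llama.cpp was built as in the Prerequisites section and the
# GGUF file sits in the llama.cpp root directory.
import os
import subprocess

def llama_cli_cmd(model: str, prompt: str, n_predict: int = 512) -> list[str]:
    """Build the argument list for the llama-cli call used above."""
    return [
        "./build/bin/llama-cli",
        "-m", model,           # path to the GGUF model file
        "-n", str(n_predict),  # number of tokens to generate
        "-p", prompt,          # the prompt text
    ]

if __name__ == "__main__" and os.path.exists("./build/bin/llama-cli"):
    # Only runs when the binary is actually present.
    subprocess.run(
        llama_cli_cmd(
            "shenwen-coderV2-F16.gguf",
            "Write a Python function to calculate factorial:",
        ),
        check=True,
    )
```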
## Usage with swllm.cpp (Optimized Code Generation)

For optimized code generation, we recommend our custom swllm.cpp tool:

```bash
# Clone swllm.cpp
git clone https://github.com/shenwenAI/swllm.cpp
cd swllm.cpp

# Build
mkdir build && cd build
cmake .. && make -j

# Run with this model (from the swllm.cpp root)
./build/bin/swllm-cli -m shenwen-coderV2-F16.gguf -n 512 -p "Write a Python function to calculate factorial:"
```

swllm.cpp adds code-generation-specific optimizations that improve both throughput and output quality.
## Model Source

This GGUF model is converted from shenwenAI/shenwen-coderV2-Instruct, which is based on Qwen2.5-Coder-0.5B-Instruct.

## License

Apache 2.0 - see LICENSE.
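After downloading, a file can be checked locally for a valid GGUF header before loading it: the GGUF container starts with the ASCII magic `GGUF` followed by a little-endian uint32 format version. The helper below is an illustration, not part of llama.cpp or this repository.

```python
# Sanity-check the first 8 bytes of a file against the GGUF header
# layout: 4-byte ASCII magic "GGUF", then a little-endian uint32
# format version.
import struct

def read_gguf_header(path: str):
    """Return (magic_ok, version) from the first 8 bytes of a file."""
    with open(path, "rb") as f:
        head = f.read(8)
    if len(head) < 8:
        return False, None
    magic = head[:4]
    version = struct.unpack("<I", head[4:8])[0]
    return magic == b"GGUF", version
```

For example, `read_gguf_header("shenwen-coderV2-F16.gguf")` should report a matching magic on a good download, while a truncated or HTML error-page download will not.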
## Acknowledgments

This model builds on Qwen2.5-Coder-0.5B-Instruct; thanks to the Qwen team, and to the llama.cpp project for the GGUF toolchain.
## Connect With Us

- GitHub: https://github.com/shenwenAI
- HuggingFace: https://huggingface.co/shenwenAI
- Twitter/X: https://x.com/shenwenai

If this model is helpful, please consider giving us a star on GitHub and following us on social media!