shenwen-coderV2-F16-GGUF


Model Overview

shenwen-coderV2-F16-GGUF is an FP16 GGUF conversion of shenwen-coderV2-Instruct, packaged for use with llama.cpp and compatible tools.

Quantization Details

Attribute       Value
Format          GGUF
Quantization    F16 (float16)
File Size       ~949 MB
Original Size   ~949 MB
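Per the GGUF specification, every GGUF file begins with a fixed little-endian header: the ASCII magic GGUF, a format version, a tensor count, and a metadata key/value count. A downloaded file can therefore be sanity-checked in a few lines; this is a minimal sketch (the helper name is ours, not part of any tool mentioned here):

```python
import struct

def read_gguf_header(data: bytes) -> dict:
    """Parse the fixed-size GGUF header: magic, version, tensor count, metadata KV count."""
    magic, version, n_tensors, n_kv = struct.unpack_from("<4sIQQ", data, 0)
    if magic != b"GGUF":
        raise ValueError("not a GGUF file")
    return {"version": version, "tensor_count": n_tensors, "kv_count": n_kv}

# Example with an in-memory header (version 3, no tensors or metadata entries):
sample = b"GGUF" + struct.pack("<IQQ", 3, 0, 0)
print(read_gguf_header(sample))  # {'version': 3, 'tensor_count': 0, 'kv_count': 0}
```

In practice you would pass the first 24 bytes of the downloaded .gguf file to this function before spending time on a full load.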

Usage with llama.cpp

Prerequisites

# Clone and build llama.cpp
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
cmake -B build
cmake --build build --config Release -j

Running Inference

# Download model
wget https://huggingface.co/shenwenAI/shenwen-coderV2-GGUF/resolve/main/f16/shenwen-coderV2-F16.gguf

# Run inference
./build/bin/llama-cli -m shenwen-coderV2-F16.gguf -n 512 -p "Write a Python function to calculate factorial:"
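When scripting batches of prompts, the same llama-cli invocation can be wrapped from Python via the standard library. A minimal sketch; the binary and model paths are assumptions about your local layout, not fixed by llama.cpp:

```python
import subprocess

def run_llama_cli(binary: str, model_path: str, prompt: str, n_predict: int = 512) -> str:
    # Mirrors the llama-cli command line shown above: -m model, -n tokens, -p prompt.
    result = subprocess.run(
        [binary, "-m", model_path, "-n", str(n_predict), "-p", prompt],
        capture_output=True, text=True, check=True,
    )
    return result.stdout

# e.g. run_llama_cli("./build/bin/llama-cli", "shenwen-coderV2-F16.gguf",
#                    "Write a Python function to calculate factorial:")
```

check=True raises if llama-cli exits nonzero (e.g. a bad model path), which keeps failures visible in batch runs.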

Usage with swllm.cpp (Optimized Code Generation)

For optimized code generation, we recommend using our custom swllm.cpp tool:

# Clone swllm.cpp
git clone https://github.com/shenwenAI/swllm.cpp
cd swllm.cpp

# Build
mkdir build && cd build
cmake .. && make -j

# Run with this model
./build/bin/swllm-cli -m shenwen-coderV2-F16.gguf -n 512 -p "Write a Python function to calculate factorial:"

swllm.cpp is tuned specifically for code generation, aiming for better performance and output quality on coding prompts than the generic llama.cpp CLI.

Model Source

This GGUF model is converted from shenwenAI/shenwen-coderV2-Instruct, which is based on Qwen2.5-Coder-0.5B-Instruct.
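Since the base model is Qwen2.5-Coder-0.5B-Instruct, the chat format is ChatML. Tools like llama-cli usually apply the template from the GGUF metadata automatically, but if you drive the model through a raw completion API, prompts need to be wrapped yourself. A sketch, assuming the template is inherited unchanged (verify against the model's tokenizer config; the system message is a placeholder):

```python
def chatml_prompt(user: str, system: str = "You are a helpful coding assistant.") -> str:
    # ChatML framing as used by Qwen2.5-family models.
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

print(chatml_prompt("Write a Python function to calculate factorial:"))
```

Generation should then stop at the <|im_end|> token so the model does not continue into a new turn.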

License

Apache 2.0. See the LICENSE file in the repository.

Connect With Us

If this model is helpful, please consider giving us a star on GitHub and following us on social media!

Model size: 0.5B parameters
Architecture: qwen2