shenwen-coderV2-F16-GGUF


Model Overview

shenwen-coderV2-F16-GGUF is an FP16 GGUF conversion of shenwen-coderV2-Instruct, packaged for use with llama.cpp and compatible tools.

Quantization Details

Attribute       Value
Format          GGUF
Quantization    F16 (float16)
File Size       ~949 MB
Original Size   ~949 MB
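Per the GGUF specification, every GGUF file begins with a fixed little-endian header: the ASCII magic GGUF, a format version, a tensor count, and a metadata key/value count. A downloaded file can therefore be sanity-checked in a few lines; this is a minimal sketch (the helper name is ours, not part of any tool mentioned here):

```python
import struct

def read_gguf_header(data: bytes) -> dict:
    """Parse the fixed-size GGUF header: magic, version, tensor count, metadata KV count."""
    magic, version, n_tensors, n_kv = struct.unpack_from("<4sIQQ", data, 0)
    if magic != b"GGUF":
        raise ValueError("not a GGUF file")
    return {"version": version, "tensor_count": n_tensors, "kv_count": n_kv}

# Example with an in-memory header (version 3, no tensors or metadata entries):
sample = b"GGUF" + struct.pack("<IQQ", 3, 0, 0)
print(read_gguf_header(sample))  # {'version': 3, 'tensor_count': 0, 'kv_count': 0}
```

In practice you would pass the first 24 bytes of the downloaded .gguf file to this function before spending time on a full load.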

Usage with llama.cpp

Prerequisites

# Clone and build llama.cpp
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
cmake -B build
cmake --build build --config Release -j

Running Inference

# Download model
wget https://huggingface.co/shenwenAI/shenwen-coderV2-GGUF/resolve/main/f16/shenwen-coderV2-F16.gguf

# Run inference
./build/bin/llama-cli -m shenwen-coderV2-F16.gguf -n 512 -p "Write a Python function to calculate factorial:"
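When scripting batches of prompts, the same llama-cli invocation can be wrapped from Python via the standard library. A minimal sketch; the binary and model paths are assumptions about your local layout, not fixed by llama.cpp:

```python
import subprocess

def run_llama_cli(binary: str, model_path: str, prompt: str, n_predict: int = 512) -> str:
    # Mirrors the llama-cli command line shown above: -m model, -n tokens, -p prompt.
    result = subprocess.run(
        [binary, "-m", model_path, "-n", str(n_predict), "-p", prompt],
        capture_output=True, text=True, check=True,
    )
    return result.stdout

# e.g. run_llama_cli("./build/bin/llama-cli", "shenwen-coderV2-F16.gguf",
#                    "Write a Python function to calculate factorial:")
```

check=True raises if llama-cli exits nonzero (e.g. a bad model path), which keeps failures visible in batch runs.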

Usage with swllm.cpp (Optimized Code Generation)

For optimized code generation, we recommend using our custom swllm.cpp tool:

# Clone swllm.cpp
git clone https://github.com/shenwenAI/swllm.cpp
cd swllm.cpp

# Build
mkdir build && cd build
cmake .. && make -j

# Run with this model
./build/bin/swllm-cli -m shenwen-coderV2-F16.gguf -n 512 -p "Write a Python function to calculate factorial:"

swllm.cpp is tuned specifically for code generation, aiming for better performance and output quality on coding prompts than the generic llama.cpp CLI.

Model Source

This GGUF model is converted from shenwenAI/shenwen-coderV2-Instruct, which is based on Qwen2.5-Coder-0.5B-Instruct.
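Since the base model is Qwen2.5-Coder-0.5B-Instruct, the chat format is ChatML. Tools like llama-cli usually apply the template from the GGUF metadata automatically, but if you drive the model through a raw completion API, prompts need to be wrapped yourself. A sketch, assuming the template is inherited unchanged (verify against the model's tokenizer config; the system message is a placeholder):

```python
def chatml_prompt(user: str, system: str = "You are a helpful coding assistant.") -> str:
    # ChatML framing as used by Qwen2.5-family models.
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

print(chatml_prompt("Write a Python function to calculate factorial:"))
```

Generation should then stop at the <|im_end|> token so the model does not continue into a new turn.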

License

Apache 2.0. See the LICENSE file in the repository.

Connect With Us

If this model is helpful, please consider giving us a star on GitHub and following us on social media!

Model size: 0.5B parameters
Architecture: qwen2