
shenwen-coderV2-Instruct


Model Overview

shenwen-coderV2-Instruct is an instruction-tuned code generation model based on Qwen2.5-Coder-0.5B-Instruct, optimized for various code generation tasks.

Model Details

  • Base Model: Qwen2.5-Coder-0.5B-Instruct
  • Tensor Type: BF16
  • Parameters: 0.5B
  • Architecture: qwen2
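
As a rough sanity check (an illustration, not a figure from the model card), a 0.5B-parameter model stored in BF16 uses about 2 bytes per parameter, so the weights alone occupy just under 1 GB:

```python
# Rough BF16 weight-memory estimate: 2 bytes per parameter.
# 500_000_000 is the nominal "0.5B" parameter count from the list above.
def bf16_weight_size_gb(n_params: int) -> float:
    return n_params * 2 / 1024**3

print(f"{bf16_weight_size_gb(500_000_000):.2f} GB")  # ~0.93 GB
```

This lines up with the F16 GGUF size listed in the Quantization section below.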

Usage

Using Transformers

from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "shenwenAI/shenwen-coderV2-Instruct"

# Load the tokenizer and model; torch_dtype="auto" picks up the BF16 weights
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto")

# Tokenize the prompt and generate up to 512 new tokens
prompt = "Write a Python function to calculate factorial:"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
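
Because this is an instruction-tuned model, chat-style prompting generally works better than raw completion; in practice you would pass a message list to `tokenizer.apply_chat_template`. As a minimal sketch of what that template produces (assuming the model inherits the ChatML-style format of the Qwen2.5 base model; `build_chatml_prompt` is an illustrative helper, not part of any library):

```python
# Sketch of a ChatML-style prompt, as used by Qwen2.5-family models.
# Illustrative only; prefer tokenizer.apply_chat_template in real code.
def build_chatml_prompt(messages):
    parts = [f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n" for m in messages]
    parts.append("<|im_start|>assistant\n")  # generation continues after this tag
    return "".join(parts)

prompt = build_chatml_prompt([
    {"role": "system", "content": "You are a helpful coding assistant."},
    {"role": "user", "content": "Write a Python function to calculate factorial:"},
])
print(prompt)
```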

Using vLLM

from vllm import LLM, SamplingParams

# Load the model into vLLM's inference engine
llm = LLM(model="shenwenAI/shenwen-coderV2-Instruct")

# Nucleus sampling: temperature 0.8, top-p 0.95, at most 512 generated tokens
sampling_params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=512)

prompts = ["Write a Python function to calculate factorial:"]
outputs = llm.generate(prompts, sampling_params)
print(outputs[0].outputs[0].text)

Usage with swllm.cpp (Optimized Code Generation)

For optimized code generation, we recommend using our custom swllm.cpp tool:

# Clone swllm.cpp
git clone https://github.com/shenwenAI/swllm.cpp
cd swllm.cpp

# Build with this model
# Convert model to GGUF format first if needed

# Run inference
./build/bin/swllm-cli -m path/to/model.gguf -n 512 -p "Write a Python function to calculate factorial:"

swllm.cpp is tuned specifically for code-generation inference and can improve both throughput and output quality over a generic runtime.
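
The `swllm-cli` invocation above expects a GGUF file. If you are unsure whether a converted file is valid GGUF, the format begins with the four ASCII magic bytes `GGUF`; a quick check (a generic sketch based on the GGUF spec, unrelated to swllm.cpp's own tooling):

```python
def is_gguf(path: str) -> bool:
    # Per the GGUF specification, files begin with the 4-byte magic b"GGUF"
    with open(path, "rb") as f:
        return f.read(4) == b"GGUF"
```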

Quantization

For quantized versions, please visit: shenwenAI/shenwen-coderV2-GGUF

| Quantization | Size   |
| ------------ | ------ |
| Q2_K         | 339 MB |
| Q4_K_M       | 398 MB |
| Q5_K_M       | 420 MB |
| Q8_0         | 531 MB |
| F16          | 994 MB |
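
As a rough guide to what these sizes mean (an illustration using the nominal 0.5B parameter count, not an official figure), the effective bits per weight can be estimated from file size:

```python
def bits_per_weight(size_mb: float, n_params: int = 500_000_000) -> float:
    # File size includes metadata and mixed-precision tensors, so this is approximate.
    return size_mb * 1024**2 * 8 / n_params

for name, mb in [("Q4_K_M", 398), ("Q8_0", 531), ("F16", 994)]:
    print(f"{name}: ~{bits_per_weight(mb):.1f} bits/weight")
```

Q4_K_M lands near 6.7 bits/weight rather than 4 because K-quants keep some tensors at higher precision.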

License

Apache 2.0 - See LICENSE

Connect With Us


If this model is helpful, please consider giving us a star on GitHub and following us on social media!