shc2012 committed on
Commit 22939f8 · verified · 1 Parent(s): 099d52a

Update README with swllm.cpp usage and social links

Files changed (1)
  1. README.md +51 -234
README.md CHANGED
@@ -3,287 +3,104 @@ AIGC:
  ContentProducer: Minimax Agent AI
  ContentPropagator: Minimax Agent AI
  Label: AIGC
- ProduceID: b4bb3d57bc0ce8c354e4bcb050972fe0
- PropagateID: b4bb3d57bc0ce8c354e4bcb050972fe0
- ReservedCode1: 304402200e99b598a461cba050f038aaded8aa408584562e393afc88fa56d87d1c4bb8e702204cc29b8225838f04ab15d48aec68760a1d3a792627f04a1bb9e4f7f8a8d9162c
- ReservedCode2: 3045022100d0dcfefede67eb43affd01eb6a0cb9f3b5f18b6a62ced359b5aea8c8e25fcfd60220620b06dda8454d7580fe66bd02ed1194bbe9b88b34cd19d325fc83866a61b600
  ---

  # shenwen-coderV2-Instruct

- <p align="center">
-   <img src="https://huggingface.co/front/assets/huggingface_logo.svg" alt="Hugging Face" width="50" height="50">
- </p>

- <div align="center">

- [![Model](https://img.shields.io/badge/Model-shenwen--coderV2--Instruct-blue.svg)](https://huggingface.co/shenwenAI/shenwen-coderV2-Instruct)
- [![Base Model](https://img.shields.io/badge/Base%20Model-Qwen2.5--Coder--0.5B-orange.svg)](https://huggingface.co/Qwen/Qwen2.5-Coder-0.5B-Instruct)
- [![License](https://img.shields.io/badge/License-Apache%202.0-green.svg)](LICENSE)
- [![Parameters](https://img.shields.io/badge/Parameters-0.5B-yellow.svg)]()

- </div>

- ## Overview

- **shenwen-coderV2-Instruct** is an instruction-tuned code generation model built upon [Qwen2.5-Coder-0.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-0.5B-Instruct), further enhanced with high-quality code data inspired by the [Zeta](https://zed.dev/blog/zeta2) training methodology. This model is designed to provide efficient and accurate code generation, completion, and reasoning capabilities across a wide range of programming languages.
-
- ## Model Summary
-
- | Attribute | Value |
- |-----------|-------|
- | **Model Name** | shenwen-coderV2-Instruct |
- | **Base Model** | Qwen2.5-Coder-0.5B-Instruct |
- | **Training Data** | Enhanced with zeta-style code data |
- | **Parameters** | ~0.5 Billion |
- | **Context Length** | 32K tokens |
- | **License** | Apache 2.0 |
- | **Developer** | shenwenAI |
-
- ## Key Features
-
- ### 🎯 Core Capabilities
-
- - **Code Generation**: Generate high-quality code snippets from natural language descriptions
- - **Code Completion**: Intelligent code completion for various programming scenarios
- - **Code Reasoning**: Understand and explain code logic and functionality
- - **Code Fixing**: Identify and fix common coding errors and bugs
-
- ### 🌐 Multi-Language Support
-
- Supports **92+ programming languages**, including but not limited to:
-
- | Popular Languages | Domain-Specific | Modern Languages |
- |-------------------|-----------------|------------------|
- | Python | SQL | Rust |
- | JavaScript/TypeScript | HTML/CSS | Go |
- | Java | Shell/Bash | Swift |
- | C/C++ | JSON/YAML | Kotlin |
- | C# | Markdown | Scala |
-
- ### ⚡ Lightweight & Efficient
-
- - Only **0.5 billion parameters** - ideal for resource-constrained environments
- - Fast inference speed with a low memory footprint
- - Can run efficiently on consumer-grade GPUs and even CPUs
- - Perfect for edge computing and mobile applications
-
- ## Model Architecture
-
- Based on the robust Qwen2.5 architecture with specialized enhancements for code tasks:
-
- ```
- ┌─────────────────────────────────────────────┐
- │          shenwen-coderV2-Instruct           │
- ├─────────────────────────────────────────────┤
- │ Base Model: Qwen2.5-Coder-0.5B              │
- │   ├── Transformer Architecture              │
- │   ├── RoPE Position Encoding                │
- │   ├── SwiGLU Activation                     │
- │   ├── RMSNorm Normalization                 │
- │   └── Attention with QKV Bias               │
- ├─────────────────────────────────────────────┤
- │ Enhancements:                               │
- │   ├── Instruction Tuning                    │
- │   └── Zeta-style Code Data Training         │
- └─────────────────────────────────────────────┘
- ```
-
- **Architecture Details:**
-
- | Parameter | Value |
- |-----------|-------|
- | Hidden Size | 896 |
- | Number of Layers | 24 |
- | Query Heads | 14 |
- | KV Heads | 2 |
- | Intermediate Size | 4,864 |
- | Vocabulary Size | 151,646 |
-
- ## Training Details
-
- ### Base Model Training (Qwen2.5-Coder)
-
- - **Training Tokens**: 5.5 trillion tokens
- - **Data Sources**: Source code, text-code grounding, synthetic data
- - **Context Length**: Up to 128K tokens (base model), optimized for 32K
-
- ### Fine-tuning Approach
-
- The `shenwen-coderV2-Instruct` model is enhanced through:
-
- 1. **Instruction Tuning**: Fine-tuned on high-quality instruction-response pairs
- 2. **Zeta-style Data**: Incorporates code patterns and structures from real-world repositories
- 3. **Preference Alignment**: Optimized for human coding preferences and best practices

  ## Usage

- ### Installation
-
- ```bash
- pip install "transformers>=4.35.0"
- pip install "accelerate>=0.20.0"
- pip install torch
- ```
-
- ### Quick Start

  ```python
  from transformers import AutoModelForCausalLM, AutoTokenizer

- # Load model and tokenizer
  model_name = "shenwenAI/shenwen-coderV2-Instruct"
  tokenizer = AutoTokenizer.from_pretrained(model_name)
- model = AutoModelForCausalLM.from_pretrained(
-     model_name,
-     torch_dtype="auto",
-     device_map="auto"
- )
-
- # Code generation example
- prompt = "Write a Python function to calculate the factorial of a number using recursion:"
- messages = [
-     {"role": "user", "content": prompt}
- ]
- text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
- inputs = tokenizer([text], return_tensors="pt").to(model.device)

  outputs = model.generate(**inputs, max_new_tokens=512)
- response = tokenizer.decode(outputs[0][len(inputs.input_ids[0]):], skip_special_tokens=True)
- print(response)
- ```
-
- ### Using with Ollama
-
- ```bash
- # Pull the model (if available in the Ollama registry)
- ollama pull shenwenAI/shenwen-coderV2-Instruct
-
- # Run inference
- ollama run shenwenAI/shenwen-coderV2-Instruct
  ```

- ### Using with vLLM

  ```python
  from vllm import LLM, SamplingParams

  llm = LLM(model="shenwenAI/shenwen-coderV2-Instruct")
- sampling_params = SamplingParams(temperature=0.7, max_tokens=512)

- outputs = llm.generate(["Write a JavaScript function to reverse a string:"], sampling_params)
  print(outputs[0].outputs[0].text)
  ```

- ## Benchmark Performance
-
- The base model (Qwen2.5-Coder-0.5B) demonstrates strong performance on code-related benchmarks:
-
- | Benchmark | Description | Performance |
- |-----------|-------------|-------------|
- | HumanEval | Python code generation | Competitive |
- | MBPP | Python problem solving | Strong |
- | MultiPL-E | Multi-language generation | Excellent |
- | McEval | Multi-language code evaluation | Strong |
- | CodeGPT | Code understanding | Good |
-
- > Note: Actual performance may vary based on the specific fine-tuning configuration. Users are encouraged to conduct domain-specific evaluations.
-
- ## Comparison with Base Model
-
- | Feature | Qwen2.5-Coder-0.5B | shenwen-coderV2-Instruct |
- |---------|--------------------|--------------------------|
- | Code Generation | ✅ | ✅ Enhanced |
- | Instruction Following | Standard | Optimized |
- | Real-world Patterns | Limited | Expanded with zeta data |
- | User Preferences | Basic alignment | Improved alignment |
-
- ## Limitations
-
- 1. **Model Size**: While optimized for efficiency, the 0.5B-parameter model may not match larger models (7B, 32B) on complex tasks
- 2. **Context Window**: Optimized for 32K context; performance may degrade on very long inputs
- 3. **Language Coverage**: Although 92+ languages are supported, proficiency varies
- 4. **Safety**: Always review generated code for security vulnerabilities and correctness
-
- ## Best Practices
-
- ### Do's ✅

- - Review and test all generated code before production use
- - Use appropriate temperature settings for different tasks
- - Provide clear, specific prompts for better results
- - Validate generated code against your specific requirements

- ### Don'ts ❌
-
- - Don't use unverified code directly in production
- - Don't rely solely on the model for security-critical code
- - Don't expect perfect code for highly specialized domains
-
- ## Hardware Requirements
-
- | Configuration | Minimum | Recommended |
- |---------------|---------|-------------|
- | GPU VRAM | 2GB | 4GB+ |
- | RAM | 8GB | 16GB+ |
- | Storage | 1GB | 2GB+ |
-
- ### CPU Inference

- The model can run on CPU with acceptable performance for smaller tasks:

- ```python
- model = AutoModelForCausalLM.from_pretrained(
-     model_name,
-     torch_dtype="float32",
-     device_map="cpu"
- )
  ```

- ## Contributing

- Contributions are welcome! Please feel free to submit issues and pull requests:

- 1. Fork the repository
- 2. Create a feature branch (`git checkout -b feature/amazing-feature`)
- 3. Commit your changes (`git commit -m 'Add amazing feature'`)
- 4. Push to the branch (`git push origin feature/amazing-feature`)
- 5. Open a Pull Request

- ## Acknowledgments
-
- - **Alibaba Qwen Team** for developing the excellent [Qwen2.5-Coder](https://github.com/QwenLM/Qwen) series
- - **Zed Industries** for pioneering the [Zeta](https://zed.dev/blog/edit-prediction) edit prediction model
- - **Hugging Face** for the open-source ML ecosystem

  ## License

- This model is released under the **Apache 2.0 License**. Please refer to the LICENSE file for more details.

- ## Citation

- If you use this model in your research, please cite:

- ```bibtex
- @misc{shenwen-coderV2-Instruct,
-     author = {shenwenAI},
-     title = {shenwen-coderV2-Instruct: Enhanced Code Generation Model},
-     year = {2025},
-     publisher = {Hugging Face},
-     url = {https://huggingface.co/shenwenAI/shenwen-coderV2-Instruct}
- }
- ```

- ## Contact
-
- - **Author**: shenwenAI
- - **Hugging Face**: [shenwenAI](https://huggingface.co/shenwenAI)
- - **Issues**: Please open an issue on this repository for bugs or feature requests

  ---

- <div align="center">
-
- **If you find this model useful, please give it a ⭐ on Hugging Face!**
-
- </div>
  ContentProducer: Minimax Agent AI
  ContentPropagator: Minimax Agent AI
  Label: AIGC
+ ProduceID: f3e961de220519135b7936401f9c497b
+ PropagateID: f3e961de220519135b7936401f9c497b
+ ReservedCode1: 30450221008b926720cc537a337609a6396807cefd6f2465e1a733f88cb72655e7ed3b5a1e0220073082e844d423175f71300fa33a443d56620f52022574850f68f6c58be981c9
+ ReservedCode2: 3045022100cee9a5ea6ceee0d1355538f5b52d08108adca91f6b0bd514a775e3cd43616f5e02200b1208fe8656e20f91c6bf8f9d6f4e07d3780abe35035a516e3fe4ffb4de7e6a
  ---

  # shenwen-coderV2-Instruct

+ ![Hugging Face](https://huggingface.co/front/assets/huggingface_logo.svg)

+ [![Model](https://img.shields.io/badge/Model-shenwen--coderV2--Instruct-blue.svg)](https://huggingface.co/shenwenAI/shenwen-coderV2-Instruct) [![Format](https://img.shields.io/badge/Format-Safetensors-green.svg)](https://huggingface.co/shenwenAI/shenwen-coderV2-Instruct) [![License](https://img.shields.io/badge/License-Apache%202.0-green.svg)](https://huggingface.co/shenwenAI/shenwen-coderV2-Instruct)

+ ## Model Overview

+ **shenwen-coderV2-Instruct** is an instruction-tuned code generation model based on Qwen2.5-Coder-0.5B-Instruct, optimized for various code generation tasks.

+ ## Model Details

+ - **Base Model**: Qwen2.5-Coder-0.5B-Instruct
+ - **Tensor Type**: BF16
+ - **Parameters**: 0.5B
+ - **Architecture**: qwen2
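Editorial note: the **Parameters: 0.5B** figure can be sanity-checked against the architecture numbers the previous README listed (hidden size 896, 24 layers, 14 query / 2 KV heads, intermediate size 4,864, vocabulary 151,646). A rough back-of-the-envelope sketch, assuming a standard Qwen2-style decoder with tied embeddings and 64-dimensional heads (both assumptions, not confirmed by this card):

```python
# Rough parameter-count estimate for a Qwen2-style decoder, using the
# architecture table from the previous README.
hidden = 896
layers = 24
q_heads, kv_heads = 14, 2
head_dim = hidden // q_heads          # 64
intermediate = 4864
vocab = 151646

# Token embedding (assumed tied with the LM head).
embed = vocab * hidden

# Per-layer attention: Q/O projections, grouped K/V projections, QKV biases.
kv_dim = kv_heads * head_dim          # 128
attn = 2 * hidden * hidden + 2 * hidden * kv_dim + (hidden + 2 * kv_dim)

# Per-layer SwiGLU MLP: gate, up, and down projections.
mlp = 3 * hidden * intermediate

# Two RMSNorm weight vectors per layer, plus the final norm.
norms = 2 * hidden

total = embed + layers * (attn + mlp + norms) + hidden
print(f"~{total / 1e9:.2f}B parameters")  # ~0.49B
```

The estimate lands within a few percent of the advertised 0.5B; small deviations would come from bias and norm bookkeeping details.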
  ## Usage

+ ### Using Transformers

  ```python
  from transformers import AutoModelForCausalLM, AutoTokenizer

  model_name = "shenwenAI/shenwen-coderV2-Instruct"
  tokenizer = AutoTokenizer.from_pretrained(model_name)
+ model = AutoModelForCausalLM.from_pretrained(model_name)

+ prompt = "Write a Python function to calculate factorial:"
+ inputs = tokenizer(prompt, return_tensors="pt")
  outputs = model.generate(**inputs, max_new_tokens=512)
+ print(tokenizer.decode(outputs[0], skip_special_tokens=True))
  ```
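Editorial note: the example prompt asks the model for a factorial function. One practical way to check what the model returns is to compare it against a small reference implementation; a minimal sketch (this is illustration, not model output):

```python
def factorial(n: int) -> int:
    """Reference factorial used to check model-generated code."""
    if n < 0:
        raise ValueError("factorial is undefined for negative n")
    result = 1
    for i in range(2, n + 1):
        result *= i
    return result

# Spot-check a few values a generated solution should reproduce.
print(factorial(0), factorial(5), factorial(10))  # 1 120 3628800
```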

+ ### Using vLLM

  ```python
  from vllm import LLM, SamplingParams

  llm = LLM(model="shenwenAI/shenwen-coderV2-Instruct")
+ sampling_params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=512)

+ prompts = ["Write a Python function to calculate factorial:"]
+ outputs = llm.generate(prompts, sampling_params)
  print(outputs[0].outputs[0].text)
  ```

+ ## Usage with swllm.cpp (Optimized Code Generation)

+ For optimized code generation, we recommend using our custom **swllm.cpp** tool:

+ ```bash
+ # Clone swllm.cpp
+ git clone https://github.com/shenwenAI/swllm.cpp
+ cd swllm.cpp

+ # Build with this model
+ # Convert the model to GGUF format first if needed

+ # Run inference
+ ./build/bin/swllm-cli -m path/to/model.gguf -n 512 -p "Write a Python function to calculate factorial:"
  ```

+ **swllm.cpp** provides optimized code generation capabilities for enhanced performance and quality.

+ ## Quantization

+ For quantized versions, please visit: [shenwenAI/shenwen-coderV2-GGUF](https://huggingface.co/shenwenAI/shenwen-coderV2-GGUF)

+ | Quantization | Size |
+ | --- | --- |
+ | Q2_K | 339 MB |
+ | Q4_K_M | 398 MB |
+ | Q5_K_M | 420 MB |
+ | Q8_0 | 531 MB |
+ | F16 | 994 MB |
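Editorial note: the file sizes above imply an effective bits-per-weight figure for each quantization. A quick arithmetic check, assuming decimal megabytes and roughly 494M parameters (both are assumptions for this 0.5B-class model, so treat the results as estimates):

```python
# Effective bits per weight implied by the GGUF sizes in the table above.
PARAMS = 494_000_000          # assumed parameter count (approximate)
MB = 1_000_000                # assumed decimal megabytes

sizes_mb = {"Q2_K": 339, "Q4_K_M": 398, "Q5_K_M": 420, "Q8_0": 531, "F16": 994}

for name, size in sizes_mb.items():
    bpw = size * MB * 8 / PARAMS
    print(f"{name}: {bpw:.1f} bits/weight")
```

F16 coming out near 16 bits/weight supports the decimal-MB reading; the quantized variants exceed their nominal bit widths because embeddings and some tensors are typically stored at higher precision.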

  ## License

+ Apache 2.0 - See [LICENSE](https://huggingface.co/shenwenAI/shenwen-coderV2-Instruct/blob/main/LICENSE)

+ ## Acknowledgments

+ - [Qwen Team](https://github.com/QwenLM/Qwen) for Qwen2.5-Coder
+ - [shenwenAI](https://huggingface.co/shenwenAI) for model training and optimization

+ ## Connect With Us

+ - **GitHub**: [https://github.com/shenwenAI](https://github.com/shenwenAI)
+ - **HuggingFace**: [https://huggingface.co/shenwenAI](https://huggingface.co/shenwenAI)
+ - **Twitter/X**: [https://x.com/shenwenai](https://x.com/shenwenai)

  ---

+ *If this model is helpful, please consider giving us a star on GitHub and following us on social media!*