scthornton committed on
Commit 964e4a3 · verified · 1 Parent(s): 9f9700b

Model save
.gitattributes CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zip filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
+ tokenizer.json filter=lfs diff=lfs merge=lfs -text
README.md CHANGED
@@ -1,396 +1,62 @@
- # Qwen 2.5-Coder 14B - SecureCode Edition
-
- <div align="center">
-
- [![License](https://img.shields.io/badge/license-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)
- [![Training Dataset](https://img.shields.io/badge/dataset-SecureCode%20v2.0-green.svg)](https://huggingface.co/datasets/scthornton/securecode-v2)
- [![Base Model](https://img.shields.io/badge/base-Qwen%202.5%20Coder%2014B-orange.svg)](https://huggingface.co/Qwen/Qwen2.5-Coder-14B-Instruct)
- [![perfecXion.ai](https://img.shields.io/badge/by-perfecXion.ai-purple.svg)](https://perfecxion.ai)
-
- **Enterprise-grade code security - powerful reasoning with production efficiency**
-
- [🤗 Model Card](https://huggingface.co/scthornton/qwen2.5-coder-14b-securecode) | [📊 Dataset](https://huggingface.co/datasets/scthornton/securecode-v2) | [💻 perfecXion.ai](https://perfecxion.ai)
-
- </div>
-
- ---
-
- ## 🎯 What is This?
-
- This is **Qwen 2.5-Coder 14B Instruct** fine-tuned on the **SecureCode v2.0 dataset** - the sweet spot between code intelligence and computational efficiency, now enhanced with production-grade security knowledge.
-
- Qwen 2.5-Coder 14B delivers exceptional code understanding from the same architecture that powers the best-in-class 7B model, scaled up for enterprise complexity. Combined with SecureCode training, this model delivers:
-
- ✅ **Advanced security reasoning** across complex codebases
- ✅ **Production-ready efficiency** - fits comfortably on single GPU
- ✅ **Enterprise-scale analysis** with 128K context window
- ✅ **Best-in-class code understanding** at the 14B parameter tier
-
- **The Result:** An enterprise-ready security expert that runs efficiently on standard hardware.
-
- **Why Qwen 2.5-Coder 14B?** This model offers the optimal balance:
- - 🎯 **Superior to smaller models** - More nuanced security analysis than 7B
- - ⚡ **More efficient than 32B+** - 2x faster training, lower deployment cost
- - 🌍 **92 programming languages** - Comprehensive language coverage
- - 📏 **128K context window** - Analyze entire applications at once
- - 🏢 **Enterprise deployable** - Runs on single A100 or 2x RTX 4090
-
- ---
-
- ## 🚨 The Problem This Solves
-
- **AI coding assistants produce vulnerable code in 45% of security-relevant scenarios** (Veracode 2025). While smaller models miss nuanced vulnerabilities and larger models demand excessive resources, the 14B tier delivers the security intelligence enterprises need with the efficiency they demand.
-
- **Real-world enterprise impact:**
- - Equifax breach: **$425 million** settlement + reputation damage
- - Capital One: **100 million** customer records, $80M fine
- - SolarWinds: **18,000** organizations compromised
-
- Qwen 2.5-Coder 14B SecureCode Edition brings advanced security analysis to enterprise-scale codebases without the infrastructure costs of 32B+ models.
-
- ---
-
- ## 💡 Key Features
-
- ### 🏆 Enterprise-Scale Code Intelligence
-
- **Qwen 2.5-Coder 14B** delivers exceptional performance:
- - HumanEval: **89.0%** pass@1 (surpasses many 30B+ models)
- - MBPP: **77.6%** pass@1
- - MultiPL-E: **82.1%** average across languages
- - Matches or exceeds 32B models on most benchmarks
-
- Now enhanced with **1,209 security-focused examples** covering OWASP Top 10:2025.
-
- ### 🔐 Advanced Security Pattern Recognition
-
- Trained on real-world security incidents:
- - **224 examples** of Broken Access Control vulnerabilities
- - **199 examples** of Authentication Failures
- - **125 examples** of Injection attacks (SQL, Command, XSS)
- - **115 examples** of Cryptographic Failures
- - Complete **OWASP Top 10:2025** coverage
-
- ### 🌍 Production-Ready Multi-Language Support
-
- Fine-tuned on security examples across:
- - Python (Django, Flask, FastAPI)
- - JavaScript/TypeScript (Express, NestJS, React)
- - Java (Spring Boot)
- - Go (Gin framework)
- - PHP (Laravel, Symfony)
- - C# (ASP.NET Core)
- - Ruby (Rails)
- - Rust (Actix, Rocket)
- - **Plus 84 more languages from Qwen's base training**
-
- ### 📋 Sophisticated Security Analysis
-
- Every response includes:
- 1. **Multi-layered vulnerability analysis** with attack chain identification
- 2. **Defense-in-depth implementations** with enterprise patterns
- 3. **Concrete exploitation demonstrations** proving security flaws
- 4. **Operational guidance** including monitoring, logging, and SIEM integration
-
  ---
-
- ## 📊 Training Details
-
- | Parameter | Value |
- |-----------|-------|
- | **Base Model** | Qwen/Qwen2.5-Coder-14B-Instruct |
- | **Fine-tuning Method** | LoRA (Low-Rank Adaptation) |
- | **Training Dataset** | [SecureCode v2.0](https://huggingface.co/datasets/scthornton/securecode-v2) |
- | **Dataset Size** | 841 training examples |
- | **Training Epochs** | 3 |
- | **LoRA Rank (r)** | 16 |
- | **LoRA Alpha** | 32 |
- | **Learning Rate** | 2e-4 |
- | **Quantization** | 4-bit (bitsandbytes) |
- | **Trainable Parameters** | ~74M (0.53% of 14B total) |
- | **Total Parameters** | 14B |
- | **Context Window** | 128K tokens (inherited from base) |
- | **GPU Used** | NVIDIA A100 40GB |
- | **Training Time** | ~8 hours (estimated) |
-
- ### Training Methodology
-
- **LoRA (Low-Rank Adaptation)** preserves Qwen's exceptional code abilities:
- - Trains only 0.53% of model parameters
- - Maintains SOTA code generation quality
- - Adds security-specific knowledge without catastrophic forgetting
- - Enables deployment with minimal memory overhead
-
- **4-bit Quantization** enables efficient training while maintaining model quality.
-
- **Extended Context:** Qwen's 128K context window allows analyzing entire applications, making it ideal for enterprise security audits.
-
  ---

- ## 🚀 Usage
-
- ### Quick Start
-
- ```python
- from transformers import AutoModelForCausalLM, AutoTokenizer
- from peft import PeftModel
-
- # Load base model and tokenizer
- base_model = "Qwen/Qwen2.5-Coder-14B-Instruct"
- model = AutoModelForCausalLM.from_pretrained(
-     base_model,
-     device_map="auto",
-     torch_dtype="auto",
-     trust_remote_code=True
- )
- tokenizer = AutoTokenizer.from_pretrained(base_model, trust_remote_code=True)
-
- # Load SecureCode LoRA adapter
- model = PeftModel.from_pretrained(model, "scthornton/qwen2.5-coder-14b-securecode")
-
- # Analyze enterprise codebase for vulnerabilities
- prompt = """### User:
- Perform a comprehensive security audit of this microservices authentication system:
-
- ```python
- # auth-service/middleware.py
- async def verify_token(request):
-     token = request.headers.get('Authorization')
-     if not token:
-         return None
-
-     payload = jwt.decode(token, settings.SECRET_KEY, algorithms=['HS256'])
-     user = await User.get(id=payload['user_id'])
-     return user
-
- # payment-service/api.py
- @app.post('/transfer')
- async def transfer_funds(request):
-     user = await verify_token(request)
-     amount = request.json.get('amount')
-     recipient = request.json.get('recipient_id')
-
-     await process_transfer(user.id, recipient, amount)
-     return {'status': 'success'}
- ```
-
- ### Assistant:
- """
-
- inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
- outputs = model.generate(
-     **inputs,
-     max_new_tokens=3072,
-     temperature=0.3,  # Lower temperature for precise analysis
-     top_p=0.95,
-     do_sample=True
- )

- response = tokenizer.decode(outputs[0], skip_special_tokens=True)
- print(response)
- ```

- ### Enterprise Deployment (4-bit Quantization)

- ```python
- from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
- from peft import PeftModel

- # 4-bit quantization - runs on 24GB GPU
- bnb_config = BitsAndBytesConfig(
-     load_in_4bit=True,
-     bnb_4bit_use_double_quant=True,
-     bnb_4bit_quant_type="nf4",
-     bnb_4bit_compute_dtype="bfloat16"
- )

- base_model = AutoModelForCausalLM.from_pretrained(
-     "Qwen/Qwen2.5-Coder-14B-Instruct",
-     quantization_config=bnb_config,
-     device_map="auto",
-     trust_remote_code=True
- )

- model = PeftModel.from_pretrained(base_model, "scthornton/qwen2.5-coder-14b-securecode")
- tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-Coder-14B-Instruct", trust_remote_code=True)

- # Production-ready: Runs on RTX 4090, A5000, or A100
- ```

- ### Large-Scale Codebase Analysis

- ```python
- # Analyze multiple related files with 128K context
- files_to_review = {
-     "auth.py": open("backend/auth.py").read(),
-     "middleware.py": open("backend/middleware.py").read(),
-     "models.py": open("backend/models.py").read(),
- }

- combined_code = "\n\n".join([f"# {name}\n{code}" for name, code in files_to_review.items()])

- prompt = f"""### User:
- Perform a comprehensive security analysis of this authentication system. Identify:
- 1. All OWASP Top 10 vulnerabilities
- 2. Attack chains that combine multiple vulnerabilities
- 3. Race conditions and timing attacks
- 4. Authorization bypass opportunities

- ```python
- {combined_code}
- ```
-
- ### Assistant:
- """
-
- inputs = tokenizer(prompt, return_tensors="pt", truncation=True, max_length=65536).to(model.device)
- outputs = model.generate(**inputs, max_new_tokens=4096, temperature=0.3)
- analysis = tokenizer.decode(outputs[0], skip_special_tokens=True)
- print(analysis)
- ```
-
- ---
-
- ## 🎯 Use Cases
-
- ### 1. **Enterprise Security Architecture Review**
- Analyze complex multi-service architectures:
- ```
- Review this microservices platform for security vulnerabilities, focusing on authentication flows, service-to-service authorization, and data validation boundaries
- ```
-
- ### 2. **Large Codebase Vulnerability Scanning**
- With 128K context, analyze entire modules:
- ```
- Audit this 10,000-line payment processing system for injection attacks, authorization bypasses, and cryptographic failures
- ```
-
- ### 3. **Advanced Attack Chain Analysis**
- Identify sophisticated multi-step attacks:
- ```
- Analyze how an attacker could chain CSRF, XSS, and session fixation to achieve account takeover in this web application
- ```
-
- ### 4. **Production Security Hardening**
- Get operational security recommendations:
- ```
- Design a defense-in-depth security architecture for this e-commerce platform handling 1M+ transactions/day
- ```
-
- ### 5. **Compliance-Focused Code Generation**
- Generate SOC 2, PCI-DSS, HIPAA-compliant code:
- ```
- Create a HIPAA-compliant patient data API with comprehensive audit logging, encryption at rest and in transit, and role-based access control
- ```
-
- ---
-
- ## ⚠️ Limitations
-
- ### What This Model Does Well
- ✅ Complex security reasoning across large codebases
- ✅ Multi-file analysis with 128K context window
- ✅ Advanced attack chain identification
- ✅ Enterprise-scale architecture security review
- ✅ Detailed operational guidance
-
- ### What This Model Doesn't Do
- ❌ **Not a security scanner** - Use tools like Semgrep, CodeQL, or Snyk
- ❌ **Not a penetration testing tool** - Cannot perform active exploitation
- ❌ **Not legal/compliance advice** - Consult security professionals
- ❌ **Not a replacement for security experts** - Critical systems need professional review
-
- ### Known Characteristics
- - Detailed analysis may generate verbose responses (trained on comprehensive security explanations)
- - Optimized for common vulnerability patterns (OWASP Top 10) vs novel 0-days
- - Best performance on code within OWASP taxonomy
-
- ---
-
- ## 📈 Performance Benchmarks
-
- ### Hardware Requirements
-
- **Minimum:**
- - 28GB RAM
- - 20GB GPU VRAM (with 4-bit quantization)
-
- **Recommended:**
- - 48GB RAM
- - 24GB+ GPU (RTX 4090, A5000, A100)
-
- **Inference Speed (on A100 40GB):**
- - ~55 tokens/second (4-bit quantization)
- - ~75 tokens/second (bfloat16)
-
- ### Code Generation Benchmarks (Base Qwen 2.5-Coder)
-
- | Benchmark | Score | Rank |
- |-----------|-------|------|
- | HumanEval | 89.0% | #1 in 14B class |
- | MBPP | 77.6% | Top tier |
- | LiveCodeBench | 38.4% | Top 5 overall |
- | MultiPL-E | 82.1% | Best multi-language |
-
- **Performance:** Matches or exceeds many 32B+ models while requiring half the compute.
-
- ---
-
- ## 🔬 Dataset Information
-
- Trained on **[SecureCode v2.0](https://huggingface.co/datasets/scthornton/securecode-v2)**:
- - **1,209 examples** with real CVE grounding
- - **100% incident validation**
- - **OWASP Top 10:2025** complete coverage
- - **Expert security review**
-
- ---
-
- ## 📄 License
-
- **Model:** Apache 2.0 | **Dataset:** CC BY-NC-SA 4.0
-
- ---
-
- ## 📚 Citation
-
- ```bibtex
- @misc{thornton2025securecode-qwen14b,
-   title={Qwen 2.5-Coder 14B - SecureCode Edition},
-   author={Thornton, Scott},
-   year={2025},
-   publisher={perfecXion.ai},
-   url={https://huggingface.co/scthornton/qwen2.5-coder-14b-securecode}
- }
- ```
-
- ---
-
- ## 🙏 Acknowledgments
-
- - **Alibaba Cloud & Qwen Team** for the exceptional Qwen 2.5-Coder base model
- - **OWASP Foundation** for vulnerability taxonomy
- - **MITRE** for CVE database
- - **Enterprise security community** for real-world validation
-
- ---
-
- ## 🔗 Related Models
-
- - **[llama-3.2-3b-securecode](https://huggingface.co/scthornton/llama-3.2-3b-securecode)** - Most accessible (3B)
- - **[qwen-coder-7b-securecode](https://huggingface.co/scthornton/qwen-coder-7b-securecode)** - Smaller Qwen variant (7B)
- - **[deepseek-coder-6.7b-securecode](https://huggingface.co/scthornton/deepseek-coder-6.7b-securecode)** - Security-optimized (6.7B)
- - **[codellama-13b-securecode](https://huggingface.co/scthornton/codellama-13b-securecode)** - Enterprise trusted (13B)
- - **[starcoder2-15b-securecode](https://huggingface.co/scthornton/starcoder2-15b-securecode)** - Multi-language (15B)
-
- [View Collection](https://huggingface.co/collections/scthornton/securecode)
-
- ---

- <div align="center">

- **Built with ❤️ for secure enterprise software development**

- [perfecXion.ai](https://perfecxion.ai) | [Contact](mailto:scott@perfecxion.ai)

- </div>
  ---
+ library_name: peft
+ license: apache-2.0
+ base_model: Qwen/Qwen2.5-Coder-14B-Instruct
+ tags:
+ - base_model:adapter:Qwen/Qwen2.5-Coder-14B-Instruct
+ - lora
+ - transformers
+ datasets:
+ - securecode-v2
+ pipeline_tag: text-generation
+ model-index:
+ - name: qwen2.5-coder-14b-securecode
+   results: []
  ---

+ <!-- This model card has been generated automatically according to the information the Trainer had access to. You
+ should probably proofread and complete it, then remove this comment. -->
+
+ # qwen2.5-coder-14b-securecode
+
+ This model is a fine-tuned version of [Qwen/Qwen2.5-Coder-14B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-14B-Instruct) on the securecode-v2 dataset.
+
+ ## Model description
+
+ More information needed
+
+ ## Intended uses & limitations
+
+ More information needed
+
+ ## Training and evaluation data
+
+ More information needed
+
+ ## Training procedure
+
+ ### Training hyperparameters
+
+ The following hyperparameters were used during training:
+ - learning_rate: 0.0002
+ - train_batch_size: 1
+ - eval_batch_size: 8
+ - seed: 42
+ - gradient_accumulation_steps: 16
+ - total_train_batch_size: 16
+ - optimizer: paged_adamw_8bit (betas=(0.9, 0.999), epsilon=1e-08, no additional optimizer arguments)
+ - lr_scheduler_type: cosine
+ - lr_scheduler_warmup_steps: 100
+ - num_epochs: 3
+
+ ### Training results
+
+ ### Framework versions
+
+ - PEFT 0.18.1
+ - Transformers 4.57.6
+ - Pytorch 2.7.1+cu128
+ - Datasets 2.16.0
+ - Tokenizers 0.22.2
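The hyperparameters above imply a fairly short optimization run: with a per-device batch size of 1 and 16 gradient-accumulation steps, and assuming the 841-example training set listed in the previous version of this card, the whole run is only a few hundred optimizer steps. A quick back-of-the-envelope check:

```python
import math

# Assumed from the previous card: 841 training examples.
num_examples = 841
total_train_batch_size = 1 * 16   # train_batch_size * gradient_accumulation_steps
num_epochs = 3

steps_per_epoch = math.ceil(num_examples / total_train_batch_size)
total_steps = steps_per_epoch * num_epochs
print(steps_per_epoch, total_steps)  # 53 159
```

Under that assumption, the 100 warmup steps cover most of the 159-step run, so the cosine decay phase is comparatively short.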
added_tokens.json ADDED
@@ -0,0 +1,24 @@
+ {
+   "</tool_call>": 151658,
+   "<tool_call>": 151657,
+   "<|box_end|>": 151649,
+   "<|box_start|>": 151648,
+   "<|endoftext|>": 151643,
+   "<|file_sep|>": 151664,
+   "<|fim_middle|>": 151660,
+   "<|fim_pad|>": 151662,
+   "<|fim_prefix|>": 151659,
+   "<|fim_suffix|>": 151661,
+   "<|im_end|>": 151645,
+   "<|im_start|>": 151644,
+   "<|image_pad|>": 151655,
+   "<|object_ref_end|>": 151647,
+   "<|object_ref_start|>": 151646,
+   "<|quad_end|>": 151651,
+   "<|quad_start|>": 151650,
+   "<|repo_name|>": 151663,
+   "<|video_pad|>": 151656,
+   "<|vision_end|>": 151653,
+   "<|vision_pad|>": 151654,
+   "<|vision_start|>": 151652
+ }
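The 22 added tokens in this file occupy a contiguous ID range, 151643 through 151664, sitting directly at the end of the base vocabulary. A small self-contained check of that property:

```python
# IDs copied from added_tokens.json above; verify they form one
# contiguous block with no gaps or duplicates.
added = {
    "<|endoftext|>": 151643, "<|im_start|>": 151644, "<|im_end|>": 151645,
    "<|object_ref_start|>": 151646, "<|object_ref_end|>": 151647,
    "<|box_start|>": 151648, "<|box_end|>": 151649,
    "<|quad_start|>": 151650, "<|quad_end|>": 151651,
    "<|vision_start|>": 151652, "<|vision_end|>": 151653,
    "<|vision_pad|>": 151654, "<|image_pad|>": 151655, "<|video_pad|>": 151656,
    "<tool_call>": 151657, "</tool_call>": 151658,
    "<|fim_prefix|>": 151659, "<|fim_middle|>": 151660, "<|fim_suffix|>": 151661,
    "<|fim_pad|>": 151662, "<|repo_name|>": 151663, "<|file_sep|>": 151664,
}

assert sorted(added.values()) == list(range(151643, 151665))
print(len(added))  # 22
```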
chat_template.jinja ADDED
@@ -0,0 +1,54 @@
+ {%- if tools %}
+ {{- '<|im_start|>system\n' }}
+ {%- if messages[0]['role'] == 'system' %}
+ {{- messages[0]['content'] }}
+ {%- else %}
+ {{- 'You are Qwen, created by Alibaba Cloud. You are a helpful assistant.' }}
+ {%- endif %}
+ {{- "\n\n# Tools\n\nYou may call one or more functions to assist with the user query.\n\nYou are provided with function signatures within <tools></tools> XML tags:\n<tools>" }}
+ {%- for tool in tools %}
+ {{- "\n" }}
+ {{- tool | tojson }}
+ {%- endfor %}
+ {{- "\n</tools>\n\nFor each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:\n<tool_call>\n{\"name\": <function-name>, \"arguments\": <args-json-object>}\n</tool_call><|im_end|>\n" }}
+ {%- else %}
+ {%- if messages[0]['role'] == 'system' %}
+ {{- '<|im_start|>system\n' + messages[0]['content'] + '<|im_end|>\n' }}
+ {%- else %}
+ {{- '<|im_start|>system\nYou are Qwen, created by Alibaba Cloud. You are a helpful assistant.<|im_end|>\n' }}
+ {%- endif %}
+ {%- endif %}
+ {%- for message in messages %}
+ {%- if (message.role == "user") or (message.role == "system" and not loop.first) or (message.role == "assistant" and not message.tool_calls) %}
+ {{- '<|im_start|>' + message.role + '\n' + message.content + '<|im_end|>' + '\n' }}
+ {%- elif message.role == "assistant" %}
+ {{- '<|im_start|>' + message.role }}
+ {%- if message.content %}
+ {{- '\n' + message.content }}
+ {%- endif %}
+ {%- for tool_call in message.tool_calls %}
+ {%- if tool_call.function is defined %}
+ {%- set tool_call = tool_call.function %}
+ {%- endif %}
+ {{- '\n<tool_call>\n{"name": "' }}
+ {{- tool_call.name }}
+ {{- '", "arguments": ' }}
+ {{- tool_call.arguments | tojson }}
+ {{- '}\n</tool_call>' }}
+ {%- endfor %}
+ {{- '<|im_end|>\n' }}
+ {%- elif message.role == "tool" %}
+ {%- if (loop.index0 == 0) or (messages[loop.index0 - 1].role != "tool") %}
+ {{- '<|im_start|>user' }}
+ {%- endif %}
+ {{- '\n<tool_response>\n' }}
+ {{- message.content }}
+ {{- '\n</tool_response>' }}
+ {%- if loop.last or (messages[loop.index0 + 1].role != "tool") %}
+ {{- '<|im_end|>\n' }}
+ {%- endif %}
+ {%- endif %}
+ {%- endfor %}
+ {%- if add_generation_prompt %}
+ {{- '<|im_start|>assistant\n' }}
+ {%- endif %}
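In its simplest path (no tools, no tool calls), the template above is plain ChatML: it injects a default Qwen system prompt when no system message is given, wraps each turn in `<|im_start|>role` / `<|im_end|>` markers, and optionally appends the assistant header as a generation prompt. A minimal hand-rolled Python sketch of that path (an illustration, not the Jinja engine that actually renders the template):

```python
DEFAULT_SYSTEM = "You are Qwen, created by Alibaba Cloud. You are a helpful assistant."

def render_chatml(messages, add_generation_prompt=True):
    """Mirror the no-tools branch of the chat template: system header,
    then one ChatML block per user/assistant turn."""
    if messages and messages[0]["role"] == "system":
        out = f"<|im_start|>system\n{messages[0]['content']}<|im_end|>\n"
        rest = messages[1:]
    else:
        out = f"<|im_start|>system\n{DEFAULT_SYSTEM}<|im_end|>\n"
        rest = messages
    for m in rest:
        out += f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
    if add_generation_prompt:
        out += "<|im_start|>assistant\n"  # model continues from here
    return out

print(render_chatml([{"role": "user", "content": "Hi"}]))
```

The real template additionally serializes tool schemas into the system block and wraps tool responses in `<tool_response>` tags inside user turns, as the Jinja source above shows.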
merges.txt ADDED
The diff for this file is too large to render. See raw diff
 
special_tokens_map.json ADDED
@@ -0,0 +1,25 @@
+ {
+   "additional_special_tokens": [
+     "<|im_start|>",
+     "<|im_end|>",
+     "<|object_ref_start|>",
+     "<|object_ref_end|>",
+     "<|box_start|>",
+     "<|box_end|>",
+     "<|quad_start|>",
+     "<|quad_end|>",
+     "<|vision_start|>",
+     "<|vision_end|>",
+     "<|vision_pad|>",
+     "<|image_pad|>",
+     "<|video_pad|>"
+   ],
+   "eos_token": {
+     "content": "<|im_end|>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "pad_token": "<|im_end|>"
+ }
tokenizer.json ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:7ef098fb53e76cfa06a012b3826f5889a5ab693afa875d97c3353ae1edb9a1dc
+ size 11422173
tokenizer_config.json ADDED
@@ -0,0 +1,207 @@
+ {
+   "add_bos_token": false,
+   "add_prefix_space": false,
+   "added_tokens_decoder": {
+     "151643": {
+       "content": "<|endoftext|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "151644": {
+       "content": "<|im_start|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "151645": {
+       "content": "<|im_end|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "151646": {
+       "content": "<|object_ref_start|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "151647": {
+       "content": "<|object_ref_end|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "151648": {
+       "content": "<|box_start|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "151649": {
+       "content": "<|box_end|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "151650": {
+       "content": "<|quad_start|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "151651": {
+       "content": "<|quad_end|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "151652": {
+       "content": "<|vision_start|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "151653": {
+       "content": "<|vision_end|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "151654": {
+       "content": "<|vision_pad|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "151655": {
+       "content": "<|image_pad|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "151656": {
+       "content": "<|video_pad|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "151657": {
+       "content": "<tool_call>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "151658": {
+       "content": "</tool_call>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "151659": {
+       "content": "<|fim_prefix|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "151660": {
+       "content": "<|fim_middle|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "151661": {
+       "content": "<|fim_suffix|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "151662": {
+       "content": "<|fim_pad|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "151663": {
+       "content": "<|repo_name|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "151664": {
+       "content": "<|file_sep|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     }
+   },
+   "additional_special_tokens": [
+     "<|im_start|>",
+     "<|im_end|>",
+     "<|object_ref_start|>",
+     "<|object_ref_end|>",
+     "<|box_start|>",
+     "<|box_end|>",
+     "<|quad_start|>",
+     "<|quad_end|>",
+     "<|vision_start|>",
+     "<|vision_end|>",
+     "<|vision_pad|>",
+     "<|image_pad|>",
+     "<|video_pad|>"
+   ],
+   "bos_token": null,
+   "clean_up_tokenization_spaces": false,
+   "eos_token": "<|im_end|>",
+   "errors": "replace",
+   "extra_special_tokens": {},
+   "model_max_length": 32768,
+   "pad_token": "<|im_end|>",
+   "split_special_tokens": false,
+   "tokenizer_class": "Qwen2Tokenizer",
+   "unk_token": null
+ }
vocab.json ADDED
The diff for this file is too large to render. See raw diff