scthornton committed on
Commit 7a40f21 · verified · 1 Parent(s): 0abce6e

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +131 -676
README.md CHANGED
@@ -2,751 +2,206 @@
2
  license: apache-2.0
3
  base_model: ibm-granite/granite-20b-code-instruct-8k
4
  tags:
5
- - code
6
- - security
7
- - granite
8
- - ibm
9
- - securecode
10
- - owasp
11
- - vulnerability-detection
12
  datasets:
13
- - scthornton/securecode-v2
14
- language:
15
- - en
16
- library_name: transformers
17
  pipeline_tag: text-generation
18
- arxiv: 2512.18542
19
  ---
20
 
21
- # IBM Granite 20B Code - SecureCode Edition
22
 
23
  <div align="center">
24
 
25
- [![License](https://img.shields.io/badge/license-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)
26
- [![Training Dataset](https://img.shields.io/badge/dataset-SecureCode%20v2.0-green.svg)](https://huggingface.co/datasets/scthornton/securecode-v2)
27
- [![Base Model](https://img.shields.io/badge/base-Granite%2020B%20Code-orange.svg)](https://huggingface.co/ibm-granite/granite-20b-code-instruct-8k)
28
- [![perfecXion.ai](https://img.shields.io/badge/by-perfecXion.ai-purple.svg)](https://perfecxion.ai)
29
-
30
- **🏢 Enterprise-scale security intelligence with IBM trust**
31
 
32
- The most powerful model in the SecureCode collection. When you need maximum code understanding, complex reasoning, and IBM's enterprise-grade reliability.
33
 
34
- [📄 Paper](https://arxiv.org/abs/2512.18542) | [🤗 Model Hub](https://huggingface.co/scthornton/granite-20b-code-securecode) | [📊 Dataset](https://huggingface.co/datasets/scthornton/securecode-v2) | [💻 perfecXion.ai](https://perfecxion.ai) | [📚 Collection](https://huggingface.co/collections/scthornton/securecode)
35
 
36
  </div>
37
 
38
  ---
39
 
40
- ## 🎯 Quick Decision Guide
41
-
42
- **Choose This Model If:**
43
- - ✅ You need **maximum code understanding** and security reasoning capability
44
- - ✅ You're analyzing **complex enterprise architectures** with intricate attack surfaces
45
- - ✅ You require **IBM enterprise trust** and brand recognition
46
- - ✅ You have **datacenter infrastructure** (48GB+ GPU)
47
- - ✅ You're conducting **professional security audits** requiring comprehensive analysis
48
- - ✅ You need the **most sophisticated** security intelligence in the collection
49
-
50
- **Consider Smaller Models If:**
51
- - ⚠️ You're on consumer hardware (→ Llama 3B, Qwen 7B)
52
- - ⚠️ You prioritize inference speed over depth (→ Qwen 7B/14B)
53
- - ⚠️ You're building IDE tools needing fast response (→ Llama 3B, DeepSeek 6.7B)
54
- - ⚠️ Budget is primary concern (→ any 7B/13B model)
55
-
56
- ---
57
-
58
- ## 📊 Collection Positioning
59
-
60
- | Model | Size | Best For | Hardware | Inference Speed | Unique Strength |
61
- |-------|------|----------|----------|-----------------|-----------------|
62
- | Llama 3.2 3B | 3B | Consumer deployment | 8GB RAM | ⚡⚡⚡ Fastest | Most accessible |
63
- | DeepSeek 6.7B | 6.7B | Security-optimized baseline | 16GB RAM | ⚡⚡ Fast | Security architecture |
64
- | Qwen 7B | 7B | Best code understanding | 16GB RAM | ⚡⚡ Fast | Best-in-class 7B |
65
- | CodeGemma 7B | 7B | Google ecosystem | 16GB RAM | ⚡⚡ Fast | Instruction following |
66
- | CodeLlama 13B | 13B | Enterprise trust | 24GB RAM | ⚡ Medium | Meta brand, proven |
67
- | Qwen 14B | 14B | Advanced analysis | 32GB RAM | ⚡ Medium | 128K context window |
68
- | StarCoder2 15B | 15B | Multi-language specialist | 32GB RAM | ⚡ Medium | 600+ languages |
69
- | **Granite 20B** | **20B** | **Enterprise-scale** | **48GB RAM** | **Medium** | **IBM trust, largest, most capable** |
70
 
71
- **This Model's Position:** The flagship. Maximum security intelligence, enterprise-grade reliability, IBM brand trust. For when quality matters more than speed.
72
 
73
- ---
74
-
75
- ## 🚨 The Problem This Solves
 
76
 
77
- **Critical enterprise security gaps require sophisticated analysis.** When a breach costs **$4.45 million on average** (IBM Cost of a Data Breach Report 2023) and 45% of AI-generated code contains vulnerabilities, enterprises need the most capable security analysis available.
78
 
79
- **Real-world enterprise impact:**
80
- - **Equifax** (unpatched Apache Struts vulnerability, CVE-2017-5638): $425 million settlement + 13-year brand recovery
81
- - **Capital One** (SSRF): 100 million customer records, $80M fine, 2 years of remediation
82
- - **SolarWinds** (supply chain): 18,000 organizations compromised, $18M settlement
83
- - **LastPass** (cryptographic failures): 30M users affected, company reputation destroyed
84
 
85
- **IBM Granite 20B SecureCode Edition** provides the deepest security analysis available in the open-source ecosystem, backed by IBM's enterprise heritage and trust.
86
 
87
- ---
88
 
89
- ## 💡 What is This?
90
-
91
- This is **IBM Granite 20B Code Instruct** fine-tuned on the **SecureCode v2.0 dataset** - IBM's enterprise-grade code model enhanced with production-grade security expertise covering the complete OWASP Top 10:2025.
92
-
93
- IBM Granite models are built on IBM's 40+ years of enterprise software experience, trained on **3.5+ trillion tokens** of code and technical data, with a focus on enterprise deployment reliability.
94
-
95
- Combined with SecureCode training, this model delivers:
96
-
97
- ✅ **Maximum security intelligence** - 20B parameters for deep, nuanced analysis
98
- ✅ **Enterprise-grade reliability** - IBM's proven track record and support ecosystem
99
- ✅ **Comprehensive vulnerability detection** across complex architectures
100
- ✅ **Production-ready trust** - Permissive Apache 2.0 license
101
- ✅ **Advanced reasoning** - Handles multi-layered attack chain analysis
102
-
103
- **The Result:** The most capable security-aware code model in the open-source ecosystem.
104
-
105
- **Why IBM Granite 20B?** This model is the enterprise choice:
106
- - 🏢 **IBM enterprise heritage** - 40+ years of enterprise software leadership
107
- - 🔐 **Largest in collection** - 20B parameters = maximum reasoning capability
108
- - 📋 **Enterprise compliance ready** - Designed for regulated industries
109
- - ⚖️ **Apache 2.0 licensed** - Full commercial freedom
110
- - 🎯 **Security-first training** - Built for mission-critical applications
111
- - 🌍 **Broad language support** - 116+ programming languages
112
-
113
- Perfect for Fortune 500 companies, financial services, healthcare, government, and any organization where security analysis quality is paramount.
114
-
115
- ---
116
-
117
- ## 🔐 Security Training Coverage
118
-
119
- ### Real-World Vulnerability Distribution
120
-
121
- Trained on 1,209 security examples with real CVE grounding:
122
-
123
- | OWASP Category | Examples | Real Incidents |
124
- |----------------|----------|----------------|
125
- | **Broken Access Control** | 224 | Equifax, Facebook, Uber |
126
- | **Authentication Failures** | 199 | SolarWinds, Okta, LastPass |
127
- | **Injection Attacks** | 125 | Capital One, Yahoo, LinkedIn |
128
- | **Cryptographic Failures** | 115 | LastPass, Adobe, Dropbox |
129
- | **Security Misconfiguration** | 98 | Tesla, MongoDB, Elasticsearch |
130
- | **Vulnerable Components** | 87 | Log4Shell, Heartbleed, Struts |
131
- | **Identification/Auth Failures** | 84 | Twitter, GitHub, Reddit |
132
- | **Software/Data Integrity** | 78 | SolarWinds, Codecov, npm |
133
- | **Logging Failures** | 71 | Various incident responses |
134
- | **SSRF** | 69 | Capital One, Shopify |
135
- | **Insecure Design** | 59 | Architectural flaws |
136
-
137
- ### Enterprise-Grade Multi-Language Support
138
-
139
- Fine-tuned on security examples across:
140
- - **Python** (Django, Flask, FastAPI) - 280 examples
141
- - **JavaScript/TypeScript** (Express, NestJS, React) - 245 examples
142
- - **Java** (Spring Boot, Jakarta EE) - 178 examples
143
- - **Go** (Gin, Echo, standard library) - 145 examples
144
- - **PHP** (Laravel, Symfony) - 112 examples
145
- - **C#** (ASP.NET Core, .NET 6+) - 89 examples
146
- - **Ruby** (Rails, Sinatra) - 67 examples
147
- - **Rust** (Actix, Rocket, Axum) - 45 examples
148
- - **C/C++** (Memory safety patterns) - 28 examples
149
- - **Plus 107+ additional languages from Granite's base training**
150
-
151
- ---
152
-
153
- ## 🎯 Deployment Scenarios
154
-
155
- ### Scenario 1: Enterprise Security Audit Platform
156
-
157
- **Professional security assessments for Fortune 500 clients.**
158
-
159
- **Hardware:** Datacenter GPU (A100 80GB or 2x A100 40GB)
160
- **Throughput:** 10-15 comprehensive audits/day
161
- **Use Case:** Professional security consulting
162
-
163
- **Value Proposition:**
164
- - Identify vulnerabilities human auditors miss
165
- - Consistent, comprehensive OWASP coverage
166
- - Scales expert security knowledge
167
- - Reduces audit time by 60-70%
168
-
169
- **ROI:** A single prevented breach pays for years of infrastructure. Typical large enterprise security audit costs $150K-500K. This model can handle preliminary analysis, allowing human experts to focus on novel vulnerabilities and strategic recommendations.
170
-
171
- ---
172
-
173
- ### Scenario 2: Financial Services Security Platform
174
-
175
- **Regulatory compliance and security for banking applications.**
176
-
177
- **Hardware:** Private cloud A100 cluster
178
- **Compliance:** SOC 2, PCI-DSS, GDPR, CCPA
179
- **Use Case:** Pre-deployment security validation
180
-
181
- **Regulatory Benefits:**
182
- - Automated OWASP Top 10 verification
183
- - Audit trail generation
184
- - Compliance report automation
185
- - Reduces regulatory risk
186
-
187
- **ROI:** Regulatory fines cost millions. **Capital One:** $80M fine. **Equifax:** $425M settlement. Preventing one major breach justifies entire deployment.
188
-
189
- ---
190
-
191
- ### Scenario 3: Healthcare Application Security
192
-
193
- **HIPAA-compliant code review for medical systems.**
194
-
195
- **Hardware:** Secure private deployment
196
- **Compliance:** HIPAA, HITECH, FDA software validation
197
- **Use Case:** Medical device and EHR security
198
-
199
- **Critical Healthcare Requirements:**
200
- - Patient data protection (HIPAA)
201
- - Audit logging and compliance
202
- - Cryptographic requirements
203
- - Access control verification
204
-
205
- **Impact:** Healthcare breaches average **$10.93 million per incident** (IBM Cost of a Data Breach Report 2023). A single prevented breach pays for a multi-year deployment.
206
-
207
- ---
208
-
209
- ### Scenario 4: Government & Defense Applications
210
-
211
- **Security analysis for critical infrastructure.**
212
-
213
- **Hardware:** Air-gapped secure environment
214
- **Clearance:** Can be deployed in classified environments
215
- **Use Case:** Critical infrastructure security
216
-
217
- **Government Benefits:**
218
- - No external dependencies (fully local)
219
- - Apache 2.0 license (government-friendly)
220
- - IBM enterprise support available
221
- - Meets government security standards
222
-
223
- ---
224
-
225
- ## 📊 Training Details
226
-
227
- | Parameter | Value | Why This Matters |
228
- |-----------|-------|------------------|
229
- | **Base Model** | ibm-granite/granite-20b-code-instruct-8k | IBM's enterprise-grade foundation |
230
- | **Fine-tuning Method** | LoRA (Low-Rank Adaptation) | Efficient training, preserves base capabilities |
231
- | **Training Dataset** | [SecureCode v2.0](https://huggingface.co/datasets/scthornton/securecode-v2) | 100% incident-grounded, expert-validated |
232
- | **Dataset Size** | 841 training examples | Focused on quality over quantity |
233
- | **Training Epochs** | 3 | Optimal convergence without overfitting |
234
- | **LoRA Rank (r)** | 16 | Balanced parameter efficiency |
235
- | **LoRA Alpha** | 32 | Learning rate scaling factor |
236
- | **Learning Rate** | 2e-4 | Standard for LoRA fine-tuning |
237
- | **Quantization** | 4-bit (bitsandbytes) | Enables efficient training |
238
- | **Trainable Parameters** | ~105M (0.525% of 20B total) | Minimal parameters, maximum impact |
239
- | **Total Parameters** | 20B | Maximum reasoning capability |
240
- | **Context Window** | 8K tokens | Enterprise file analysis |
241
- | **GPU Used** | NVIDIA A100 40GB | Enterprise training infrastructure |
242
- | **Training Time** | ~12-14 hours (estimated) | Deep security learning |
243
-
244
- ### Training Methodology
245
-
246
- **LoRA (Low-Rank Adaptation)** was chosen for enterprise reliability:
247
- 1. **Efficiency:** Trains only 0.525% of model parameters (105M vs 20B)
248
- 2. **Quality:** Preserves IBM Granite's enterprise capabilities
249
- 3. **Deployability:** Can be deployed alongside base model for versioning
250
-
251
- **4-bit Quantization** enables efficient training while maintaining enterprise-grade quality.
252
-
253
- **IBM Granite Foundation:** Built on IBM's 40+ years of enterprise software experience, optimized for:
254
- - Reliability and consistency
255
- - Enterprise deployment patterns
256
- - Regulatory compliance requirements
257
- - Long-term support and stability
258
-
259
- ---
260
-
261
- ## 🚀 Usage
262
-
263
- ### Quick Start
264
 
265
  ```python
266
- from transformers import AutoModelForCausalLM, AutoTokenizer
267
  from peft import PeftModel
268
-
269
- # Load IBM Granite base model
270
- base_model = "ibm-granite/granite-20b-code-instruct-8k"
271
- model = AutoModelForCausalLM.from_pretrained(
272
-     base_model,
273
-     device_map="auto",
274
-     torch_dtype="auto",
275
-     trust_remote_code=True
276
- )
277
- tokenizer = AutoTokenizer.from_pretrained(base_model, trust_remote_code=True)
278
-
279
- # Load SecureCode LoRA adapter
280
- model = PeftModel.from_pretrained(model, "scthornton/granite-20b-code-securecode")
281
-
282
- # Enterprise security analysis
283
- prompt = """### User:
284
- Conduct a comprehensive security audit of this enterprise authentication system. Analyze for:
285
- 1. OWASP Top 10 vulnerabilities
286
- 2. Attack chain opportunities
287
- 3. Compliance gaps (SOC 2, PCI-DSS)
288
- 4. Architectural weaknesses
289
-
290
- ```python
291
- # Enterprise SSO Implementation
292
- class EnterpriseAuthService:
293
-     def __init__(self):
294
-         self.secret = os.getenv('JWT_SECRET')
295
-         self.db = DatabasePool()
296
-
297
-     async def authenticate(self, credentials):
298
-         user = await self.db.query(
299
-             f"SELECT * FROM users WHERE email='{credentials.email}' AND password='{credentials.password}'"
300
-         )
301
-         if user:
302
-             token = jwt.encode({'user_id': user.id}, self.secret)
303
-             return {'token': token, 'success': True}
304
-         return {'success': False}
305
-
306
-     async def verify_token(self, token):
307
-         try:
308
-             payload = jwt.decode(token, self.secret, algorithms=['HS256'])
309
-             return payload
310
-         except:
311
-             return None
312
- ```
313
-
314
- ### Assistant:
315
- """
316
-
317
- inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
318
- outputs = model.generate(
319
-     **inputs,
320
-     max_new_tokens=4096,
321
-     temperature=0.2,  # Lower temperature for precise enterprise analysis
322
-     top_p=0.95,
323
-     do_sample=True
324
- )
325
-
326
- response = tokenizer.decode(outputs[0], skip_special_tokens=True)
327
- print(response)
328
- ```
329
-
330
- ---
331
-
332
- ### Enterprise Deployment (4-bit Quantization)
333
-
334
- ```python
335
  from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
336
- from peft import PeftModel
337
 
338
- # 4-bit quantization - runs on 40GB GPU
339
  bnb_config = BitsAndBytesConfig(
340
      load_in_4bit=True,
341
-     bnb_4bit_use_double_quant=True,
342
      bnb_4bit_quant_type="nf4",
343
-     bnb_4bit_compute_dtype="bfloat16"
344
  )
345
 
346
- model = AutoModelForCausalLM.from_pretrained(
347
      "ibm-granite/granite-20b-code-instruct-8k",
348
      quantization_config=bnb_config,
349
      device_map="auto",
350
-     trust_remote_code=True
351
- )
352
-
353
- model = PeftModel.from_pretrained(model, "scthornton/granite-20b-code-securecode")
354
- tokenizer = AutoTokenizer.from_pretrained("ibm-granite/granite-20b-code-instruct-8k", trust_remote_code=True)
355
-
356
- # Enterprise-ready: Runs on A100 40GB, A100 80GB, or 2x RTX 4090
357
- ```
358
-
359
- ---
360
-
361
- ### Multi-GPU Deployment (Maximum Performance)
362
-
363
- ```python
364
- from transformers import AutoModelForCausalLM, AutoTokenizer
365
- from peft import PeftModel
366
- import torch
367
-
368
- # Load across multiple GPUs for maximum throughput
369
- model = AutoModelForCausalLM.from_pretrained(
370
-     "ibm-granite/granite-20b-code-instruct-8k",
371
-     device_map="balanced",  # Distribute across available GPUs
372
-     torch_dtype=torch.bfloat16,
373
-     trust_remote_code=True
374
  )
375
 
376
- model = PeftModel.from_pretrained(model, "scthornton/granite-20b-code-securecode")
377
- tokenizer = AutoTokenizer.from_pretrained("ibm-granite/granite-20b-code-instruct-8k", trust_remote_code=True)
378
 
379
- # Optimal for: 2x A100, 4x RTX 4090, or enterprise GPU clusters
380
- # Throughput: 2-3x faster than single GPU
381
  ```
382
 
383
- ---
384
-
385
- ## 📈 Performance & Benchmarks
386
 
387
- ### Hardware Requirements
388
 
389
- | Deployment | RAM | GPU VRAM | Tokens/Second | Latency (4K response) | Cost/Month |
390
- |-----------|-----|----------|---------------|----------------------|------------|
391
- | **4-bit Quantized** | 40GB | 32GB | ~35 tok/s | ~115 seconds | $0 (on-prem) or $800-1200 (cloud) |
392
- | **8-bit Quantized** | 64GB | 48GB | ~45 tok/s | ~90 seconds | $0 (on-prem) or $1200-1800 (cloud) |
393
- | **Full Precision (bf16)** | 96GB | 80GB | ~60 tok/s | ~67 seconds | $0 (on-prem) or $2000-3000 (cloud) |
394
- | **Multi-GPU (2x A100)** | 128GB | 160GB | ~120 tok/s | ~33 seconds | Enterprise only |
395
 
396
- ### Real-World Performance
397
 
398
- **Tested on A100 40GB** (enterprise GPU):
399
- - **Tokens/second:** ~35 tok/s (4-bit), ~55 tok/s (full precision)
400
- - **Cold start:** ~8 seconds
401
- - **Memory usage:** 28GB (4-bit), 42GB (full precision)
402
- - **Throughput:** 200-300 comprehensive analyses per day
403
 
404
- **Tested on 2x A100 80GB** (multi-GPU):
405
- - **Tokens/second:** ~110-120 tok/s
406
- - **Cold start:** ~6 seconds
407
- - **Throughput:** 500+ analyses per day
408
 
409
- ### Security Analysis Quality
410
-
411
- **The differentiator:** Granite 20B provides the deepest, most nuanced security analysis:
412
- - Identifies **15-25% more vulnerabilities** than 7B models in complex code
413
- - Detects **multi-step attack chains** that smaller models miss
414
- - Provides **enterprise-grade operational guidance** with compliance mapping
415
- - **Reduces false positives** through sophisticated reasoning
416
-
417
- ---
418
 
419
- ## 💰 Cost Analysis
420
 
421
- ### Total Cost of Ownership (TCO) - 1 Year
422
 
423
- **Option 1: On-Premise (Dedicated Server)**
424
- - Hardware: 2x A100 40GB - $20,000 (one-time capital expense)
425
- - Server infrastructure: $5,000
426
- - Electricity: ~$2,400/year
427
- - **Total Year 1:** $27,400
428
- - **Total Year 2+:** $2,400/year
429
 
430
- **Option 2: Cloud GPU (AWS/GCP/Azure)**
431
- - Instance: A100 40GB (p4d.xlarge)
432
- - Cost: ~$3.50/hour
433
- - Usage: 160 hours/month (enterprise team)
434
- - **Total Year 1:** $6,720/year
435
 
436
- **Option 3: Enterprise GPT-4 (for comparison)**
437
- - Cost: $30/1M input tokens, $60/1M output tokens
438
- - Usage: 500M input + 500M output tokens/year
439
- - **Total Year 1:** $45,000/year
440
 
441
- **Option 4: Professional Security Audits (for comparison)**
442
- - Average enterprise security audit: $150,000-500,000
443
- - Frequency: Quarterly (4x/year)
444
- - **Total Year 1:** $600,000-2,000,000
445
 
446
- **ROI Winner:** On-premise deployment pays for itself with **1-2 prevented security audits** or **preventing a single breach** (average cost: $4.45M).
447
 
448
- ---
449
 
450
- ## 🎯 Use Cases & Examples
451
 
452
- ### 1. Enterprise Security Architecture Review
453
 
454
- Analyze complex microservices platforms:
455
 
456
- ```python
457
- prompt = """### User:
458
- Conduct a comprehensive security architecture review of this fintech payment platform. Analyze:
459
- 1. Service-to-service authentication security
460
- 2. Data flow security boundaries
461
- 3. Compliance with PCI-DSS requirements
462
- 4. Attack surface analysis
463
- 5. Defense-in-depth gaps
464
-
465
- [Include microservices code across auth-service, payment-service, notification-service]
466
-
467
- ### Assistant:
468
- """
469
- ```
470
 
471
- **Model Response:** Provides 20-30 page comprehensive analysis with specific vulnerability findings, attack chain scenarios, compliance gaps, and remediation priorities.
472
 
473
- ---
474
 
475
- ### 2. Regulatory Compliance Validation
476
 
477
- Validate code against regulatory requirements:
478
 
479
- ```python
480
- prompt = """### User:
481
- Analyze this healthcare EHR system for HIPAA compliance. Verify:
482
- 1. Patient data encryption (at rest and in transit)
483
- 2. Access control and audit logging
484
- 3. Data retention policies
485
- 4. Breach notification capabilities
486
- 5. Business Associate Agreement requirements
487
-
488
- [Include EHR codebase]
489
-
490
- ### Assistant:
491
- """
492
- ```
493
-
494
- **Model Response:** Detailed compliance mapping, gap analysis, and remediation roadmap.
495
-
496
- ---
497
-
498
- ### 3. Supply Chain Security Analysis
499
-
500
- Analyze third-party dependencies and integrations:
501
-
502
- ```python
503
- prompt = """### User:
504
- Perform a supply chain security analysis of this application:
505
- 1. Third-party library vulnerabilities
506
- 2. Dependency confusion risks
507
- 3. Code injection via dependencies
508
- 4. Malicious package detection
509
- 5. License compliance issues
510
-
511
- [Include package.json, requirements.txt, go.mod]
512
-
513
- ### Assistant:
514
- """
515
- ```
516
-
517
- **Model Response:** Comprehensive supply chain risk assessment with mitigation strategies.
518
-
519
- ---
520
-
521
- ### 4. Advanced Penetration Testing Guidance
522
-
523
- Develop sophisticated attack scenarios:
524
-
525
- ```python
526
- prompt = """### User:
527
- Design a comprehensive penetration testing strategy for this enterprise web application. Include:
528
- 1. Attack surface enumeration
529
- 2. Vulnerability prioritization
530
- 3. Multi-stage attack chains
531
- 4. Privilege escalation paths
532
- 5. Data exfiltration scenarios
533
- 6. Post-exploitation persistence
534
-
535
- ### Assistant:
536
- """
537
- ```
538
-
539
- **Model Response:** Professional pentesting methodology with specific attack vectors and validation procedures.
540
-
541
- ---
542
-
543
- ## ⚠️ Limitations & Transparency
544
-
545
- ### What This Model Does Well
546
- ✅ Maximum code understanding and security reasoning
547
- ✅ Complex attack chain analysis and enterprise architecture review
548
- ✅ Detailed operational guidance and compliance mapping
549
- ✅ Sophisticated multi-layered vulnerability detection
550
- ✅ Enterprise-scale codebase analysis
551
- ✅ IBM enterprise trust and reliability
552
-
553
- ### What This Model Doesn't Do
554
- ❌ **Not a security scanner** - Use tools like Semgrep, CodeQL, Snyk, or Veracode
555
- ❌ **Not a penetration testing tool** - Cannot perform active exploitation or network scanning
556
- ❌ **Not legal/compliance advice** - Consult security and legal professionals
557
- ❌ **Not a replacement for security experts** - Critical systems need professional security review and audits
558
- ❌ **Not real-time threat intelligence** - Training data frozen at Dec 2024
559
-
560
- ### Known Issues & Constraints
561
- - **Inference latency:** Larger model means slower responses (35-60 tok/s vs 100+ tok/s for smaller models)
562
- - **Hardware requirements:** Requires enterprise GPU infrastructure (40GB+ VRAM)
563
- - **Detailed analysis:** May generate very comprehensive responses (3000-4000 tokens)
564
- - **Cost consideration:** Higher deployment cost than smaller models
565
- - **Context window:** 8K tokens (vs 128K for Qwen models)
566
-
567
- ### Appropriate Use
568
- ✅ Enterprise security audits and professional assessments
569
- ✅ Regulatory compliance validation
570
- ✅ Critical infrastructure security review
571
- ✅ Financial services and healthcare applications
572
- ✅ Government and defense security analysis
573
-
574
- ### Inappropriate Use
575
- ❌ Sole validation for production deployments (use comprehensive testing)
576
- ❌ Replacement for professional security audits
577
- ❌ Active exploitation or penetration testing without authorization
578
- ❌ Consumer applications (too large, use smaller models)
579
-
580
- ---
581
-
582
- ## 🔬 Dataset Information
583
-
584
- This model was trained on **[SecureCode v2.0](https://huggingface.co/datasets/scthornton/securecode-v2)**, a production-grade security dataset with:
585
-
586
- - **1,209 total examples** (841 train / 175 validation / 193 test)
587
- - **100% incident grounding** - every example tied to real CVEs or security breaches
588
- - **11 vulnerability categories** - complete OWASP Top 10:2025 coverage
589
- - **11 programming languages** - from Python to Rust
590
- - **4-turn conversational structure** - mirrors real developer-AI workflows
591
- - **100% expert validation** - reviewed by independent security professionals
592
-
593
- See the [full dataset card](https://huggingface.co/datasets/scthornton/securecode-v2) and [research paper](https://perfecxion.ai/articles/securecode-v2-dataset-paper.html) for complete details.
594
-
595
- ---
596
-
597
- ## 🏢 About perfecXion.ai
598
-
599
- [perfecXion.ai](https://perfecxion.ai) is dedicated to advancing AI security through research, datasets, and production-grade security tooling.
600
-
601
- **Connect:**
602
- - Website: [perfecxion.ai](https://perfecxion.ai)
603
- - Research: [perfecxion.ai/research](https://perfecxion.ai/research)
604
- - Knowledge Hub: [perfecxion.ai/knowledge](https://perfecxion.ai/knowledge)
605
- - GitHub: [@scthornton](https://github.com/scthornton)
606
- - HuggingFace: [@scthornton](https://huggingface.co/scthornton)
607
- - Email: scott@perfecxion.ai
608
-
609
- ---
610
-
611
- ## 📄 License
612
-
613
- **Model License:** Apache 2.0 (permissive - use in commercial applications)
614
- **Dataset License:** CC BY-NC-SA 4.0 (non-commercial with attribution)
615
-
616
- ### What You CAN Do
617
- ✅ Use this model commercially in production applications
618
- ✅ Fine-tune further for your specific use case
619
- ✅ Deploy in enterprise environments
620
- ✅ Integrate into commercial products
621
- ✅ Distribute and modify the model weights
622
- ✅ Charge for services built on this model
623
- ✅ Use in government and regulated industries
624
-
625
- ### What You CANNOT Do with the Dataset
626
- ❌ Sell or redistribute the raw SecureCode v2.0 dataset commercially
627
- ❌ Use the dataset to train commercial models without releasing under the same license
628
- ❌ Remove attribution or claim ownership of the dataset
629
-
630
- For commercial dataset licensing or custom training, contact: scott@perfecxion.ai
631
-
632
- ---
633
-
634
- ## 📚 Citation
635
-
636
- If you use this model in your research or applications, please cite:
637
 
638
  ```bibtex
639
- @misc{thornton2025securecode-granite20b,
640
- title={IBM Granite 20B Code - SecureCode Edition},
641
- author={Thornton, Scott},
642
- year={2025},
643
- publisher={perfecXion.ai},
644
- url={https://huggingface.co/scthornton/granite-20b-code-securecode},
645
- note={Fine-tuned on SecureCode v2.0: https://huggingface.co/datasets/scthornton/securecode-v2}
646
- }
647
-
648
- @misc{thornton2025securecode-dataset,
649
- title={SecureCode v2.0: A Production-Grade Dataset for Training Security-Aware Code Generation Models},
650
  author={Thornton, Scott},
651
- year={2025},
652
- month={January},
653
  publisher={perfecXion.ai},
654
- url={https://perfecxion.ai/articles/securecode-v2-dataset-paper.html},
655
- note={Dataset: https://huggingface.co/datasets/scthornton/securecode-v2}
656
  }
657
  ```
658
 
659
- ---
660
-
661
- ## 🙏 Acknowledgments
662
-
663
- - **IBM Research** for the exceptional Granite code models and enterprise commitment
664
- - **OWASP Foundation** for maintaining the Top 10 vulnerability taxonomy
665
- - **MITRE Corporation** for the CVE database and vulnerability research
666
- - **Security research community** for responsible disclosure practices
667
- - **Hugging Face** for model hosting and inference infrastructure
668
- - **Enterprise security teams** who validated this model in production environments
669
-
670
- ---
671
-
672
- ## 🤝 Contributing
673
-
674
- Found a security issue or have suggestions for improvement?
675
-
676
- - 🐛 **Report issues:** [GitHub Issues](https://github.com/scthornton/securecode-models/issues)
677
- - 💬 **Discuss improvements:** [HuggingFace Discussions](https://huggingface.co/scthornton/granite-20b-code-securecode/discussions)
678
- - 📧 **Contact:** scott@perfecxion.ai
679
-
680
- ### Community Contributions Welcome
681
 
682
- Especially interested in:
683
- - **Enterprise deployment case studies**
684
- - **Benchmark evaluations** on industry security datasets
685
- - **Compliance validation** (PCI-DSS, HIPAA, SOC 2)
686
- - **Performance optimization** for specific enterprise hardware
687
- - **Integration examples** with enterprise security platforms
688
 
689
- ---
690
-
691
- ## 🔗 SecureCode Model Collection
692
-
693
- Explore other SecureCode fine-tuned models optimized for different use cases:
694
-
695
- ### Entry-Level Models (3-7B)
696
- - **[llama-3.2-3b-securecode](https://huggingface.co/scthornton/llama-3.2-3b-securecode)**
697
-   - **Best for:** Consumer hardware, IDE integration, education
698
-   - **Hardware:** 8GB RAM minimum
699
-   - **Unique strength:** Most accessible
700
-
701
- - **[deepseek-coder-6.7b-securecode](https://huggingface.co/scthornton/deepseek-coder-6.7b-securecode)**
702
-   - **Best for:** Security-optimized baseline
703
-   - **Hardware:** 16GB RAM
704
-   - **Unique strength:** Security-first architecture
705
-
706
- - **[qwen2.5-coder-7b-securecode](https://huggingface.co/scthornton/qwen2.5-coder-7b-securecode)**
707
-   - **Best for:** Best code understanding in 7B class
708
-   - **Hardware:** 16GB RAM
709
-   - **Unique strength:** 128K context, best-in-class
710
-
711
- - **[codegemma-7b-securecode](https://huggingface.co/scthornton/codegemma-7b-securecode)**
712
-   - **Best for:** Google ecosystem, instruction following
713
-   - **Hardware:** 16GB RAM
714
-   - **Unique strength:** Google brand, strong completion
715
-
716
- ### Mid-Range Models (13-15B)
717
- - **[codellama-13b-securecode](https://huggingface.co/scthornton/codellama-13b-securecode)**
718
-   - **Best for:** Enterprise trust, Meta brand
719
-   - **Hardware:** 24GB RAM
720
-   - **Unique strength:** Proven track record
721
-
722
- - **[qwen2.5-coder-14b-securecode](https://huggingface.co/scthornton/qwen2.5-coder-14b-securecode)**
723
-   - **Best for:** Advanced code analysis
724
-   - **Hardware:** 32GB RAM
725
-   - **Unique strength:** 128K context window
726
-
727
- - **[starcoder2-15b-securecode](https://huggingface.co/scthornton/starcoder2-15b-securecode)**
728
-   - **Best for:** Multi-language projects (600+ languages)
729
-   - **Hardware:** 32GB RAM
730
-   - **Unique strength:** Broadest language support
731
-
732
- ### Enterprise-Scale Models (20B+)
733
- - **[granite-20b-code-securecode](https://huggingface.co/scthornton/granite-20b-code-securecode)** ⭐ (YOU ARE HERE)
734
-   - **Best for:** Enterprise-scale, IBM trust, maximum capability
735
-   - **Hardware:** 48GB RAM
736
-   - **Unique strength:** Largest model, deepest analysis
737
 
738
- **View Complete Collection:** [SecureCode Models](https://huggingface.co/collections/scthornton/securecode)
739
-
740
- ---
741
-
742
- <div align="center">
743
-
744
- **Built with ❤️ for secure enterprise software**
745
-
746
- [perfecXion.ai](https://perfecxion.ai) | [Research](https://perfecxion.ai/research) | [Knowledge Hub](https://perfecxion.ai/knowledge) | [Contact](mailto:scott@perfecxion.ai)
747
-
748
- ---
749
-
750
- *Maximum security intelligence. Enterprise trust. IBM heritage.*
751
-
752
- </div>
 
2
  license: apache-2.0
3
  base_model: ibm-granite/granite-20b-code-instruct-8k
4
  tags:
5
+ - security
6
+ - cybersecurity
7
+ - secure-coding
8
+ - ai-security
9
+ - owasp
10
+ - code-generation
11
+ - qlora
12
+ - lora
13
+ - fine-tuned
14
+ - securecode
15
  datasets:
16
+ - scthornton/securecode
17
+ library_name: peft
 
 
18
  pipeline_tag: text-generation
19
+ language:
20
+ - code
21
+ - en
22
  ---
23
 
24
+ # Granite 20B Code SecureCode
25
 
26
  <div align="center">
27
 
28
+ ![Parameters](https://img.shields.io/badge/params-20B-blue.svg)
29
+ ![Dataset](https://img.shields.io/badge/dataset-2,185_examples-green.svg)
30
+ ![OWASP](https://img.shields.io/badge/OWASP-Top_10_2021_+_LLM_Top_10_2025-orange.svg)
31
+ ![Method](https://img.shields.io/badge/method-QLoRA_4--bit-purple.svg)
 
 
32
 
33
+ **Security-specialized code model fine-tuned on the [SecureCode](https://huggingface.co/datasets/scthornton/securecode) dataset**
34
 
35
+ [Dataset](https://huggingface.co/datasets/scthornton/securecode) | [Paper (arXiv:2512.18542)](https://arxiv.org/abs/2512.18542) | [Model Collection](https://huggingface.co/collections/scthornton/securecode) | [perfecXion.ai](https://perfecxion.ai)
36
 
37
  </div>
38
 
39
  ---
40
 
41
+ ## What This Model Does
42
 
43
+ This model generates **secure code** when developers ask how to build a feature. Rather than producing a vulnerable implementation, as 45% of AI-generated code reportedly does, it:
44
 
45
+ - Identifies the security risks in common coding patterns
46
+ - Provides vulnerable *and* secure implementations side by side
47
+ - Explains how attackers would exploit the vulnerability
48
+ - Includes defense-in-depth guidance: logging, monitoring, SIEM integration, infrastructure hardening
49
 
50
+ The model was fine-tuned on **2,185 security training examples** covering both traditional web security (OWASP Top 10 2021) and AI/ML security (OWASP LLM Top 10 2025).
51
 
52
+ ## Model Details
53
 
54
+ | | |
55
+ |---|---|
56
+ | **Base Model** | [Granite 20B Code Instruct 8K](https://huggingface.co/ibm-granite/granite-20b-code-instruct-8k) |
57
+ | **Parameters** | 20B |
58
+ | **Architecture** | GPT-BigCode |
59
+ | **Tier** | Tier 4: XL Model |
60
+ | **Method** | QLoRA (4-bit NormalFloat quantization) |
61
+ | **LoRA Rank** | 8 (alpha=16) |
62
+ | **Target Modules** | `attn.c_attn, attn.c_proj, mlp.c_fc, mlp.c_proj` (4 modules) |
63
+ | **Training Data** | [scthornton/securecode](https://huggingface.co/datasets/scthornton/securecode) (2,185 examples) |
64
+ | **Hardware** | NVIDIA A100 40GB |
65
 
66
+ This is the largest model in the SecureCode collection. It builds on IBM's Granite 20B code model with an 8K-token context window and targets the most demanding security reasoning tasks in the collection.
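
As a rough, back-of-the-envelope illustration of why 4-bit quantization matters at this scale, the sketch below estimates the memory taken by the weights alone. The helper name `weight_memory_gb` is hypothetical, the 20B figure is the rounded count from the table above, and the estimate ignores quantization constants, activations, the KV cache, and the LoRA adapter, so real usage will be higher.

```python
def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate memory for model weights only, in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


PARAMS = 20e9  # rounded "20B" from the table; the exact parameter count may differ

for label, bits in [("bf16", 16), ("NF4 (4-bit)", 4)]:
    print(f"{label}: ~{weight_memory_gb(PARAMS, bits):.0f} GB")
```

Under these assumptions the 4-bit weights come to roughly 10 GB, versus roughly 40 GB in bf16, which is in line with the choice of QLoRA on a single A100 40GB.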
67
 
68
+ ## Quick Start
69
 
70
  ```python
 
71
  from peft import PeftModel
72
  from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
73
+ import torch
74
 
75
+ # Load with 4-bit quantization (matches training)
76
  bnb_config = BitsAndBytesConfig(
77
  load_in_4bit=True,
 
78
  bnb_4bit_quant_type="nf4",
79
+ bnb_4bit_compute_dtype=torch.bfloat16,
80
  )
81
 
82
+ base_model = AutoModelForCausalLM.from_pretrained(
83
  "ibm-granite/granite-20b-code-instruct-8k",
84
  quantization_config=bnb_config,
85
  device_map="auto",
86
  )
87
+ tokenizer = AutoTokenizer.from_pretrained("scthornton/granite-20b-code-securecode")
88
+ model = PeftModel.from_pretrained(base_model, "scthornton/granite-20b-code-securecode")
89
 
90
+ # Ask a security-relevant coding question
91
+ messages = [
92
+ {"role": "user", "content": "How do I implement JWT authentication with refresh tokens in Python?"}
93
+ ]
94
 
95
+ inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
96
+ outputs = model.generate(inputs, max_new_tokens=2048, do_sample=True, temperature=0.7)
97
+ print(tokenizer.decode(outputs[0], skip_special_tokens=True))
98
  ```
99
 
100
+ ## Training Details
101
 
102
+ ### Dataset
103
 
104
+ Trained on the full **[SecureCode](https://huggingface.co/datasets/scthornton/securecode)** unified dataset:
105
 
106
+ - **2,185 total examples** (1,435 web security + 750 AI/ML security)
107
+ - **20 vulnerability categories** across OWASP Top 10 2021 and OWASP LLM Top 10 2025
108
+ - **12+ programming languages** and **49+ frameworks**
109
+ - **4-turn conversational structure**: feature request, vulnerable/secure implementations, advanced probing, operational guidance
110
+ - **100% incident grounding**: every example tied to real CVEs, vendor advisories, or published attack research
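
To make the 4-turn structure concrete, here is a minimal sketch of a shape check for one conversation. The `role`/`content` field names, the user/assistant mapping of the four turns, and the helper `has_four_turn_shape` are all assumptions for illustration; the dataset's actual schema may differ.

```python
from typing import Dict, List

# Hypothetical layout: the four turns alternate between user and assistant.
EXPECTED_ROLES = ["user", "assistant", "user", "assistant"]


def has_four_turn_shape(messages: List[Dict[str, str]]) -> bool:
    """Return True if messages are exactly four non-empty, alternating turns."""
    roles = [m.get("role") for m in messages]
    non_empty = all(m.get("content", "").strip() for m in messages)
    return roles == EXPECTED_ROLES and non_empty


# Placeholder content only; not taken from the dataset.
example = [
    {"role": "user", "content": "Feature request: how do I build X?"},
    {"role": "assistant", "content": "Vulnerable and secure implementations..."},
    {"role": "user", "content": "Advanced probing: what about edge case Y?"},
    {"role": "assistant", "content": "Operational guidance: logging, monitoring..."},
]

print(has_four_turn_shape(example))
```

A check like this can be useful as a sanity filter before fine-tuning, provided it is adapted to the real column names.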
111
 
112
+ ### Hyperparameters
113
 
114
+ | Parameter | Value |
115
+ |-----------|-------|
116
+ | LoRA rank | 8 |
117
+ | LoRA alpha | 16 |
118
+ | LoRA dropout | 0.05 |
119
+ | Target modules | 4 linear layers |
120
+ | Quantization | 4-bit NormalFloat (NF4) |
121
+ | Learning rate | 2e-4 |
122
+ | LR scheduler | Cosine with 100-step warmup |
123
+ | Epochs | 3 |
124
+ | Per-device batch size | 1 |
125
+ | Gradient accumulation | 16x |
126
+ | Effective batch size | 16 |
127
+ | Max sequence length | 2048 tokens |
128
+ | Optimizer | paged_adamw_8bit |
129
+ | Precision | bf16 |
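
The batch-size rows in the table can be checked with a little arithmetic. The sketch below derives the effective batch size and an approximate optimizer step count. It assumes a single GPU, that all 2,185 examples are used for training (no held-out split), and that a final partial batch counts as a step; the actual training run may differ on any of these.

```python
import math

# Values taken from the hyperparameter table.
NUM_EXAMPLES = 2185
PER_DEVICE_BATCH = 1
GRAD_ACCUM = 16
EPOCHS = 3
WARMUP_STEPS = 100
NUM_DEVICES = 1  # assumption: one GPU

effective_batch = PER_DEVICE_BATCH * GRAD_ACCUM * NUM_DEVICES
# Assumption: a trailing partial batch still triggers an optimizer step.
steps_per_epoch = math.ceil(NUM_EXAMPLES / effective_batch)
total_steps = steps_per_epoch * EPOCHS
warmup_fraction = WARMUP_STEPS / total_steps

print(f"effective batch size: {effective_batch}")
print(f"optimizer steps: {steps_per_epoch} per epoch, {total_steps} total")
print(f"warmup share of training: {warmup_fraction:.0%}")
```

Under these assumptions the 100-step warmup covers roughly a quarter of the run, which is worth keeping in mind when reusing the schedule on a larger or smaller dataset.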
130
 
131
+ **Notes:** LoRA rank (8) and max sequence length (2048) were reduced to fit within the A100's 40GB of memory. Gradient checkpointing was enabled with `use_reentrant=False`, and gradients were clipped to a max norm of 1.0.
132
 
133
+ ## Security Coverage
134
 
135
+ ### Web Security (1,435 examples)
136
 
137
+ OWASP Top 10 2021: Broken Access Control, Cryptographic Failures, Injection, Insecure Design, Security Misconfiguration, Vulnerable Components, Authentication Failures, Software Integrity Failures, Logging/Monitoring Failures, SSRF.
138
 
139
+ Languages: Python, JavaScript, Java, Go, PHP, C#, TypeScript, Ruby, Rust, Kotlin, YAML.
140
 
141
+ ### AI/ML Security (750 examples)
142
 
143
+ OWASP LLM Top 10 2025: Prompt Injection, Sensitive Information Disclosure, Supply Chain Vulnerabilities, Data/Model Poisoning, Improper Output Handling, Excessive Agency, System Prompt Leakage, Vector/Embedding Weaknesses, Misinformation, Unbounded Consumption.
144
 
145
+ Frameworks: LangChain, OpenAI, Anthropic, HuggingFace, LlamaIndex, ChromaDB, Pinecone, FastAPI, Flask, vLLM, CrewAI, and 30+ more.
146
 
147
+ ## SecureCode Model Collection
148
 
149
+ This model is part of the **SecureCode** collection of 8 security-specialized models:
150
 
151
+ | Model | Base | Size | Tier | HuggingFace |
152
+ |-------|------|------|------|-------------|
153
+ | Llama 3.2 SecureCode | meta-llama/Llama-3.2-3B-Instruct | 3B | Accessible | [`llama-3.2-3b-securecode`](https://huggingface.co/scthornton/llama-3.2-3b-securecode) |
154
+ | Qwen2.5 Coder SecureCode | Qwen/Qwen2.5-Coder-7B-Instruct | 7B | Mid-size | [`qwen2.5-coder-7b-securecode`](https://huggingface.co/scthornton/qwen2.5-coder-7b-securecode) |
155
+ | DeepSeek Coder SecureCode | deepseek-ai/deepseek-coder-6.7b-instruct | 6.7B | Mid-size | [`deepseek-coder-6.7b-securecode`](https://huggingface.co/scthornton/deepseek-coder-6.7b-securecode) |
156
+ | CodeGemma SecureCode | google/codegemma-7b-it | 7B | Mid-size | [`codegemma-7b-securecode`](https://huggingface.co/scthornton/codegemma-7b-securecode) |
157
+ | CodeLlama SecureCode | codellama/CodeLlama-13b-Instruct-hf | 13B | Large | [`codellama-13b-securecode`](https://huggingface.co/scthornton/codellama-13b-securecode) |
158
+ | Qwen2.5 Coder 14B SecureCode | Qwen/Qwen2.5-Coder-14B-Instruct | 14B | Large | [`qwen2.5-coder-14b-securecode`](https://huggingface.co/scthornton/qwen2.5-coder-14b-securecode) |
159
+ | StarCoder2 SecureCode | bigcode/starcoder2-15b-instruct-v0.1 | 15B | Large | [`starcoder2-15b-securecode`](https://huggingface.co/scthornton/starcoder2-15b-securecode) |
160
+ | Granite 20B Code SecureCode | ibm-granite/granite-20b-code-instruct-8k | 20B | XL | [`granite-20b-code-securecode`](https://huggingface.co/scthornton/granite-20b-code-securecode) |
161
 
162
+ Choose based on your deployment constraints: **3B** for edge/mobile, **7B** for general use, **13B-15B** for deeper reasoning, **20B** for maximum capability.
163
 
164
+ ## SecureCode Dataset Family
165
 
166
+ | Dataset | Examples | Focus | Link |
167
+ |---------|----------|-------|------|
168
+ | **SecureCode** | 2,185 | Unified (web + AI/ML) | [scthornton/securecode](https://huggingface.co/datasets/scthornton/securecode) |
169
+ | SecureCode Web | 1,435 | Web security (OWASP Top 10 2021) | [scthornton/securecode-web](https://huggingface.co/datasets/scthornton/securecode-web) |
170
+ | SecureCode AI/ML | 750 | AI/ML security (OWASP LLM Top 10 2025) | [scthornton/securecode-aiml](https://huggingface.co/datasets/scthornton/securecode-aiml) |
171
 
172
+ ## Intended Use
173
 
174
+ **Use this model for:**
175
+ - Training AI coding assistants to write secure code
176
+ - Security education and training
177
+ - Vulnerability research and secure code review
178
+ - Building security-aware development tools
179
 
180
+ **Do not use this model for:**
181
+ - Offensive exploitation or automated attack generation
182
+ - Circumventing security controls
183
+ - Any activity that violates the base model's license
184
 
185
+ ## Citation
186
 
187
  ```bibtex
188
+ @misc{thornton2026securecode,
189
+ title={SecureCode: A Production-Grade Multi-Turn Dataset for Training Security-Aware Code Generation Models},
190
  author={Thornton, Scott},
191
+ year={2026},
 
192
  publisher={perfecXion.ai},
193
+ url={https://huggingface.co/datasets/scthornton/securecode},
194
+ note={arXiv:2512.18542}
195
  }
196
  ```
197
 
198
+ ## Links
199
 
200
+ - **Dataset**: [scthornton/securecode](https://huggingface.co/datasets/scthornton/securecode)
201
+ - **Research Paper**: [arXiv:2512.18542](https://arxiv.org/abs/2512.18542)
202
+ - **Model Collection**: [huggingface.co/collections/scthornton/securecode](https://huggingface.co/collections/scthornton/securecode)
203
+ - **Author**: [perfecXion.ai](https://perfecxion.ai)
204
 
205
+ ## License
206
 
207
+ This model is released under the **Apache 2.0** license, inherited from the base model. The training dataset ([SecureCode](https://huggingface.co/datasets/scthornton/securecode)) is licensed separately under **CC BY-NC-SA 4.0**.