---
language:
- en
license: llama2
base_model: codellama/CodeLlama-7b-instruct-hf
tags:
- code
- security
- peft
- lora
- qlora
- vulnerability-detection
- api-security
- causal-lm
datasets:
- custom
pipeline_tag: text-generation
---
# API Security QLoRA — Code Llama 7B
A QLoRA adapter fine-tuned on top of **CodeLlama-7b-instruct-hf**, trained to detect security vulnerabilities in API endpoint source code. Given a raw code snippet, the model produces a structured analysis that identifies the vulnerability type, severity, and CWE, and proposes a remediated version of the code.
---
## Model Details
| Property | Value |
|---|---|
| **Base Model** | `codellama/CodeLlama-7b-instruct-hf` |
| **Fine-tuning Method** | QLoRA (4-bit NF4 quantization) |
| **LoRA Rank (r)** | 16 |
| **LoRA Alpha** | 32 |
| **LoRA Dropout** | 0.05 |
| **Target Modules** | `q_proj`, `k_proj`, `v_proj`, `o_proj` |
| **Task** | Causal LM / Code Security Analysis |
| **Training Steps** | 531 |
| **Training Hardware** | Google Colab T4 (16 GB VRAM) |
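
As a rough sanity check on the adapter size, the rank and target modules in the table imply the trainable-parameter count computed below. This is a sketch, not a measurement: it assumes the usual 7B Llama layout (hidden size 4096, 32 decoder layers, square attention projections), which this card does not state, so verify those values against the base model's `config.json`.

```python
# Back-of-the-envelope LoRA size estimate for the settings in the table above.
# ASSUMPTION: hidden size, layer count, and square projections come from the
# usual 7B Llama architecture, not from this model card.
HIDDEN_SIZE = 4096
NUM_LAYERS = 32

LORA_RANK = 16
LORA_ALPHA = 32
TARGET_MODULES = ["q_proj", "k_proj", "v_proj", "o_proj"]


def lora_params_per_module(d_in: int, d_out: int, rank: int) -> int:
    """LoRA adds A (rank x d_in) and B (d_out x rank) next to a frozen weight."""
    return rank * (d_in + d_out)


per_module = lora_params_per_module(HIDDEN_SIZE, HIDDEN_SIZE, LORA_RANK)
trainable_params = per_module * len(TARGET_MODULES) * NUM_LAYERS
scaling = LORA_ALPHA / LORA_RANK  # factor applied to the low-rank update

print(f"Parameters per adapted projection: {per_module:,}")
print(f"Total trainable LoRA parameters:   {trainable_params:,}")
print(f"LoRA scaling (alpha / r):          {scaling}")
```

Under these assumptions the adapter has about 16.8 million trainable parameters, a small fraction of the roughly 7 billion frozen base weights.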
---
## Training Data
Fine-tuned on a custom dataset of **10,000 API-specific vulnerability samples** (synthetic and augmented) covering 19 vulnerability types mapped to the OWASP API Security Top 10.
### Language Distribution
| Language | Share | Frameworks |
|---|---|---|
| Python | 46% | Flask, FastAPI, Django |
| JavaScript | 25% | Express.js, NestJS |
| Java | 15% | Spring Boot |
| PHP / Go / Ruby / C# | 14% | Laravel, Gin, Rails, ASP.NET |
### Vulnerability Distribution
| Vulnerability | Samples | CWE |
|---|---|---|
| SQL Injection | 2,425 | CWE-89 |
| Mass Assignment | 1,307 | CWE-915 |
| Path Traversal | 943 | CWE-22 |
| IDOR | 860 | CWE-639 |
| Broken Authorization | 792 | CWE-285 |
| Command Injection | 600 | CWE-78 |
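
The table lists six of the 19 vulnerability types. A short calculation shows how much of the 10,000-sample dataset they cover. The counts are copied from the table; the variable names are illustrative.

```python
# Sample counts copied from the Vulnerability Distribution table.
TOTAL_SAMPLES = 10_000
listed_counts = {
    "SQL Injection": 2425,
    "Mass Assignment": 1307,
    "Path Traversal": 943,
    "IDOR": 860,
    "Broken Authorization": 792,
    "Command Injection": 600,
}

listed_total = sum(listed_counts.values())
remaining = TOTAL_SAMPLES - listed_total  # samples in categories not listed above

for name, count in listed_counts.items():
    print(f"{name:<22} {count:>5}  {count / TOTAL_SAMPLES:6.2%}")

print(f"{'Listed types':<22} {listed_total:>5}  {listed_total / TOTAL_SAMPLES:6.2%}")
print(f"{'Not listed':<22} {remaining:>5}  {remaining / TOTAL_SAMPLES:6.2%}")
```

The six listed types account for 6,927 samples (about 69%); the remaining 3,073 fall under categories that this card does not break out.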
### Severity Breakdown
- **Critical (43%)**: RCE, SQLi, unauthorized admin access
- **High (41%)**: Data leaks, IDOR, authorization bypass
- **Medium / Clean (16%)**: XSS, input validation warnings, baseline clean samples
---
## Usage
```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_model_id = "codellama/CodeLlama-7b-instruct-hf"
adapter_id = "harsharajkumar273/api-security-qlora"

# Load the tokenizer from the adapter repo and the base model in half precision.
tokenizer = AutoTokenizer.from_pretrained(adapter_id, use_fast=False)
base = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    torch_dtype=torch.float16,
    device_map="auto",
)

# Attach the LoRA adapter weights to the frozen base model.
model = PeftModel.from_pretrained(base, adapter_id)
model.eval()

# Example endpoint that builds a SQL query by string interpolation.
code_snippet = """
@app.route('/user/<int:user_id>')
def get_user(user_id):
    query = f"SELECT * FROM users WHERE id = {user_id}"
    result = db.execute(query)
    return jsonify(result)
"""

prompt = f"[INST] Analyze this API endpoint for security vulnerabilities:\n\n{code_snippet} [/INST]"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    # do_sample=True is needed for temperature to take effect; without it,
    # generate() decodes greedily and ignores the temperature setting.
    outputs = model.generate(**inputs, max_new_tokens=512, do_sample=True, temperature=0.1)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
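
Because `generate` returns the prompt tokens followed by the completion, the decoded string above repeats the instruction before the analysis. A minimal sketch for keeping only the model's answer, assuming the decoded text still contains the literal `[/INST]` marker used in the prompt (the helper name `extract_response` is illustrative and not part of any library):

```python
def extract_response(decoded: str, marker: str = "[/INST]") -> str:
    """Return the text after the last instruction marker, or the input unchanged."""
    _, found, tail = decoded.rpartition(marker)
    return tail.strip() if found else decoded.strip()


# Stand-in for tokenizer.decode(outputs[0], skip_special_tokens=True);
# the analysis text here is made up purely to exercise the helper.
decoded_example = (
    "[INST] Analyze this API endpoint for security vulnerabilities:\n\n"
    "<code> [/INST] Example analysis text."
)
print(extract_response(decoded_example))
```

An alternative that avoids string matching is to decode only the newly generated tokens, for example `outputs[0][inputs["input_ids"].shape[1]:]`.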
---
## Integration with API Security Scanner
This adapter is the default model in the [API Security Scanner](https://github.com/harsharajkumar/api-security) project. It is loaded automatically — no manual path configuration needed:
```bash
git clone https://github.com/harsharajkumar/api-security
cd api-security
pip install -r requirements.txt
streamlit run app.py
```
The scanner will download this adapter from the Hub on first run and cache it locally.
---
## Intended Use
- Automated API security auditing in CI/CD pipelines
- Developer tooling for identifying vulnerable endpoint patterns
- Security research and OWASP API Security Top 10 education
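
For the CI/CD use case, one possible pattern is to fail the build when any finding meets a severity threshold. The sketch below assumes the model output has already been parsed into dictionaries with a `severity` field. That structure, the `should_fail_build` helper, and the sample findings are hypothetical; neither this card nor the scanner project defines them.

```python
# Severity labels follow the Severity Breakdown section; the ordering is an assumption.
SEVERITY_RANK = {"clean": 0, "medium": 1, "high": 2, "critical": 3}


def should_fail_build(findings: list[dict], threshold: str = "high") -> bool:
    """Return True if any finding is at or above the threshold severity."""
    limit = SEVERITY_RANK[threshold.lower()]
    # Unknown or missing severities are ranked 0 here; a stricter gate
    # could treat them as failures instead.
    return any(
        SEVERITY_RANK.get(str(f.get("severity", "")).lower(), 0) >= limit
        for f in findings
    )


# Hypothetical parsed findings for two endpoints.
findings = [
    {"endpoint": "/user/<int:user_id>", "type": "SQL Injection", "severity": "Critical"},
    {"endpoint": "/health", "type": "None", "severity": "Clean"},
]

exit_code = 1 if should_fail_build(findings) else 0
print(f"exit code: {exit_code}")
```

In a pipeline, the script would typically end with `sys.exit(exit_code)` so that the CI job fails on a non-zero code.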
## Out of Scope
- General-purpose code generation
- Non-API code (UI components, data processing scripts, etc.)
- Production security decisions without human review
---
## Credits
Developed as part of **CS6380 — API Security Project**
**Authors:** Siddhanth Nilesh Jagtap · Tanuj Kenchannavar · Harsha Raj Kumar