|
|
--- |
|
|
library_name: transformers |
|
|
license: cc-by-nc-4.0 |
|
|
tags: |
|
|
- code-review |
|
|
- security-analysis |
|
|
- static-analysis |
|
|
- python |
|
|
- code-quality |
|
|
- peft |
|
|
- qlora |
|
|
- fine-tuned |
|
|
- sql-injection |
|
|
- vulnerability-detection |
|
|
- python-security |
|
|
- code-optimization |
|
|
pipeline_tag: text-generation |
|
|
datasets: |
|
|
- alenphilip/Code-Review-Assistant |
|
|
- alenphilip/Code-Review-Assistant-Eval |
|
|
language: |
|
|
- en |
|
|
metrics: |
|
|
- rouge |
|
|
- bleu |
|
|
base_model: |
|
|
- Qwen/Qwen2.5-7B-Instruct |
|
|
--- |
|
|
|
|
|
# Code Review Assistant Model |
|
|
|
|
|
<!-- Provide a quick summary of what the model is/does. --> |
|
|
|
|
|
A specialized Python code review assistant fine-tuned for security analysis, performance optimization, and Pythonic code quality. The model identifies security vulnerabilities and performance issues in Python codebases and provides corrected code examples with detailed explanations.
|
|
|
|
|
## Model Details |
|
|
|
|
|
### Model Description |
|
|
|
|
|
This model is a fine-tuned version of Qwen2.5-7B-Instruct, specifically optimized for Python code analysis. It excels at detecting security vulnerabilities, performance bottlenecks, and code quality issues while providing actionable fixes with corrected code examples. |
|
|
|
|
|
- **Developed by:** Alen Philip |
|
|
- **Model type:** Causal Language Model |
|
|
- **Language(s) (NLP):** English, with specialized Python code understanding |
|
|
- **License:** cc-by-nc-4.0 |
|
|
- **Finetuned from model:** Qwen/Qwen2.5-7B-Instruct |
|
|
- **Supported Languages:** Python only |
|
|
|
|
|
### Model Sources |
|
|
|
|
|
- **Repository:** [Hugging Face Hub](https://huggingface.co/alenphilip/Code_Review_Assistant_Model) |
|
|
- **Base Model:** [Qwen2.5-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-7B-Instruct) |
|
|
- **Training Dataset:** [Code Review Dataset](https://huggingface.co/datasets/alenphilip/Code-Review-Assistant) |
|
|
- **Evaluation Dataset:** [Code Review (Eval) Dataset](https://huggingface.co/datasets/alenphilip/Code-Review-Assistant-Eval)
|
|
|
|
|
## Uses |
|
|
|
|
|
### Direct Use |
|
|
|
|
|
This model is specifically designed for: |
|
|
- Automated Python code review in development pipelines |
|
|
- Security vulnerability detection in Python code |
|
|
- Python code quality assessment and improvement suggestions |
|
|
- Performance optimization recommendations for Python applications |
|
|
- Educational purposes for learning Python best practices |
|
|
- Integration into Python IDEs and code editors |
|
|
|
|
|
### Downstream Use |
|
|
|
|
|
The model can be integrated into: |
|
|
- CI/CD pipelines for automated Python code review |
|
|
- Python code quality monitoring tools |
|
|
- Security scanning platforms for Python applications |
|
|
- Educational platforms for Python programming |
|
|
- Code review assistance tools for Python developers |
|
|
|
|
|
### Out-of-Scope Use |
|
|
|
|
|
- Analysis of non-Python programming languages |
|
|
- Non-code related text generation |
|
|
- Legal or compliance advice |
|
|
- Production deployment without human validation |
|
|
- Real-time security monitoring without additional safeguards |
|
|
|
|
|
## Bias, Risks, and Limitations |
|
|
|
|
|
- **Language Specificity:** Trained only on Python code; it will not perform well on other programming languages
|
|
- **False Positives/Negatives:** May occasionally miss edge cases or flag non-issues |
|
|
- **Training Data Bias:** Reflects patterns and conventions present in the training dataset |
|
|
- **Security-Critical Systems:** Should not be the sole security measure for critical systems
|
|
|
|
|
### Recommendations |
|
|
|
|
|
Users should: |
|
|
- Always validate model suggestions with human review |
|
|
- Use it as an assistive tool rather than an autonomous system
|
|
- Test suggested fixes thoroughly before deployment |
|
|
- Combine with other security scanning tools for critical applications |
|
|
|
|
|
## How to Get Started with the Model |
|
|
|
|
|
```python |
|
|
from transformers import AutoTokenizer, AutoModelForCausalLM |
|
|
import torch |
|
|
|
|
|
model_name = "alenphilip/Code_Review_Assistant_Model" |
|
|
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True) |
|
|
model = AutoModelForCausalLM.from_pretrained( |
|
|
model_name, |
|
|
torch_dtype=torch.bfloat16, |
|
|
device_map="auto", |
|
|
trust_remote_code=True |
|
|
) |
|
|
|
|
|
# Example usage for code review |
|
|
def review_python_code(code_snippet): |
|
|
messages = [ |
|
|
{"role": "system", "content": "You are a helpful AI assistant specialized in code review and security analysis."}, |
|
|
{"role": "user", "content": f"Review this Python code and provide improvements with fixed code:\n\n```python\n{code_snippet}\n```"} |
|
|
] |
|
|
|
|
|
text = tokenizer.apply_chat_template( |
|
|
messages, |
|
|
tokenize=False, |
|
|
        add_generation_prompt=True  # required so the model starts an assistant turn
|
|
) |
|
|
|
|
|
    inputs = tokenizer(text, return_tensors="pt").to(model.device)

    outputs = model.generate(**inputs, max_new_tokens=512, do_sample=True, temperature=0.1)

    # Decode only the newly generated tokens, skipping the echoed prompt
    response = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
|
|
|
|
|
return response |
|
|
|
|
|
# Test with vulnerable code |
|
|
vulnerable_code = ''' |
|
|
def get_user_by_email(email): |
|
|
query = "SELECT * FROM users WHERE email = '" + email + "'" |
|
|
cursor.execute(query) |
|
|
return cursor.fetchone() |
|
|
''' |
|
|
|
|
|
result = review_python_code(vulnerable_code) |
|
|
print(result) |
|
|
``` |
|
|
#### Or, using the pipeline API
|
|
```python |
|
|
# Use a pipeline as a high-level helper |
|
|
from transformers import pipeline |
|
|
pipe = pipeline("text-generation", model="alenphilip/Code_Review_Assistant_Model") |
|
|
prompt = "Review this Python code and provide improvements with fixed code:\n\n```python\nclass LockManager:\n def __init__(self, lock1, lock2):\n self.lock1 = lock1\n self.lock2 = lock2\n\n def acquire_both(self):\n self.lock1.acquire()\n self.lock2.acquire() # This might fail\n\n def release_both(self):\n self.lock1.release()\n self.lock2.release()\n```" |
|
|
messages = [ |
|
|
{"role": "system", "content": "You are a helpful AI assistant specialized in code review and security analysis."}, |
|
|
{"role": "user", "content": prompt}, |
|
|
] |
|
|
result = pipe(messages, max_new_tokens=512)
|
|
conversation = result[0]['generated_text'] |
|
|
|
|
|
for message in conversation: |
|
|
print(f"\n{message['role'].upper()}:") |
|
|
print("-" * 50) |
|
|
print(message['content']) |
|
|
print() |
|
|
|
|
|
print("=" * 70) |
|
|
``` |
|
|
# Training Details |
|
|
## Training Data |
|
|
The model was trained on a comprehensive dataset of Python code review examples covering: |
|
|
|
|
|
### 🔐 SECURITY |
|
|
- SQL Injection Prevention |
|
|
- XSS Prevention in Web Frameworks |
|
|
- Authentication Bypass Vulnerabilities |
|
|
- Insecure Deserialization |
|
|
- Command Injection Prevention |
|
|
- JWT Token Security |
|
|
- Hardcoded Secrets Detection |
|
|
- Input Validation & Sanitization |
|
|
- Secure File Upload Handling |
|
|
- Broken Access Control |
|
|
- Password Hashing & Storage |
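
To illustrate the SQL-injection category above, here is a minimal before/after sketch of the kind of fix the model is trained to suggest (hypothetical `get_user_by_email` helper; stdlib `sqlite3` used as a stand-in database driver, not from the training set):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (email TEXT, name TEXT)")
conn.execute("INSERT INTO users VALUES ('a@b.com', 'Alice')")

# Vulnerable: string concatenation lets crafted input alter the query
def get_user_by_email_unsafe(email):
    query = "SELECT * FROM users WHERE email = '" + email + "'"
    return conn.execute(query).fetchone()

# Fixed: a parameterized query keeps user input as data, never as SQL
def get_user_by_email(email):
    return conn.execute("SELECT * FROM users WHERE email = ?", (email,)).fetchone()

print(get_user_by_email("a@b.com"))      # ('a@b.com', 'Alice')
print(get_user_by_email("' OR '1'='1"))  # None: the payload no longer matches every row
```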
|
|
|
|
|
### ⚡ PERFORMANCE |
|
|
- Algorithm Complexity Optimization |
|
|
- Database Query Optimization |
|
|
- Memory Leak Detection |
|
|
- I/O Bound Operations Optimization |
|
|
- CPU Bound Operations Optimization |
|
|
- Async/Await Performance |
|
|
- Caching Strategies Implementation |
|
|
- Loop Optimization Techniques |
|
|
- Data Structure Selection |
|
|
- Concurrent Execution Patterns |
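
As a small example of the caching-strategies category, this is the kind of rewrite the model might suggest for a hot recursive function (illustrative `fib`, not taken from the training set):

```python
from functools import lru_cache

# Naive recursion is exponential-time; memoization computes each n only once
@lru_cache(maxsize=None)
def fib(n):
    return n if n < 2 else fib(n - 1) + fib(n - 2)

print(fib(50))  # 12586269025, instantly instead of minutes
```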
|
|
|
|
|
### 🐍 PYTHONIC CODE |
|
|
|
|
|
- Type Hinting Implementation |
|
|
- Mutable Default Arguments |
|
|
- Context Manager Usage |
|
|
- Decorator Best Practices |
|
|
- List/Dict/Set Comprehensions |
|
|
- Class Design Principles |
|
|
- Dunder Method Implementation |
|
|
- Property Decorator Usage |
|
|
- Generator Expressions |
|
|
- Class vs Static Methods |
|
|
- Import Organization |
|
|
- Exception Handling & Hierarchy |
|
|
- EAFP vs LBYL Patterns |
|
|
- Basic syntax validation |
|
|
- Variable scope validation |
|
|
- Type Operation Compatibility |
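
One concrete instance of the mutable-default-arguments item above, as a before/after sketch (hypothetical `append_item` helper):

```python
# Bug: the default list is created once at definition time and shared across calls
def append_item_buggy(item, items=[]):
    items.append(item)
    return items

# Pythonic fix: use None as a sentinel and create a fresh list per call
def append_item(item, items=None):
    if items is None:
        items = []
    items.append(item)
    return items

print(append_item_buggy(1), append_item_buggy(2))  # [1, 2] [1, 2] (shared state!)
print(append_item(1), append_item(2))              # [1] [2]
```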
|
|
|
|
|
### 🔧 PRODUCTION RELIABILITY |
|
|
|
|
|
- Error Handling and Logging |
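
A minimal sketch of the error-handling-and-logging pattern this category targets (hypothetical `load_config` helper; `logging` and `json` from the stdlib):

```python
import json
import logging

logger = logging.getLogger(__name__)

# Catch the narrowest exceptions, log with context, and either
# return a safe default or re-raise; never swallow errors silently
def load_config(path):
    try:
        with open(path) as f:
            return json.load(f)
    except FileNotFoundError:
        logger.warning("Config %s not found, using defaults", path)
        return {}
    except json.JSONDecodeError:
        logger.error("Config %s is not valid JSON", path)
        raise
```

Bare `except:` blocks that hide failures are the anti-pattern the model is trained to flag here.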
|
|
|
|
|
## Training Procedure |
|
|
[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="150" height="24"/>](https://wandb.ai/alenphilip2071-google/huggingface/runs/d27nrifd) |
|
|
### Training Hyperparameters |
|
|
- **Training regime:** bf16 mixed precision with SFT & QLoRA |
|
|
- **Base Model:** Qwen2.5-7B-Instruct |
|
|
- **LoRA Rank:** 32 |
|
|
- **LoRA Alpha:** 64 |
|
|
- **LoRA Dropout:** 0.1 |
|
|
- **Learning Rate:** 2e-4 |
|
|
- **Batch Size:** 16 (with gradient accumulation 4) |
|
|
- **Epochs:** 2 |
|
|
- **Max Sequence Length:** 2048 tokens |
|
|
- **Optimizer:** Paged AdamW 8-bit |
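
The hyperparameters above can be expressed as a PEFT/bitsandbytes configuration sketch. This is illustrative only; values such as `target_modules` are assumptions and are not taken from the actual training script:

```python
import torch
from transformers import BitsAndBytesConfig
from peft import LoraConfig

# 4-bit NF4 quantization of the frozen base model (the "Q" in QLoRA)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# LoRA adapter matching the reported rank/alpha/dropout
lora_config = LoraConfig(
    r=32,
    lora_alpha=64,
    lora_dropout=0.1,
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed, not confirmed
)
```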
|
|
|
|
|
### Speeds, Sizes, Times |
|
|
- **Base Model Size:** 7B parameters |
|
|
- **Adapter Size:** ~45MB |
|
|
- **Training Time:** ~68 minutes for 400 steps |
|
|
- **Training Examples:** 13,670 training, 1,726 evaluation |
|
|
|
|
|
## Evaluation |
|
|
### Metrics |
|
|
- **ROUGE-L:** 0.754 |
|
|
- **BLEU:** 61.99 |
|
|
- **Validation Loss:** 0.595 |
|
|
|
|
|
## Results |
|
|
The model achieved strong performance on code review tasks, particularly excelling at: |
|
|
- Security vulnerability detection (SQL injection, XSS, etc.) |
|
|
- Pythonic code improvements |
|
|
- Performance optimization suggestions |
|
|
- Providing corrected code examples |
|
|
|
|
|
## Summary |
|
|
The model demonstrates excellent capability in identifying and fixing common Python code issues, with particular strength in security vulnerability detection and code quality improvements. |
|
|
|
|
|
## Environmental Impact |
|
|
Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact/#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). |
|
|
- Hardware Type: NVIDIA H100 (80 GB)
|
|
- Hours used: ~1.5 hours |
|
|
- Training Approach: QLoRA for efficient fine-tuning |
|
|
|
|
|
## Technical Specifications |
|
|
### Model Architecture and Objective |
|
|
- **Architecture:** Transformer-based causal language model |
|
|
- **Objective:** Supervised fine-tuning for code review tasks |
|
|
- **Context Window:** 32K tokens (base model) |
|
|
|
|
|
### Compute Infrastructure |
|
|
**Hardware** |
|
|
- Training performed on an NVIDIA H100 GPU (80 GB VRAM)
|
|
|
|
|
**Software** |
|
|
- Transformers, PEFT, TRL, BitsAndBytes |
|
|
- QLoRA for parameter-efficient fine-tuning |
|
|
|
|
|
## Citation |
|
|
```bibtex |
|
|
@misc{alen_philip_george_2025, |
|
|
author = {Alen Philip George}, |
|
|
title = {Code_Review_Assistant_Model (Revision 233d438)}, |
|
|
year = 2025, |
|
|
url = {https://huggingface.co/alenphilip/Code_Review_Assistant_Model}, |
|
|
doi = {10.57967/hf/6836}, |
|
|
publisher = {Hugging Face} |
|
|
} |
|
|
``` |
|
|
## Model Card Authors |
|
|
Alen Philip George |
|
|
|
|
|
## Model Card Contact |
|
|
Hugging Face: [alenphilip](https://huggingface.co/alenphilip) |
|
|
LinkedIn: [alenphilipgeorge](https://linkedin.com/in/alen-philip-george-130226254) |
|
|
Email: [alenphilipgeorge@gmail.com](mailto:alenphilipgeorge@gmail.com) |
|
|
|
|
|
|
|
|
For questions about this model, please use the Hugging Face model repository discussions or contact via the above channels. |