---
language: en
tags:
- vulnerability-detection
- code-analysis
- autoencoder
- anomaly-detection
library_name: pytorch
metrics:
- mse
---
# CATastrophe - Code Vulnerability Detector
CATastrophe is an autoencoder-based vulnerability detector for Python code. Source text is
converted to TF-IDF features, the autoencoder reconstructs those features, and the
reconstruction error serves as an anomaly score that may indicate vulnerabilities.
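To illustrate the vectorization step, here is a minimal sketch that assumes the vectorizer is a scikit-learn `TfidfVectorizer` capped at 2000 features. This card does not name the library or its settings, so treat both as assumptions. The corpus is made up for illustration and is not the model's training data.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Tiny made-up corpus, for illustration only; not the model's training data.
corpus = [
    "import os\nos.system(user_input)",
    "import subprocess\nsubprocess.run(['ls', '-l'], check=True)",
    "query = 'SELECT * FROM users WHERE id = ' + user_id",
]

# Assumption: the feature count is capped at 2000 to match the model's input size.
vectorizer = TfidfVectorizer(max_features=2000)
features = vectorizer.fit_transform(corpus).toarray()

# One row per snippet; at most 2000 columns (fewer here because the corpus is tiny).
print(features.shape)
```

A vectorizer fitted on such a small corpus yields fewer than 2000 columns, so its output would not match a model built with `input_dim=2000`. The released `vectorizer.pkl` is the one to use for inference.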
## Model Details
- **Architecture**: Autoencoder (Input → 512 → 128 → 512 → Input)
- **Input Features**: 2000 (TF-IDF)
- **Training Loss**: 0.0005
- **Framework**: PyTorch
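The layer sizes above suggest a symmetric, fully connected network. Below is a minimal sketch of what the `Autoencoder` class might look like. The ReLU activations, the lack of an output activation, and the attribute names `encoder` and `decoder` are assumptions, not details confirmed by this card.

```python
import torch
import torch.nn as nn


class Autoencoder(nn.Module):
    """Symmetric autoencoder: input_dim -> 512 -> 128 -> 512 -> input_dim."""

    def __init__(self, input_dim: int = 2000):
        super().__init__()
        # Encoder compresses the TF-IDF vector to a 128-dimensional bottleneck.
        self.encoder = nn.Sequential(
            nn.Linear(input_dim, 512),
            nn.ReLU(),  # assumed activation
            nn.Linear(512, 128),
            nn.ReLU(),  # assumed activation
        )
        # Decoder mirrors the encoder back to the input dimension.
        self.decoder = nn.Sequential(
            nn.Linear(128, 512),
            nn.ReLU(),  # assumed activation
            nn.Linear(512, input_dim),  # assumed: no output activation
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Reconstruct the input from its compressed representation.
        return self.decoder(self.encoder(x))
```

If the layer names or structure differ from those stored in the released checkpoint, `load_state_dict` will raise an error, so the repository's own class definition takes precedence over this sketch.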
## Usage
```python
import pickle

import torch

from model import Autoencoder  # expects model.py on the import path

# Load the trained weights (map_location lets this run on CPU-only machines)
model = Autoencoder(input_dim=2000)
model.load_state_dict(torch.load('catastrophe_model.pth', map_location='cpu'))
model.eval()

# Load the fitted TF-IDF vectorizer (only unpickle files from a trusted source)
with open('vectorizer.pkl', 'rb') as f:
    vectorizer = pickle.load(f)

# Vectorize the code to analyze
code_text = "your code here"
features = vectorizer.transform([code_text]).toarray()
features_tensor = torch.tensor(features, dtype=torch.float32)

# The per-sample mean squared reconstruction error is the anomaly score
with torch.no_grad():
    reconstructed = model(features_tensor)
    anomaly_score = torch.mean((features_tensor - reconstructed) ** 2, dim=1)
```
## Training Configuration
- Batch Size: 256
- Epochs: 50
- Learning Rate: 0.001
- Optimizer: Adam
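The settings above can be combined into a training loop along the following lines. This is a sketch, not the original training script. The function name `train_autoencoder`, the use of `nn.MSELoss` (chosen because the card lists `mse` as its metric), and the shuffling of batches are assumptions.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset


def train_autoencoder(model: nn.Module, features: torch.Tensor,
                      epochs: int = 50, batch_size: int = 256,
                      lr: float = 1e-3) -> float:
    """Train `model` to reconstruct `features`; return the last epoch's mean loss."""
    loader = DataLoader(TensorDataset(features), batch_size=batch_size, shuffle=True)
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    criterion = nn.MSELoss()  # assumed loss function
    model.train()
    epoch_loss = float("nan")
    for _ in range(epochs):
        total, count = 0.0, 0
        for (batch,) in loader:
            optimizer.zero_grad()
            loss = criterion(model(batch), batch)  # the target is the input itself
            loss.backward()
            optimizer.step()
            # Weight each batch's loss by its size for a correct epoch average.
            total += loss.item() * batch.size(0)
            count += batch.size(0)
        epoch_loss = total / count
    return epoch_loss
```

Here `features` would be the dense TF-IDF matrix converted to a `float32` tensor, in the same way as `features_tensor` in the Usage section.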
## Limitations
This model is trained on vulnerable commits only and uses reconstruction error as its
anomaly score. A high score indicates a potential vulnerability, not a confirmed one, so
flagged code should be reviewed manually.
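This card does not specify what counts as a "high" score. One possible approach, shown here as a sketch, is to derive a cutoff from the scores of a reference set and flag anything above it. The helper names `calibrate_threshold` and `flag_for_review` are hypothetical, and the 0.95 quantile is an arbitrary illustrative default.

```python
import torch


def calibrate_threshold(reference_scores: torch.Tensor, quantile: float = 0.95) -> float:
    """Return the score below which `quantile` of the reference scores fall."""
    return torch.quantile(reference_scores, quantile).item()


def flag_for_review(scores: torch.Tensor, threshold: float) -> torch.Tensor:
    """Boolean mask: True where a sample's score exceeds the threshold."""
    return scores > threshold
```

For example, `flag_for_review(anomaly_score, calibrate_threshold(reference_scores))` marks the samples to inspect by hand, where `reference_scores` holds anomaly scores computed on code you have already reviewed.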