IntelliSA-220m

IntelliSA-220m is a fine-tuned Salesforce/codet5p-220m model for detecting security vulnerabilities in Infrastructure as Code (IaC) configurations across Chef, Ansible, and Puppet.

Model Details

Base Model: Salesforce/codet5p-220m (220M parameters)
Architecture: T5ForSequenceClassification
Task: Binary classification (secure vs vulnerable)
License: MIT

Performance

Technology	F1 Score
Ansible	0.884
Puppet	0.756
Chef	0.698
Combined	0.779
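
As a sanity check on the table, the Combined figure agrees to three decimals with the unweighted (macro) mean of the three per-technology scores. Whether it was actually computed that way is an assumption, not something this card states. The sketch below only reproduces the arithmetic and, for reference, shows the standard F1 definition applied to hypothetical confusion counts.

```python
# Per-technology F1 scores copied from the table above.
f1_by_technology = {"Ansible": 0.884, "Puppet": 0.756, "Chef": 0.698}

def macro_average(scores):
    """Unweighted mean of a collection of scores."""
    scores = list(scores)
    return sum(scores) / len(scores)

def f1_score(tp, fp, fn):
    """Standard F1: harmonic mean of precision and recall."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

combined = macro_average(f1_by_technology.values())
print(f"Macro-averaged F1: {combined:.3f}")  # prints 0.779

# Hypothetical counts, for illustration only (not from the evaluation).
print(f"Example F1: {f1_score(tp=8, fp=2, fn=2):.3f}")
```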

Usage

from transformers import T5ForSequenceClassification, RobertaTokenizer
import torch

model = T5ForSequenceClassification.from_pretrained("colemei/IntelliSA-220m")
tokenizer = RobertaTokenizer.from_pretrained("colemei/IntelliSA-220m")

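THRESHOLD = 0.61  # Minimum class-1 ("vulnerable") probability needed to flag a snippet

def predict_vulnerability(code_snippet):
    """Return (score, is_vulnerable) for a single IaC code snippet."""
    # Tokenize the snippet; anything beyond 512 tokens is truncated.
    inputs = tokenizer(code_snippet, return_tensors="pt", max_length=512,
                       truncation=True, padding=True)

    # Inference only, so gradient tracking is disabled.
    with torch.no_grad():
        outputs = model(**inputs)
        predictions = torch.nn.functional.softmax(outputs.logits, dim=-1)

    # Class index 1 is the "vulnerable" class; its probability is the score.
    score = predictions[0][1].item()
    is_vulnerable = score >= THRESHOLD
    return score, is_vulnerable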

# Example
code = """
cookbook_file '/tmp/file' do
  mode '0777'
end
"""
score, is_vulnerable = predict_vulnerability(code)
print(f"Vulnerability score: {score:.3f}, Vulnerable: {is_vulnerable}")
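
Because the model has two output classes, comparing the softmax probability of class 1 with THRESHOLD is equivalent to comparing the logit difference (class 1 minus class 0) with a fixed margin of log(0.61 / 0.39), roughly 0.447. The sketch below illustrates this in plain Python with made-up logits. The helper names are hypothetical, and it does not load the model.

```python
import math

THRESHOLD = 0.61  # Same value as in the usage snippet above

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def logit_margin(threshold):
    """Logit difference (class 1 minus class 0) at which p(class 1) equals threshold."""
    return math.log(threshold / (1.0 - threshold))

def is_vulnerable(logits, threshold=THRESHOLD):
    """Apply the probability threshold to a pair of [secure, vulnerable] logits."""
    return softmax(logits)[1] >= threshold

# Made-up logits for illustration; real values would come from outputs.logits.
for logits in ([0.0, 0.3], [0.0, 0.6]):
    prob = softmax(logits)[1]
    print(f"logits={logits} p(vulnerable)={prob:.3f} flagged={is_vulnerable(logits)}")

print(f"Equivalent logit margin: {logit_margin(THRESHOLD):.3f}")
```

In this toy run, a logit difference of 0.3 falls below the cutoff and a difference of 0.6 falls above it.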

Training Data

Training data is maintained in a separate repository for transparency and reusability:

Dataset Repository: colemei/IntelliSA-dataset
Training Configuration:
- Learning Rate: 4e-5, Batch Size: 8, Epochs: 6, Weight Decay: 0.01
- Framework: Transformers 4.45.2, PyTorch
- Training Data: 2,300 instances pseudo-labeled by Claude-4
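
To make the configuration concrete, the sketch below records the listed hyperparameters in a plain dictionary and estimates the number of optimizer steps. The key names follow Hugging Face TrainingArguments fields. Treating the batch size as per-device on a single device, with no gradient accumulation and all 2,300 instances used for training, is an assumption rather than something stated in this card.

```python
import math

# Hyperparameters as listed above; key names mirror TrainingArguments fields.
training_config = {
    "learning_rate": 4e-5,
    "per_device_train_batch_size": 8,  # assumption: single device
    "num_train_epochs": 6,
    "weight_decay": 0.01,
}

# Assumption: every pseudo-labeled instance is used for training.
NUM_TRAIN_INSTANCES = 2300

def optimizer_steps(num_examples, batch_size, epochs):
    """Steps per epoch and in total, keeping the last partial batch of each epoch."""
    steps_per_epoch = math.ceil(num_examples / batch_size)
    return steps_per_epoch, steps_per_epoch * epochs

per_epoch, total = optimizer_steps(
    NUM_TRAIN_INSTANCES,
    training_config["per_device_train_batch_size"],
    training_config["num_train_epochs"],
)
print(f"{per_epoch} steps per epoch, {total} steps in total")
```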

For complete dataset information including oracle ground truth and detailed statistics, see the dataset repository.

Citation

PLACEHOLDER

