---
library_name: peft
license: llama3.1
base_model: meta-llama/Meta-Llama-3.1-8B
tags:
- trl
- sft
- generated_from_trainer
model-index:
- name: cvss_base_score-2025-06-28_12.55.51
  results: []
datasets:
- drorrabin/cvss_base_score-data
language:
- en
---

[Visualize in Weights & Biases](https://wandb.ai/drorrabin/cvss_base_score/runs/2a8pl3jv)

# cvss_base_score-2025-06-28_12.55.51

# 🛡️ CVSS v3 Base Score Estimation Model

This model is a fine-tuned version of [meta-llama/Meta-Llama-3.1-8B](https://huggingface.co/meta-llama/Meta-Llama-3.1-8B) designed to **predict CVSS v3 base scores** from vulnerability descriptions.

---

## 🔍 Model Details

- **Base Model:** Meta-Llama 3.1 8B (4-bit QLoRA fine-tuning)
- **Task:** Regression-style score prediction (0.0 to 10.0)
- **Output Format:** The model generates a numeric CVSS base score as part of its response
- **Quantization:** 4-bit via QLoRA for memory-efficient fine-tuning
- **LoRA Config:**
  - `r = 32`
  - `alpha = 64`
  - `target_modules = ["q_proj", "v_proj", "k_proj", "o_proj"]`
  - `dropout = 0.1`

---

## 📦 Intended Use

This model is intended to help security analysts, vulnerability management platforms, and automated tools **estimate the CVSS v3 base score** of a vulnerability from its detailed description.

### Example Prompt

    What is the CVSS v3 base score of the following vulnerability
    CVE Description: admin/limits.php in Dolibarr 7.0.2 allows HTML injection, as demonstrated by the MAIN_MAX_DECIMALS_TOT parameter.
    Weakness Type: CWE-79 (Improper Neutralization of Input During Web Page Generation ('Cross-site Scripting'))
    Affected Product: dolibarr_erp/crm
    Reported by: cve@mitre.org in 2022
    The CVSS v3 base score is

The model is expected to output the score (e.g., `5.4`).
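For programmatic use, the fields above can be assembled into a prompt string. This is a minimal sketch; the exact template is inferred from the example prompt, and the `build_prompt` helper is hypothetical rather than part of this repository:

```python
def build_prompt(description: str, weakness: str, product: str,
                 reporter: str, year: int) -> str:
    """Assemble a scoring prompt in the field layout shown above.

    The template is an assumption based on the example prompt; verify it
    against the training data before relying on it.
    """
    return (
        "What is the CVSS v3 base score of the following vulnerability\n"
        f"CVE Description: {description}\n"
        f"Weakness Type: {weakness}\n"
        f"Affected Product: {product}\n"
        f"Reported by: {reporter} in {year}\n"
        "The CVSS v3 base score is "
    )

prompt = build_prompt(
    description="admin/limits.php in Dolibarr 7.0.2 allows HTML injection, "
                "as demonstrated by the MAIN_MAX_DECIMALS_TOT parameter.",
    weakness="CWE-79 (Improper Neutralization of Input During Web Page "
             "Generation ('Cross-site Scripting'))",
    product="dolibarr_erp/crm",
    reporter="cve@mitre.org",
    year=2022,
)
print(prompt)
```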
---

### Framework versions

- PEFT 0.15.2
- Transformers 4.52.4
- Pytorch 2.6.0+cu124
- Datasets 3.6.0
- Tokenizers 0.21.2

---

## 📊 Training Details

- **Dataset:** Curated dataset of CVE descriptions paired with their CVSS v3 base scores
- **Training Framework:** Hugging Face Transformers, TRL's `SFTTrainer`, and PEFT with QLoRA
- **Hardware:** Colab, with 4-bit quantization for efficient resource usage

#### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 0.0001
- train_batch_size: 4
- eval_batch_size: 1
- seed: 42
- optimizer: paged AdamW (`OptimizerNames.PAGED_ADAMW`) with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.03
- num_epochs: 1

---

## ⚠️ Limitations

- The model **does not perform strict numerical regression**; it generates a number as text
- It may produce invalid outputs if the prompt is incomplete or malformed
- It should not be relied upon as the sole authority for CVSS scoring; use it as an assistive tool only

---

## ✅ How to Use

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-hf-username/your-model-name"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = (
    "What is the CVSS v3 base score of the following vulnerability\n\n"
    "CVE Description: Example vulnerability ...\n"
    "The CVSS v3 base score is "
)

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=10)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
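
# The decoded text echoes the prompt followed by the generated score, so a
# small helper can pull the numeric value out of the completion.
# NOTE: extract_score is an illustrative sketch (an assumption, not part of
# the model card); it assumes the completion follows the
# "The CVSS v3 base score is X.Y" template shown above.
import re

def extract_score(text: str):
    """Return the first number after the prompt suffix, or None if it is missing or out of range."""
    match = re.search(r"The CVSS v3 base score is\s*([0-9]+(?:\.[0-9]+)?)", text)
    if match is None:
        return None
    score = float(match.group(1))
    return score if 0.0 <= score <= 10.0 else None

print(extract_score("... The CVSS v3 base score is 5.4"))  # 5.4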