---
library_name: peft
license: llama3.1
base_model: meta-llama/Meta-Llama-3.1-8B
tags:
- trl
- sft
- generated_from_trainer
model-index:
- name: cvss_base_score-2025-06-28_12.55.51
  results: []
datasets:
- drorrabin/cvss_base_score-data
language:
- en
---


[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/drorrabin/cvss_base_score/runs/2a8pl3jv)
# cvss_base_score-2025-06-28_12.55.51

# 🛡️ CVSS v3 Base Score Estimation Model

This model is a fine-tuned version of [meta-llama/Meta-Llama-3.1-8B](https://huggingface.co/meta-llama/Meta-Llama-3.1-8B) designed to **predict CVSS v3 base scores** based on vulnerability descriptions.

---

## 🔍 Model Details

- **Base Model:** Meta-Llama 3.1 8B (4-bit QLoRA fine-tuning)  
- **Task:** Regression-style score prediction (0.0 to 10.0)  
- **Output Format:** The model generates a numeric CVSS base score as part of its response  
- **Quantization:** 4-bit using QLoRA for memory-efficient fine-tuning  
- **LoRA Config:**  
  - `r = 32`  
  - `alpha = 64`  
  - `target_modules = ["q_proj", "v_proj", "k_proj", "o_proj"]`  
  - `dropout = 0.1`  
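Expressed with the `peft` library, the adapter settings above correspond roughly to the following sketch (field names follow `peft.LoraConfig`; the `bias` and `task_type` values are assumptions, since the card only lists the four parameters above):

```python
from peft import LoraConfig

# Sketch of the adapter configuration listed above.
# bias="none" and task_type="CAUSAL_LM" are assumed, not stated in the card.
lora_config = LoraConfig(
    r=32,
    lora_alpha=64,
    target_modules=["q_proj", "v_proj", "k_proj", "o_proj"],
    lora_dropout=0.1,
    bias="none",
    task_type="CAUSAL_LM",
)
```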

---

## 📦 Intended Use

This model is intended to assist security analysts, vulnerability management platforms, and automated tooling in **estimating the CVSS v3 base score** of a detailed vulnerability description.
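Downstream tools usually also want the qualitative severity rating that goes with a numeric score. The mapping below follows the qualitative severity rating scale from the CVSS v3.1 specification (the function name is illustrative, not part of this model):

```python
def cvss_severity(score: float) -> str:
    """Map a CVSS v3.x base score to its qualitative severity rating,
    per the rating scale in the CVSS v3.1 specification."""
    if not 0.0 <= score <= 10.0:
        raise ValueError(f"CVSS base score must be in [0.0, 10.0], got {score}")
    if score == 0.0:
        return "None"
    if score <= 3.9:
        return "Low"
    if score <= 6.9:
        return "Medium"
    if score <= 8.9:
        return "High"
    return "Critical"

print(cvss_severity(5.4))  # prints Medium
```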

### Example Prompt

What is the CVSS v3 base score of the following vulnerability

CVE Description: admin/limits.php in Dolibarr 7.0.2 allows HTML injection, as demonstrated by the MAIN_MAX_DECIMALS_TOT parameter.

Weakness Type: CWE-79 (Improper Neutralization of Input During Web Page Generation ('Cross-site Scripting'))

Affected Product: dolibarr_erp/crm

Reported by: cve@mitre.org in 2022

The CVSS v3 base score is

The model is expected to output the score (e.g., `5.4`).
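In practice you will want to build prompts in exactly this shape and pull the numeric score back out of the generated text. A minimal sketch, assuming the template above (the helper names are illustrative):

```python
import re

# Template mirroring the example prompt shown above.
PROMPT_TEMPLATE = (
    "What is the CVSS v3 base score of the following vulnerability\n\n"
    "CVE Description: {description}\n\n"
    "Weakness Type: {weakness}\n\n"
    "Affected Product: {product}\n\n"
    "Reported by: {reporter}\n\n"
    "The CVSS v3 base score is"
)

def build_prompt(description: str, weakness: str, product: str, reporter: str) -> str:
    return PROMPT_TEMPLATE.format(
        description=description, weakness=weakness,
        product=product, reporter=reporter,
    )

def parse_score(generated: str, prompt: str):
    """Extract the first number the model wrote after the prompt.
    Returns None if no value in [0.0, 10.0] is found."""
    completion = generated[len(prompt):] if generated.startswith(prompt) else generated
    match = re.search(r"\d{1,2}(?:\.\d)?", completion)
    if match is None:
        return None
    score = float(match.group())
    return score if 0.0 <= score <= 10.0 else None

print(parse_score("... base score is 5.4", "... base score is"))  # prints 5.4
```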




### Framework versions

- PEFT 0.15.2
- Transformers 4.52.4
- Pytorch 2.6.0+cu124
- Datasets 3.6.0
- Tokenizers 0.21.2




---

## 📊 Training Details

- **Dataset:** [drorrabin/cvss_base_score-data](https://huggingface.co/datasets/drorrabin/cvss_base_score-data), CVE descriptions paired with their CVSS v3 base scores  
- **Training Framework:** Hugging Face Transformers, TRL's `SFTTrainer`, PEFT with QLoRA  
- **Hardware:** Google Colab, with 4-bit quantization for efficient GPU memory usage


#### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 4
- eval_batch_size: 1
- seed: 42
- optimizer: paged AdamW (betas=(0.9, 0.999), epsilon=1e-08, no additional optimizer arguments)
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.03
- num_epochs: 1
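With TRL, these settings map roughly onto an `SFTConfig` like the following (a sketch: `SFTConfig` inherits these fields from `transformers.TrainingArguments`, and the `optim` string names one paged-AdamW variant, since the log above does not say which was used):

```python
from trl import SFTConfig

# Sketch of the training configuration from the hyperparameters above.
# "paged_adamw_32bit" is an assumption about the exact paged-AdamW variant.
training_args = SFTConfig(
    output_dir="cvss_base_score-2025-06-28_12.55.51",
    learning_rate=1e-4,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=1,
    seed=42,
    optim="paged_adamw_32bit",
    lr_scheduler_type="cosine",
    warmup_ratio=0.03,
    num_train_epochs=1,
)
```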

---

## ⚠️ Limitations

- The model **does not perform strict numerical regression**: it generates a number as text  
- It may produce invalid outputs if the prompt is incomplete or malformed  
- It should not be relied upon as the sole authority for CVSS scoring; use it as an assistive tool only  

---

## ✅ How to Use

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-hf-username/your-model-name"
tokenizer = AutoTokenizer.from_pretrained(model_id)
# The repository holds a PEFT (QLoRA) adapter; with `peft` installed,
# `from_pretrained` loads the base model and applies the adapter on top.
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = "What is the CVSS v3 base score of the following vulnerability\n\nCVE Description: Example vulnerability ...\nThe CVSS v3 base score is "

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=10)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```