drorrabin committed on
Commit
542d4d6
·
verified ·
1 Parent(s): b0d1351

Update README.md

Files changed (1)
  1. README.md +82 -15
README.md CHANGED
@@ -17,23 +17,72 @@ should probably proofread and complete it, then remove this comment. -->
17
  [<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/drorrabin/cvss_base_score/runs/2a8pl3jv)
18
  # cvss_base_score-2025-06-28_12.55.51
19
 
20
- This model is a fine-tuned version of [meta-llama/Meta-Llama-3.1-8B](https://huggingface.co/meta-llama/Meta-Llama-3.1-8B) on an unknown dataset.
21
 
22
- ## Model description
23
 
24
- More information needed
25
 
26
- ## Intended uses & limitations
27
 
28
- More information needed
29
 
30
- ## Training and evaluation data
31
 
32
- More information needed
33
 
34
- ## Training procedure
35
 
36
- ### Training hyperparameters
37
 
38
  The following hyperparameters were used during training:
39
  - learning_rate: 0.0001
@@ -45,10 +94,28 @@ The following hyperparameters were used during training:
45
  - lr_scheduler_warmup_ratio: 0.03
46
  - num_epochs: 1
47
 
48
- ### Framework versions
49
 
50
- - PEFT 0.15.2
51
- - Transformers 4.52.4
52
- - Pytorch 2.6.0+cu124
53
- - Datasets 3.6.0
54
- - Tokenizers 0.21.2
 
17
  [<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/drorrabin/cvss_base_score/runs/2a8pl3jv)
18
  # cvss_base_score-2025-06-28_12.55.51
19
 
20
+ # 🛡️ CVSS v3 Base Score Estimation Model
21
 
22
+ This model is a fine-tuned version of [meta-llama/Meta-Llama-3.1-8B](https://huggingface.co/meta-llama/Meta-Llama-3.1-8B) designed to **predict CVSS v3 base scores** based on vulnerability descriptions.
23
 
24
+ ---
25
+
26
+ ## 🔍 Model Details
27
+
28
+ - **Base Model:** Meta-Llama 3.1 8B (4-bit QLoRA fine-tuning)
29
+ - **Task:** Regression-style score prediction (0.0 to 10.0)
30
+ - **Output Format:** The model generates a numeric CVSS base score as part of its response
31
+ - **Quantization:** 4-bit using QLoRA for memory-efficient fine-tuning
32
+ - **LoRA Config:**
33
+   - `r = 32`
34
+   - `alpha = 64`
35
+   - `target_modules = ["q_proj", "v_proj", "k_proj", "o_proj"]`
36
+   - `dropout = 0.1`
37
+
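The LoRA settings listed above map onto PEFT's `LoraConfig` roughly as follows. This is a minimal sketch, not the author's training script: `bias`, `task_type`, and the variable name `lora_config` are assumptions that the README does not state.

```python
from peft import LoraConfig

# Values are taken from the "LoRA Config" list; anything marked "assumed"
# is not stated in the README.
lora_config = LoraConfig(
    r=32,                    # LoRA rank
    lora_alpha=64,           # with standard LoRA scaling, alpha / r = 2.0
    target_modules=["q_proj", "v_proj", "k_proj", "o_proj"],
    lora_dropout=0.1,
    bias="none",             # assumed
    task_type="CAUSAL_LM",   # assumed: causal language modeling
)
```

Only the attention projections are adapted here; the MLP layers of the base model stay frozen.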
38
+ ---
39
+
40
+ ## 📦 Intended Use
41
+
42
+ This model is intended to help security analysts, vulnerability management platforms, and automated tools **estimate the CVSS v3 base score** from a detailed vulnerability description.
43
+
44
+ ### Example Prompt
45
+
46
+ What is the CVSS v3 base score of the following vulnerability
47
+
48
+ CVE Description: admin/limits.php in Dolibarr 7.0.2 allows HTML injection, as demonstrated by the MAIN_MAX_DECIMALS_TOT parameter.
49
+
50
+ Weakness Type: CWE-79 (Improper Neutralization of Input During Web Page Generation ('Cross-site Scripting'))
51
 
52
+ Affected Product: dolibarr_erp/crm
53
 
54
+ Reported by: cve@mitre.org in 2022
55
 
56
+ The CVSS v3 base score is
57
+
58
+ ### -------------------------------------------------
59
+ The model is expected to output the score (e.g., `5.4`).
60
+ ### -------------------------------------------------
61
+
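A prompt in the layout shown above can be assembled from structured fields. The helper below is a hypothetical sketch: the function name `build_prompt` and its parameters are illustrative, and it assumes the fields are separated by blank lines as rendered in this README, since the exact whitespace used during training is not documented.

```python
def build_prompt(description, weakness, product, reporter, year):
    """Assemble a prompt in the layout of the README example.

    Assumption: fields are separated by blank lines; the exact
    whitespace used during training is not documented.
    """
    parts = [
        "What is the CVSS v3 base score of the following vulnerability",
        f"CVE Description: {description}",
        f"Weakness Type: {weakness}",
        f"Affected Product: {product}",
        f"Reported by: {reporter} in {year}",
        "The CVSS v3 base score is",
    ]
    return "\n\n".join(parts)


prompt = build_prompt(
    description=(
        "admin/limits.php in Dolibarr 7.0.2 allows HTML injection, "
        "as demonstrated by the MAIN_MAX_DECIMALS_TOT parameter."
    ),
    weakness=(
        "CWE-79 (Improper Neutralization of Input During Web Page "
        "Generation ('Cross-site Scripting'))"
    ),
    product="dolibarr_erp/crm",
    reporter="cve@mitre.org",
    year=2022,
)
print(prompt)
```

Keeping the final line as an open-ended statement lets the model complete it with the score.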
62
+
63
+
64
+
65
+ ### Framework versions
66
+
67
+ - PEFT 0.15.2
68
+ - Transformers 4.52.4
69
+ - Pytorch 2.6.0+cu124
70
+ - Datasets 3.6.0
71
+ - Tokenizers 0.21.2
72
 
 
73
 
 
74
 
75
+
76
+ ---
77
+
78
+ ## 📊 Training Details
79
+
80
+ - **Dataset:** Custom-built dataset of CVE descriptions paired with their CVSS v3 base scores
81
+ - **Training Framework:** Hugging Face Transformers, TRL's `SFTTrainer`, PEFT with QLoRA
82
+ - **Hardware:** Colab with 4-bit quantization for efficient resource usage
83
+
84
+
85
+ ### Training hyperparameters
86
 
87
  The following hyperparameters were used during training:
88
  - learning_rate: 0.0001
 
94
  - lr_scheduler_warmup_ratio: 0.03
95
  - num_epochs: 1
96
 
97
+ ---
98
+
99
+ ## ⚠️ Limitations
100
+
101
+ - The model **does not perform strict numerical regression** — it generates a number as text
102
+ - May produce invalid outputs if the prompt is incomplete or malformed
103
+ - Should not be relied upon as the sole authority for CVSS scoring — use as an assistive tool only
104
+
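Because the score arrives as generated text, callers may want to validate it before use. The sketch below is one possible approach, not part of the model repository: `parse_cvss_score` and `severity` are hypothetical helper names. The severity bands follow the CVSS v3 qualitative rating scale.

```python
import re


def parse_cvss_score(generated_text):
    """Return the first number in generated_text as a float, or None.

    Pass only the newly generated continuation, not the full prompt:
    the prompt itself contains digits (for example "v3") that would
    otherwise be picked up first.
    """
    match = re.search(r"\d+(?:\.\d+)?", generated_text)
    if match is None:
        return None
    score = float(match.group())
    # CVSS base scores are defined on the range 0.0 to 10.0.
    if not 0.0 <= score <= 10.0:
        return None
    return score


def severity(score):
    """Map a base score to its CVSS v3 qualitative severity rating."""
    if score == 0.0:
        return "None"
    if score < 4.0:
        return "Low"
    if score < 7.0:
        return "Medium"
    if score < 9.0:
        return "High"
    return "Critical"


score = parse_cvss_score(" 5.4\n")
if score is not None:
    print(score, severity(score))
```

Returning `None` for unparseable or out-of-range output lets the caller decide whether to retry or fall back to manual scoring.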
105
+ ---
106
+
107
+ ## ✅ How to Use
108
+
109
+ ```python
110
+ from transformers import AutoModelForCausalLM, AutoTokenizer
111
+
112
+ model_id = "your-hf-username/your-model-name"
113
+ tokenizer = AutoTokenizer.from_pretrained(model_id)
114
+ model = AutoModelForCausalLM.from_pretrained(model_id)
115
+
116
+ prompt = "What is the CVSS v3 base score of the following vulnerability\n\nCVE Description: Example vulnerability ...\nThe CVSS v3 base score is "
117
+
118
+ inputs = tokenizer(prompt, return_tensors="pt")
119
+ outputs = model.generate(**inputs, max_new_tokens=10)
120
+ print(tokenizer.decode(outputs[0], skip_special_tokens=True))
121
+ ```