---
library_name: peft
license: llama3.1
base_model: meta-llama/Meta-Llama-3.1-8B
tags:
- trl
- sft
- generated_from_trainer
model-index:
- name: cvss_base_score-2025-06-28_12.55.51
results: []
datasets:
- drorrabin/cvss_base_score-data
language:
- en
---
[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/drorrabin/cvss_base_score/runs/2a8pl3jv)
# cvss_base_score-2025-06-28_12.55.51
# πŸ›‘οΈ CVSS v3 Base Score Estimation Model
This model is a fine-tuned version of [meta-llama/Meta-Llama-3.1-8B](https://huggingface.co/meta-llama/Meta-Llama-3.1-8B) designed to **predict CVSS v3 base scores** based on vulnerability descriptions.
---
## πŸ” Model Details
- **Base Model:** Meta-Llama 3.1 8B (4-bit QLoRA fine-tuning)
- **Task:** Regression-style score prediction (0.0 to 10.0)
- **Output Format:** The model generates a numeric CVSS base score as part of its response
- **Quantization:** 4-bit using QLoRA for memory-efficient fine-tuning
- **LoRA Config:**
  - `r = 32`
  - `alpha = 64`
  - `target_modules = ["q_proj", "v_proj", "k_proj", "o_proj"]`
  - `dropout = 0.1`
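
To give a feel for what these LoRA settings imply, here is a minimal sketch that restates them under the parameter names used by PEFT's `LoraConfig` and estimates the adapter size. The layer shapes (`HIDDEN`, `KV_DIM`, `NUM_LAYERS`) are assumptions taken from the published Llama 3.1 8B architecture, not from this card, and the `alpha / r` scaling assumes standard (non rank-stabilized) LoRA. Under those assumptions the adapter has roughly 27 million trainable parameters.

```python
# LoRA settings from the list above, keyed by PEFT's LoraConfig parameter names.
lora_kwargs = {
    "r": 32,
    "lora_alpha": 64,
    "target_modules": ["q_proj", "v_proj", "k_proj", "o_proj"],
    "lora_dropout": 0.1,
}

# Assumed Llama 3.1 8B shapes (not stated in this card).
HIDDEN = 4096      # model hidden size
KV_DIM = 1024      # 8 key/value heads x 128 head dim (grouped-query attention)
NUM_LAYERS = 32    # number of decoder layers

# (in_features, out_features) of each targeted attention projection.
PROJ_SHAPES = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
}


def lora_param_count(kwargs):
    """Count adapter weights: each module adds A (r x in) and B (out x r)."""
    r = kwargs["r"]
    per_layer = sum(
        r * (PROJ_SHAPES[name][0] + PROJ_SHAPES[name][1])
        for name in kwargs["target_modules"]
    )
    return per_layer * NUM_LAYERS


# Standard LoRA multiplies the low-rank update by alpha / r.
scaling = lora_kwargs["lora_alpha"] / lora_kwargs["r"]
trainable = lora_param_count(lora_kwargs)
print(f"scaling={scaling}, trainable adapter parameters={trainable:,}")
```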
---
## 📦 Intended Use
This model is intended to help security analysts, vulnerability management platforms, and automated tools **estimate the CVSS v3 base score** from a detailed vulnerability description.
### Example Prompt
What is the CVSS v3 base score of the following vulnerability
CVE Description: admin/limits.php in Dolibarr 7.0.2 allows HTML injection, as demonstrated by the MAIN_MAX_DECIMALS_TOT parameter.
Weakness Type: CWE-79 (Improper Neutralization of Input During Web Page Generation ('Cross-site Scripting'))
Affected Product: dolibarr_erp/crm
Reported by: cve@mitre.org in 2022
The CVSS v3 base score is
### -------------------------------------------------
The model is expected to output the score (e.g., `5.4`).
### -------------------------------------------------
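
Prompts in this format can be assembled programmatically. The sketch below is one possible helper, not part of the released code: the function name `build_prompt`, its parameters, the treatment of the metadata fields as optional, and the exact newline placement (copied from the usage snippet in this card) are all assumptions.

```python
def build_prompt(description, weakness=None, product=None, reporter=None, year=None):
    """Assemble a prompt in the example format (hypothetical helper)."""
    lines = [
        "What is the CVSS v3 base score of the following vulnerability",
        "",  # blank line after the question, as in the usage snippet
        f"CVE Description: {description}",
    ]
    # Metadata fields are treated as optional here; that is an assumption.
    if weakness:
        lines.append(f"Weakness Type: {weakness}")
    if product:
        lines.append(f"Affected Product: {product}")
    if reporter and year:
        lines.append(f"Reported by: {reporter} in {year}")
    # The prompt ends with the lead-in that the model completes with a score.
    lines.append("The CVSS v3 base score is ")
    return "\n".join(lines)


example = build_prompt(
    "admin/limits.php in Dolibarr 7.0.2 allows HTML injection, "
    "as demonstrated by the MAIN_MAX_DECIMALS_TOT parameter.",
    weakness="CWE-79 (Improper Neutralization of Input During Web Page "
             "Generation ('Cross-site Scripting'))",
    product="dolibarr_erp/crm",
    reporter="cve@mitre.org",
    year=2022,
)
print(example)
```

If the model was trained with different whitespace between fields, the template should be adjusted to match the training data.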
### Framework versions
- PEFT 0.15.2
- Transformers 4.52.4
- Pytorch 2.6.0+cu124
- Datasets 3.6.0
- Tokenizers 0.21.2
---
## 📊 Training Details
- **Dataset:** `drorrabin/cvss_base_score-data`, a crafted dataset of CVE descriptions paired with their CVSS v3 base scores
- **Training Framework:** Hugging Face Transformers, TRL's `SFTTrainer`, PEFT with QLoRA
- **Hardware:** Colab with 4-bit quantization for efficient resource usage
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 4
- eval_batch_size: 1
- seed: 42
- optimizer: paged AdamW (`OptimizerNames.PAGED_ADAMW`) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.03
- num_epochs: 1
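
The combination of `learning_rate`, `lr_scheduler_type: cosine`, and `lr_scheduler_warmup_ratio` above describes a linear warmup followed by a cosine decay. The sketch below approximates that shape in plain Python; it is not the trainer's own scheduler code, and `TOTAL_STEPS` is a hypothetical value because the card does not report the number of optimizer steps.

```python
import math

BASE_LR = 1e-4        # learning_rate from the list above
WARMUP_RATIO = 0.03   # lr_scheduler_warmup_ratio from the list above
TOTAL_STEPS = 1000    # hypothetical; the real step count is not given


def lr_at(step, total_steps=TOTAL_STEPS):
    """Learning rate at a given optimizer step: linear warmup, then cosine decay."""
    warmup_steps = math.ceil(total_steps * WARMUP_RATIO)
    if step < warmup_steps:
        # Ramp linearly from 0 up to BASE_LR during warmup.
        return BASE_LR * step / max(1, warmup_steps)
    # Fraction of the post-warmup phase completed, from 0.0 to 1.0.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    # Half a cosine period: decays from BASE_LR down to 0.
    return BASE_LR * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


for step in (0, 15, 30, 515, 1000):
    print(step, f"{lr_at(step):.2e}")
```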
---
## ⚠️ Limitations
- The model **does not perform strict numerical regression**; it generates the score as text
- May produce invalid outputs if the prompt is incomplete or malformed
- Should not be relied upon as the sole authority for CVSS scoring; use it as an assistive tool only
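
Because the score arrives as free text and may be invalid, callers will likely want to validate it before use. The following sketch shows one way to do that; `parse_cvss_score` is a hypothetical helper, not part of the model's code. It takes the first number in the generated continuation and rejects anything outside the 0.0 to 10.0 range. Pass it only the text generated after the prompt, since the prompt itself contains digits such as version numbers.

```python
import re

# Matches an integer or a decimal number such as "5.4" or "10".
_NUMBER = re.compile(r"\d+(?:\.\d+)?")


def parse_cvss_score(generated_text):
    """Return the first number in the text as a float, or None if invalid."""
    match = _NUMBER.search(generated_text)
    if match is None:
        return None  # the model produced no number at all
    score = float(match.group())
    # CVSS base scores are defined on the range 0.0 to 10.0.
    return score if 0.0 <= score <= 10.0 else None


print(parse_cvss_score(" 5.4\n"))           # a well-formed continuation
print(parse_cvss_score("not available"))    # no number present
print(parse_cvss_score("42"))               # out of range
```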
---
## ✅ How to Use
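```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder: replace with this adapter's Hub repo id.
# Loading a PEFT adapter repo this way requires the `peft` package.
model_id = "your-hf-username/your-model-name"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = "What is the CVSS v3 base score of the following vulnerability\n\nCVE Description: Example vulnerability ...\nThe CVSS v3 base score is "
# Move the tokenized prompt to the same device as the model.
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=10)

# Decode only the newly generated tokens so the prompt is not echoed back.
new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))
```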