---
library_name: peft
license: llama3.1
base_model: meta-llama/Meta-Llama-3.1-8B
tags:
- trl
- sft
- generated_from_trainer
model-index:
- name: cvss_base_score-2025-06-28_12.55.51
  results: []
datasets:
- drorrabin/cvss_base_score-data
language:
- en
---

[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/drorrabin/cvss_base_score/runs/2a8pl3jv)

# cvss_base_score-2025-06-28_12.55.51

# 🛡️ CVSS v3 Base Score Estimation Model

This model is a fine-tuned version of [meta-llama/Meta-Llama-3.1-8B](https://huggingface.co/meta-llama/Meta-Llama-3.1-8B) designed to **predict CVSS v3 base scores** based on vulnerability descriptions.

---

## 📌 Model Details

- **Base Model:** Meta-Llama 3.1 8B (4-bit QLoRA fine-tuning)
- **Task:** Regression-style score prediction (0.0 to 10.0)
- **Output Format:** The model generates a numeric CVSS base score as part of its response
- **Quantization:** 4-bit using QLoRA for memory-efficient fine-tuning
- **LoRA Config:**
  - `r = 32`
  - `alpha = 64`
  - `target_modules = ["q_proj", "v_proj", "k_proj", "o_proj"]`
  - `dropout = 0.1`
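
Putting the settings above together, a minimal QLoRA setup along these lines would reproduce the adapter configuration. This is a sketch, not the original training script; the NF4 quantization type and bfloat16 compute dtype are assumptions.

```python
# Sketch of the 4-bit QLoRA setup described above (not the original training code).
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # 4-bit base weights (QLoRA)
    bnb_4bit_quant_type="nf4",              # assumption: NF4, the usual QLoRA choice
    bnb_4bit_compute_dtype=torch.bfloat16,  # assumption: bf16 compute dtype
)

base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3.1-8B",
    quantization_config=bnb_config,
    device_map="auto",
)

lora_config = LoraConfig(
    r=32,
    lora_alpha=64,
    target_modules=["q_proj", "v_proj", "k_proj", "o_proj"],
    lora_dropout=0.1,
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, lora_config)
model.print_trainable_parameters()
```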

---

## 📦 Intended Use

This model is intended to assist security analysts, vulnerability management platforms, and automated tools in **estimating the CVSS v3 base score** of a vulnerability from a detailed description.

### Example Prompt

What is the CVSS v3 base score of the following vulnerability

CVE Description: admin/limits.php in Dolibarr 7.0.2 allows HTML injection, as demonstrated by the MAIN_MAX_DECIMALS_TOT parameter.

Weakness Type: CWE-79 (Improper Neutralization of Input During Web Page Generation ('Cross-site Scripting'))

Affected Product: dolibarr_erp/crm

Reported by: cve@mitre.org in 2022

The CVSS v3 base score is

### Expected Output

The model is expected to complete the prompt with the numeric score (e.g., `5.4`).
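
For programmatic use, a small helper along the following lines can assemble prompts in the format above and parse the score back out of the generated text. The function names and regex are illustrative, not part of this repository.

```python
import re

def build_prompt(description: str, weakness: str, product: str, reporter: str, year: int) -> str:
    # Mirrors the prompt format shown above; field labels are taken from the example.
    return (
        "What is the CVSS v3 base score of the following vulnerability\n\n"
        f"CVE Description: {description}\n\n"
        f"Weakness Type: {weakness}\n\n"
        f"Affected Product: {product}\n\n"
        f"Reported by: {reporter} in {year}\n\n"
        "The CVSS v3 base score is "
    )

def parse_score(generated_text: str) -> float | None:
    # Pull the first number that follows the final prompt sentence; None if absent.
    match = re.search(r"The CVSS v3 base score is\s*([0-9]+(?:\.[0-9]+)?)", generated_text)
    return float(match.group(1)) if match else None
```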

### Framework versions

- PEFT 0.15.2
- Transformers 4.52.4
- PyTorch 2.6.0+cu124
- Datasets 3.6.0
- Tokenizers 0.21.2

---

## 📊 Training Details

- **Dataset:** [drorrabin/cvss_base_score-data](https://huggingface.co/datasets/drorrabin/cvss_base_score-data), a curated dataset of CVE descriptions and their corresponding CVSS v3 base scores
- **Training Framework:** Hugging Face Transformers, TRL's `SFTTrainer`, and PEFT with QLoRA
- **Hardware:** Google Colab, with 4-bit quantization for efficient resource usage
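
The dataset listed in the card metadata can be loaded directly; a minimal sketch (the split name and column layout are assumptions):

```python
from datasets import load_dataset

# Dataset referenced in the card metadata; the split and column names are assumptions.
dataset = load_dataset("drorrabin/cvss_base_score-data", split="train")
print(dataset[0])
```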

#### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 0.0001
- train_batch_size: 4
- eval_batch_size: 1
- seed: 42
- optimizer: paged AdamW with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.03
- num_epochs: 1
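
These hyperparameters roughly correspond to a TRL `SFTConfig` like the sketch below. This is a reconstruction, not the original script; the exact paged-AdamW variant, output directory, and reporting backend are assumptions, and unlisted values such as sequence length are omitted.

```python
from trl import SFTConfig, SFTTrainer

# Rough reconstruction of the run configuration from the hyperparameters above.
training_args = SFTConfig(
    output_dir="cvss_base_score-2025-06-28_12.55.51",
    learning_rate=1e-4,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=1,
    seed=42,
    optim="paged_adamw_32bit",   # assumption: 32-bit paged AdamW variant
    lr_scheduler_type="cosine",
    warmup_ratio=0.03,
    num_train_epochs=1,
    report_to="wandb",           # assumption: the card links a Weights & Biases run
)

trainer = SFTTrainer(
    model=model,                 # the QLoRA-wrapped model from the Model Details sketch
    args=training_args,
    train_dataset=dataset,       # the dataset from the Training Details sketch
)
trainer.train()
```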

---

## ⚠️ Limitations

- The model **does not perform strict numerical regression**; it generates a number as text
- May produce invalid outputs if the prompt is incomplete or malformed
- Should not be relied upon as the sole authority for CVSS scoring; use it as an assistive tool only

---

## ✅ How to Use

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-hf-username/your-model-name"

tokenizer = AutoTokenizer.from_pretrained(model_id)
# This repository holds a LoRA adapter, so `peft` must be installed;
# Transformers then loads the base model and applies the adapter automatically.
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = (
    "What is the CVSS v3 base score of the following vulnerability\n\n"
    "CVE Description: Example vulnerability ...\n"
    "The CVSS v3 base score is "
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=10)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
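
Because this repo stores a PEFT (LoRA) adapter rather than full model weights (see `library_name: peft` in the card metadata), the adapter can also be loaded explicitly with PEFT. A minimal sketch, assuming the same placeholder repo id and bfloat16 inference:

```python
import torch
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

adapter_id = "your-hf-username/your-model-name"  # placeholder adapter repo id
tokenizer = AutoTokenizer.from_pretrained(adapter_id)
model = AutoPeftModelForCausalLM.from_pretrained(
    adapter_id,
    torch_dtype=torch.bfloat16,  # assumption: bf16 inference
    device_map="auto",
)
```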