CIRCL/vulnerability-severity-classification-roberta-base

@@ -1,67 +1,50 @@
 ---
 library_name: transformers
-license: cc-by-4.0
 base_model: roberta-base
-metrics:
-- accuracy
 tags:
 - generated_from_trainer
-- text-classification
-- classification
-- nlp
-- vulnerability
 model-index:
 - name: vulnerability-severity-classification-roberta-base
   results: []
-datasets:
-- CIRCL/vulnerability-scores
 ---
-# VLAI: A RoBERTa-Based Model for Automated Vulnerability Severity Classification
-# Severity classification
-This model is a fine-tuned version of [roberta-base](https://huggingface.co/roberta-base) on the dataset [CIRCL/vulnerability-scores](https://huggingface.co/datasets/CIRCL/vulnerability-scores).
-The model was presented in the paper [VLAI: A RoBERTa-Based Model for Automated Vulnerability Severity Classification](https://huggingface.co/papers/2507.03607) [[arXiv](https://arxiv.org/abs/2507.03607)].
-**Abstract:** VLAI is a transformer-based model that predicts software vulnerability severity levels directly from text descriptions. Built on RoBERTa, VLAI is fine-tuned on over 600,000 real-world vulnerabilities and achieves over 82% accuracy in predicting severity categories, enabling faster and more consistent triage ahead of manual CVSS scoring. The model and dataset are open-source and integrated into the Vulnerability-Lookup service.
-You can read [this page](https://www.vulnerability-lookup.org/user-manual/ai/) for more information.
 ## Model description
-It is a classification model and is aimed to assist in classifying vulnerabilities by severity based on their descriptions.
-## How to get started with the model
-```python
-from transformers import AutoModelForSequenceClassification, AutoTokenizer
-import torch
-labels = ["low", "medium", "high", "critical"]
-model_name = "CIRCL/vulnerability-severity-classification-roberta-base"
-tokenizer = AutoTokenizer.from_pretrained(model_name)
-model = AutoModelForSequenceClassification.from_pretrained(model_name)
-model.eval()
-test_description = "SAP NetWeaver Visual Composer Metadata Uploader is not protected with a proper authorization, allowing unauthenticated agent to upload potentially malicious executable binaries \
-that could severely harm the host system. This could significantly affect the confidentiality, integrity, and availability of the targeted system."
-inputs = tokenizer(test_description, return_tensors="pt", truncation=True, padding=True)
-# Run inference
-with torch.no_grad():
-    outputs = model(**inputs)
-    predictions = torch.nn.functional.softmax(outputs.logits, dim=-1)
-# Print results
-print("Predictions:", predictions)
-predicted_class = torch.argmax(predictions, dim=-1).item()
-print("Predicted severity:", labels[predicted_class])
-```
 ## Training procedure
@@ -76,37 +59,20 @@ The following hyperparameters were used during training:
 - lr_scheduler_type: linear
 - num_epochs: 5
-It achieves the following results on the evaluation set:
-- Loss: 2.0132
-- Accuracy: 0.8191
-- F1 Macro: 0.7488
-- Low Precision: 0.6601
-- Low Recall: 0.5006
-- Low F1: 0.5694
-- Medium Precision: 0.8440
-- Medium Recall: 0.8767
-- Medium F1: 0.8601
-- High Precision: 0.8195
-- High Recall: 0.8112
-- High F1: 0.8153
-- Critical Precision: 0.7618
-- Critical Recall: 0.7392
-- Critical F1: 0.7503
 ### Training results
 | Training Loss | Epoch | Step  | Validation Loss | Accuracy | F1 Macro | Low Precision | Low Recall | Low F1 | Medium Precision | Medium Recall | Medium F1 | High Precision | High Recall | High F1 | Critical Precision | Critical Recall | Critical F1 |
 |:-------------:|:-----:|:-----:|:---------------:|:--------:|:--------:|:-------------:|:----------:|:------:|:----------------:|:-------------:|:---------:|:--------------:|:-----------:|:-------:|:------------------:|:---------------:|:-----------:|
-| 2.3936        | 1.0   | 16180 | 2.5423          | 0.7404   | 0.6271   | 0.6925        | 0.2372     | 0.3534 | 0.7777           | 0.8359        | 0.8057    | 0.7237         | 0.7233      | 0.7235  | 0.6416             | 0.6110          | 0.6259      |
-| 2.5847        | 2.0   | 32360 | 2.2926          | 0.7674   | 0.6790   | 0.6162        | 0.3880     | 0.4762 | 0.7899           | 0.8604        | 0.8237    | 0.7640         | 0.7458      | 0.7548  | 0.7115             | 0.6175          | 0.6612      |
-| 2.0935        | 3.0   | 48540 | 2.1257          | 0.7920   | 0.7086   | 0.6727        | 0.4017     | 0.5030 | 0.8166           | 0.8670        | 0.8411    | 0.7907         | 0.7774      | 0.7840  | 0.7206             | 0.6927          | 0.7064      |
-| 1.4077        | 4.0   | 64720 | 2.0427          | 0.8080   | 0.7367   | 0.5928        | 0.5203     | 0.5542 | 0.8334           | 0.8691        | 0.8509    | 0.8127         | 0.7952      | 0.8038  | 0.7583             | 0.7185          | 0.7379      |
-| 1.0097        | 5.0   | 80900 | 2.0132          | 0.8191   | 0.7488   | 0.6601        | 0.5006     | 0.5694 | 0.8440           | 0.8767        | 0.8601    | 0.8195         | 0.8112      | 0.8153  | 0.7618             | 0.7392          | 0.7503      |
 ### Framework versions
-- Transformers 5.6.2
 - Pytorch 2.11.0+cu130
 - Datasets 4.8.5
 - Tokenizers 0.22.2

 ---
 library_name: transformers
+license: mit
 base_model: roberta-base
 tags:
 - generated_from_trainer
+metrics:
+- accuracy
 model-index:
 - name: vulnerability-severity-classification-roberta-base
   results: []
 ---
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+# vulnerability-severity-classification-roberta-base
+This model is a fine-tuned version of [roberta-base](https://huggingface.co/roberta-base) on an unknown dataset.
+It achieves the following results on the evaluation set:
+- Loss: 2.0179
+- Accuracy: 0.8203
+- F1 Macro: 0.7483
+- Low Precision: 0.6598
+- Low Recall: 0.4907
+- Low F1: 0.5629
+- Medium Precision: 0.8438
+- Medium Recall: 0.8822
+- Medium F1: 0.8626
+- High Precision: 0.8198
+- High Recall: 0.8047
+- High F1: 0.8122
+- Critical Precision: 0.7674
+- Critical Recall: 0.7439
+- Critical F1: 0.7554
 ## Model description
+More information needed
+## Intended uses & limitations
+More information needed
+## Training and evaluation data
+More information needed
 ## Training procedure
 - lr_scheduler_type: linear
 - num_epochs: 5
 ### Training results
 | Training Loss | Epoch | Step  | Validation Loss | Accuracy | F1 Macro | Low Precision | Low Recall | Low F1 | Medium Precision | Medium Recall | Medium F1 | High Precision | High Recall | High F1 | Critical Precision | Critical Recall | Critical F1 |
 |:-------------:|:-----:|:-----:|:---------------:|:--------:|:--------:|:-------------:|:----------:|:------:|:----------------:|:-------------:|:---------:|:--------------:|:-----------:|:-------:|:------------------:|:---------------:|:-----------:|
+| 2.6338        | 1.0   | 16220 | 2.5259          | 0.7429   | 0.6380   | 0.6452        | 0.2938     | 0.4037 | 0.7868           | 0.8384        | 0.8118    | 0.7182         | 0.7249      | 0.7215  | 0.6455             | 0.5872          | 0.6150      |
+| 2.3481        | 2.0   | 32440 | 2.2993          | 0.7686   | 0.6867   | 0.5788        | 0.4162     | 0.4842 | 0.8058           | 0.8473        | 0.8261    | 0.7796         | 0.7221      | 0.7498  | 0.6507             | 0.7276          | 0.6870      |
+| 1.9554        | 3.0   | 48660 | 2.1519          | 0.7943   | 0.7158   | 0.6368        | 0.4363     | 0.5178 | 0.8375           | 0.8490        | 0.8432    | 0.7789         | 0.7897      | 0.7843  | 0.7126             | 0.7229          | 0.7177      |
+| 1.7953        | 4.0   | 64880 | 2.0104          | 0.8098   | 0.7311   | 0.7150        | 0.4207     | 0.5297 | 0.8262           | 0.8871        | 0.8556    | 0.8173         | 0.7780      | 0.7972  | 0.7406             | 0.7430          | 0.7418      |
+| 1.2463        | 5.0   | 81100 | 2.0179          | 0.8203   | 0.7483   | 0.6598        | 0.4907     | 0.5629 | 0.8438           | 0.8822        | 0.8626    | 0.8198         | 0.8047      | 0.8122  | 0.7674             | 0.7439          | 0.7554      |
 ### Framework versions
+- Transformers 5.7.0
 - Pytorch 2.11.0+cu130
 - Datasets 4.8.5
 - Tokenizers 0.22.2
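
Review note: the per-class figures in the new evaluation block can be cross-checked against each other. The sketch below assumes that each class F1 is the harmonic mean of that class's precision and recall, and that "F1 Macro" is the unweighted mean of the four class F1 scores. The card does not state either definition, so both are assumptions here, and the variable names are illustrative. Under those assumptions the final-epoch numbers agree to within rounding of the four-decimal inputs.

```python
# Per-class (precision, recall, F1) copied from the final-epoch row of the new README.
reported = {
    "low": (0.6598, 0.4907, 0.5629),
    "medium": (0.8438, 0.8822, 0.8626),
    "high": (0.8198, 0.8047, 0.8122),
    "critical": (0.7674, 0.7439, 0.7554),
}
reported_macro_f1 = 0.7483


def f1(precision, recall):
    """Harmonic mean of precision and recall (assumed definition of per-class F1)."""
    return 2 * precision * recall / (precision + recall)


# Recompute each class F1 from its reported precision and recall.
recomputed = {name: f1(p, r) for name, (p, r, _) in reported.items()}

# Assumed definition of macro F1: unweighted mean of the reported class F1 scores.
macro_f1 = sum(f for _, _, f in reported.values()) / len(reported)

for name, (_, _, f) in reported.items():
    print(f"{name:9s} reported F1={f:.4f} recomputed={recomputed[name]:.4f}")
print(f"macro F1 from class F1 scores: {macro_f1:.4f} (reported {reported_macro_f1})")
```

The low recall on the "low" class (0.4907) is what pulls macro F1 well below accuracy, which is worth keeping in mind when comparing the two commits on accuracy alone.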

config.json

@@ -34,7 +34,7 @@
   "pad_token_id": 1,
   "problem_type": "single_label_classification",
   "tie_word_embeddings": true,
-  "transformers_version": "5.6.2",
   "type_vocab_size": 1,
   "use_cache": true,
   "vocab_size": 50265

   "pad_token_id": 1,
   "problem_type": "single_label_classification",
   "tie_word_embeddings": true,
+  "transformers_version": "5.7.0",
   "type_vocab_size": 1,
   "use_cache": true,
   "vocab_size": 50265

model.safetensors

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:8456ce885cec0af5ecdb77b8fcde82fe54354fc66e98935f9db4611770aa5b22
 size 498618976

 version https://git-lfs.github.com/spec/v1
+oid sha256:941ee2bc68da4303786a8c8c96f1332c9f859d47e3386a64a86f7a335974050a
 size 498618976
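
Review note: only the `oid` changes in the LFS pointer while `size` stays at 498618976, which is what one would expect from retrained weights with an unchanged architecture. A downloaded copy of the weights can be checked against the new pointer with a short script. This is a minimal sketch: the helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical, and the local file path in the usage comment is an assumption.

```python
import hashlib
from pathlib import Path


def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer file into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    return fields


def matches_pointer(path, pointer_text, chunk_size=1 << 20):
    """Return True if the file's size and SHA-256 digest match the pointer."""
    fields = parse_lfs_pointer(pointer_text)
    algo, _, expected_hex = fields["oid"].partition(":")
    if algo != "sha256":
        raise ValueError(f"unsupported oid algorithm: {algo}")
    path = Path(path)
    # Cheap check first: a size mismatch means the digest cannot match.
    if path.stat().st_size != int(fields["size"]):
        return False
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        # Hash in chunks so a ~500 MB file is never loaded into memory at once.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_hex


# Pointer contents from the new side of this diff.
NEW_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:941ee2bc68da4303786a8c8c96f1332c9f859d47e3386a64a86f7a335974050a
size 498618976
"""

# Usage (assumes the weights were downloaded to ./model.safetensors):
#   matches_pointer("model.safetensors", NEW_POINTER)
```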