CIRCL/vulnerability-severity-classification-roberta-base

@@ -1,67 +1,50 @@
 ---
 library_name: transformers
-license: cc-by-4.0
 base_model: roberta-base
-metrics:
-- accuracy
 tags:
 - generated_from_trainer
-- text-classification
-- classification
-- nlp
-- vulnerability
 model-index:
 - name: vulnerability-severity-classification-roberta-base
   results: []
-datasets:
-- CIRCL/vulnerability-scores
 ---
-# VLAI: A RoBERTa-Based Model for Automated Vulnerability Severity Classification
-# Severity classification
-This model is a fine-tuned version of [roberta-base](https://huggingface.co/roberta-base) on the dataset [CIRCL/vulnerability-scores](https://huggingface.co/datasets/CIRCL/vulnerability-scores).
-The model was presented in the paper [VLAI: A RoBERTa-Based Model for Automated Vulnerability Severity Classification](https://huggingface.co/papers/2507.03607) [[arXiv](https://arxiv.org/abs/2507.03607)].
-**Abstract:** VLAI is a transformer-based model that predicts software vulnerability severity levels directly from text descriptions. Built on RoBERTa, VLAI is fine-tuned on over 600,000 real-world vulnerabilities and achieves over 82% accuracy in predicting severity categories, enabling faster and more consistent triage ahead of manual CVSS scoring. The model and dataset are open-source and integrated into the Vulnerability-Lookup service.
-You can read [this page](https://www.vulnerability-lookup.org/user-manual/ai/) for more information.
 ## Model description
-It is a classification model and is aimed to assist in classifying vulnerabilities by severity based on their descriptions.
-## How to get started with the model
-```python
-from transformers import AutoModelForSequenceClassification, AutoTokenizer
-import torch
-labels = ["low", "medium", "high", "critical"]
-model_name = "CIRCL/vulnerability-severity-classification-roberta-base"
-tokenizer = AutoTokenizer.from_pretrained(model_name)
-model = AutoModelForSequenceClassification.from_pretrained(model_name)
-model.eval()
-test_description = "SAP NetWeaver Visual Composer Metadata Uploader is not protected with a proper authorization, allowing unauthenticated agent to upload potentially malicious executable binaries \
-that could severely harm the host system. This could significantly affect the confidentiality, integrity, and availability of the targeted system."
-inputs = tokenizer(test_description, return_tensors="pt", truncation=True, padding=True)
-# Run inference
-with torch.no_grad():
-    outputs = model(**inputs)
-    predictions = torch.nn.functional.softmax(outputs.logits, dim=-1)
-# Print results
-print("Predictions:", predictions)
-predicted_class = torch.argmax(predictions, dim=-1).item()
-print("Predicted severity:", labels[predicted_class])
-```
 ## Training procedure
@@ -76,37 +59,20 @@ The following hyperparameters were used during training:
 - lr_scheduler_type: linear
 - num_epochs: 5
-It achieves the following results on the evaluation set:
-- Loss: 2.0179
-- Accuracy: 0.8203
-- F1 Macro: 0.7483
-- Low Precision: 0.6598
-- Low Recall: 0.4907
-- Low F1: 0.5629
-- Medium Precision: 0.8438
-- Medium Recall: 0.8822
-- Medium F1: 0.8626
-- High Precision: 0.8198
-- High Recall: 0.8047
-- High F1: 0.8122
-- Critical Precision: 0.7674
-- Critical Recall: 0.7439
-- Critical F1: 0.7554
 ### Training results
 | Training Loss | Epoch | Step  | Validation Loss | Accuracy | F1 Macro | Low Precision | Low Recall | Low F1 | Medium Precision | Medium Recall | Medium F1 | High Precision | High Recall | High F1 | Critical Precision | Critical Recall | Critical F1 |
 |:-------------:|:-----:|:-----:|:---------------:|:--------:|:--------:|:-------------:|:----------:|:------:|:----------------:|:-------------:|:---------:|:--------------:|:-----------:|:-------:|:------------------:|:---------------:|:-----------:|
-| 2.6338        | 1.0   | 16220 | 2.5259          | 0.7429   | 0.6380   | 0.6452        | 0.2938     | 0.4037 | 0.7868           | 0.8384        | 0.8118    | 0.7182         | 0.7249      | 0.7215  | 0.6455             | 0.5872          | 0.6150      |
-| 2.3481        | 2.0   | 32440 | 2.2993          | 0.7686   | 0.6867   | 0.5788        | 0.4162     | 0.4842 | 0.8058           | 0.8473        | 0.8261    | 0.7796         | 0.7221      | 0.7498  | 0.6507             | 0.7276          | 0.6870      |
-| 1.9554        | 3.0   | 48660 | 2.1519          | 0.7943   | 0.7158   | 0.6368        | 0.4363     | 0.5178 | 0.8375           | 0.8490        | 0.8432    | 0.7789         | 0.7897      | 0.7843  | 0.7126             | 0.7229          | 0.7177      |
-| 1.7953        | 4.0   | 64880 | 2.0104          | 0.8098   | 0.7311   | 0.7150        | 0.4207     | 0.5297 | 0.8262           | 0.8871        | 0.8556    | 0.8173         | 0.7780      | 0.7972  | 0.7406             | 0.7430          | 0.7418      |
-| 1.2463        | 5.0   | 81100 | 2.0179          | 0.8203   | 0.7483   | 0.6598        | 0.4907     | 0.5629 | 0.8438           | 0.8822        | 0.8626    | 0.8198         | 0.8047      | 0.8122  | 0.7674             | 0.7439          | 0.7554      |
 ### Framework versions
-- Transformers 5.7.0
 - Pytorch 2.11.0+cu130
 - Datasets 4.8.5
 - Tokenizers 0.22.2

 ---
 library_name: transformers
+license: mit
 base_model: roberta-base
 tags:
 - generated_from_trainer
+metrics:
+- accuracy
 model-index:
 - name: vulnerability-severity-classification-roberta-base
   results: []
 ---
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+# vulnerability-severity-classification-roberta-base
+This model is a fine-tuned version of [roberta-base](https://huggingface.co/roberta-base) on an unknown dataset.
+It achieves the following results on the evaluation set:
+- Loss: 1.9916
+- Accuracy: 0.8193
+- F1 Macro: 0.7498
+- Low Precision: 0.6797
+- Low Recall: 0.4889
+- Low F1: 0.5687
+- Medium Precision: 0.8483
+- Medium Recall: 0.8715
+- Medium F1: 0.8597
+- High Precision: 0.8133
+- High Recall: 0.8151
+- High F1: 0.8142
+- Critical Precision: 0.7600
+- Critical Recall: 0.7530
+- Critical F1: 0.7565
 ## Model description
+More information needed
+## Intended uses & limitations
+More information needed
+## Training and evaluation data
+More information needed
 ## Training procedure
 - lr_scheduler_type: linear
 - num_epochs: 5
 ### Training results
 | Training Loss | Epoch | Step  | Validation Loss | Accuracy | F1 Macro | Low Precision | Low Recall | Low F1 | Medium Precision | Medium Recall | Medium F1 | High Precision | High Recall | High F1 | Critical Precision | Critical Recall | Critical F1 |
 |:-------------:|:-----:|:-----:|:---------------:|:--------:|:--------:|:-------------:|:----------:|:------:|:----------------:|:-------------:|:---------:|:--------------:|:-----------:|:-------:|:------------------:|:---------------:|:-----------:|
+| 2.7154        | 1.0   | 16297 | 2.5179          | 0.7391   | 0.6425   | 0.6191        | 0.3258     | 0.4269 | 0.8206           | 0.7797        | 0.7996    | 0.6765         | 0.7982      | 0.7323  | 0.6778             | 0.5567          | 0.6113      |
+| 2.3960        | 2.0   | 32594 | 2.2502          | 0.7715   | 0.6976   | 0.5951        | 0.4652     | 0.5222 | 0.8261           | 0.8211        | 0.8236    | 0.7427         | 0.7808      | 0.7612  | 0.7020             | 0.6658          | 0.6834      |
+| 2.0492        | 3.0   | 48891 | 2.0960          | 0.7937   | 0.7124   | 0.6940        | 0.4025     | 0.5095 | 0.8109           | 0.8757        | 0.8420    | 0.7940         | 0.7700      | 0.7818  | 0.7395             | 0.6945          | 0.7163      |
+| 1.9126        | 4.0   | 65188 | 1.9977          | 0.8095   | 0.7388   | 0.6468        | 0.4862     | 0.5551 | 0.8441           | 0.8622        | 0.8530    | 0.8055         | 0.7994      | 0.8024  | 0.7330             | 0.7563          | 0.7445      |
+| 1.3893        | 5.0   | 81485 | 1.9916          | 0.8193   | 0.7498   | 0.6797        | 0.4889     | 0.5687 | 0.8483           | 0.8715        | 0.8597    | 0.8133         | 0.8151      | 0.8142  | 0.7600             | 0.7530          | 0.7565      |
 ### Framework versions
+- Transformers 5.8.0
 - Pytorch 2.11.0+cu130
 - Datasets 4.8.5
 - Tokenizers 0.22.2
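
The final-epoch figures in the updated card can be sanity-checked with a little arithmetic. The sketch below uses only the values reported above. It recomputes each per-class F1 as the harmonic mean of precision and recall, and the macro F1 as the unweighted mean of the four per-class F1 scores. That the Trainer's metric function derived them this way is an assumption; the card does not say. Because the published precision and recall are rounded to four decimals, the comparison uses a small tolerance, and the helper name `f1_score` is purely illustrative.

```python
# Final-epoch (epoch 5) metrics copied from the updated model card:
# class -> (precision, recall, reported F1)
reported = {
    "low": (0.6797, 0.4889, 0.5687),
    "medium": (0.8483, 0.8715, 0.8597),
    "high": (0.8133, 0.8151, 0.8142),
    "critical": (0.7600, 0.7530, 0.7565),
}
reported_f1_macro = 0.7498
steps_per_epoch, num_epochs, final_step = 16297, 5, 81485


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (illustrative helper)."""
    return 2 * precision * recall / (precision + recall)


# Recompute per-class F1 from the rounded precision/recall values.
recomputed_f1 = {name: f1_score(p, r) for name, (p, r, _) in reported.items()}

# Macro F1 assumed to be the unweighted mean of the reported per-class F1 scores.
recomputed_macro = sum(f1 for _, _, f1 in reported.values()) / len(reported)

TOLERANCE = 2e-4  # allows for the four-decimal rounding in the card

for name, (_, _, f1) in reported.items():
    status = "ok" if abs(recomputed_f1[name] - f1) < TOLERANCE else "MISMATCH"
    print(f"{name:>8}: reported={f1:.4f} recomputed={recomputed_f1[name]:.4f} {status}")

print(f"macro F1: reported={reported_f1_macro:.4f} recomputed={recomputed_macro:.4f}")
print("step count consistent:", steps_per_epoch * num_epochs == final_step)
```

The same check applies to any other row of the training results table by swapping in that row's values.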

config.json CHANGED

@@ -34,7 +34,7 @@
   "pad_token_id": 1,
   "problem_type": "single_label_classification",
   "tie_word_embeddings": true,
-  "transformers_version": "5.7.0",
   "type_vocab_size": 1,
   "use_cache": true,
   "vocab_size": 50265

   "pad_token_id": 1,
   "problem_type": "single_label_classification",
   "tie_word_embeddings": true,
+  "transformers_version": "5.8.0",
   "type_vocab_size": 1,
   "use_cache": true,
   "vocab_size": 50265

model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:941ee2bc68da4303786a8c8c96f1332c9f859d47e3386a64a86f7a335974050a
 size 498618976

 version https://git-lfs.github.com/spec/v1
+oid sha256:2429fb28017f9386587bba241fcddb9e3a3a6b57f91d287f7a29cd73540b3efb
 size 498618976
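
The `model.safetensors` entry is a Git LFS pointer: the `oid` is the SHA-256 digest of the real file and `size` is its length in bytes. A minimal sketch of checking a locally downloaded copy against the new pointer follows. The helper names `parse_lfs_pointer` and `matches_pointer` and the local path in the usage note are hypothetical, not part of any library.

```python
import hashlib
from pathlib import Path

# New pointer contents, copied from the diff above.
NEW_POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:2429fb28017f9386587bba241fcddb9e3a3a6b57f91d287f7a29cd73540b3efb
size 498618976
"""


def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer file into its hash algorithm, digest, and size."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {"algo": algo, "digest": digest, "size": int(fields["size"])}


def matches_pointer(path, pointer: dict) -> bool:
    """Return True if the file at `path` has the size and digest the pointer records."""
    path = Path(path)
    if path.stat().st_size != pointer["size"]:
        return False  # cheap check first; avoids hashing a file of the wrong size
    hasher = hashlib.new(pointer["algo"])
    with path.open("rb") as handle:
        # Hash in 1 MiB chunks so a ~500 MB file is never held in memory at once.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


pointer = parse_lfs_pointer(NEW_POINTER)
print(pointer["algo"], pointer["size"])
```

Assuming the weights were saved to a local file named `model.safetensors`, `matches_pointer("model.safetensors", pointer)` would report whether that copy corresponds to this commit. The size is unchanged between the two pointers, so only the digest distinguishes the old weights from the new ones.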