cedricbonhomme committed · Commit be23667 · verified · 1 Parent(s): d36e9bd

End of training

Files changed (3):
1. README.md +25 -43
2. emissions.csv +1 -1
3. model.safetensors +1 -1
README.md CHANGED
@@ -1,55 +1,37 @@
-
 ---
-base_model: hfl/chinese-macbert-base
-datasets:
-- CIRCL/Vulnerability-CNVD
 library_name: transformers
 license: apache-2.0
-metrics:
-- accuracy
+base_model: hfl/chinese-macbert-base
 tags:
 - generated_from_trainer
-- text-classification
-- classification
-- nlp
-- chinese
-- vulnerability
-pipeline_tag: text-classification
-language: zh
+metrics:
+- accuracy
 model-index:
 - name: vulnerability-severity-classification-chinese-macbert-base
   results: []
 ---

-# VLAI: A RoBERTa-Based Model for Automated Vulnerability Severity Classification (Chinese Text)
-
-This model is a fine-tuned version of [hfl/chinese-macbert-base](https://huggingface.co/hfl/chinese-macbert-base) on the dataset [CIRCL/Vulnerability-CNVD](https://huggingface.co/datasets/CIRCL/Vulnerability-CNVD).
-
-For more information, visit the [Vulnerability-Lookup project page](https://vulnerability.circl.lu) or the [ML-Gateway GitHub repository](https://github.com/vulnerability-lookup/ML-Gateway), which demonstrates its usage in a FastAPI server.
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->

+# vulnerability-severity-classification-chinese-macbert-base
+
+This model is a fine-tuned version of [hfl/chinese-macbert-base](https://huggingface.co/hfl/chinese-macbert-base) on an unknown dataset.
 It achieves the following results on the evaluation set:
-
-- Loss: 0.6172
-- Accuracy: 0.7817
+- Loss: 0.6282
+- Accuracy: 0.7798

-## How to use
+## Model description

-You can use this model directly with the Hugging Face `transformers` library for text classification:
+More information needed

-```python
-from transformers import pipeline
+## Intended uses & limitations

-classifier = pipeline(
-    "text-classification",
-    model="CIRCL/vulnerability-severity-classification-chinese-macbert-base"
-)
+More information needed

-# Example usage for a Chinese vulnerability description
-description_chinese = "TOTOLINK A3600R是中国吉翁电子(TOTOLINK)公司的一款6天线1200M无线路由器。TOTOLINK A3600R存在缓冲区溢出漏洞,该漏洞源于/cgi-bin/cstecgi.cgi文件的UploadCustomModule函数中的File参数未能正确验证输入数据的长度大小,攻击者可利用该漏洞在系统上执行任意代码或者导致拒绝服务。"
-result_chinese = classifier(description_chinese)
-print(result_chinese)
-# Expected output example: [{'label': 'High', 'score': 0.9644894003868103}]
-```
+## Training and evaluation data
+
+More information needed

 ## Training procedure

@@ -60,7 +42,7 @@ The following hyperparameters were used during training:
 - train_batch_size: 32
 - eval_batch_size: 32
 - seed: 42
-- optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
+- optimizer: Use OptimizerNames.ADAMW_TORCH_FUSED with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: linear
 - num_epochs: 5

@@ -68,16 +50,16 @@ The following hyperparameters were used during training:

 | Training Loss | Epoch | Step | Validation Loss | Accuracy |
 |:-------------:|:-----:|:-----:|:---------------:|:--------:|
-| 0.6329 | 1.0 | 3412 | 0.5832 | 0.7546 |
-| 0.5215 | 2.0 | 6824 | 0.5531 | 0.7750 |
-| 0.4827 | 3.0 | 10236 | 0.5521 | 0.7768 |
-| 0.3448 | 4.0 | 13648 | 0.5822 | 0.7814 |
-| 0.3865 | 5.0 | 17060 | 0.6172 | 0.7817 |
+| 0.5427 | 1.0 | 3447 | 0.6015 | 0.7505 |
+| 0.5167 | 2.0 | 6894 | 0.5665 | 0.7747 |
+| 0.365 | 3.0 | 10341 | 0.5643 | 0.7846 |
+| 0.3289 | 4.0 | 13788 | 0.5923 | 0.7777 |
+| 0.3408 | 5.0 | 17235 | 0.6282 | 0.7798 |


 ### Framework versions

-- Transformers 4.51.3
-- Pytorch 2.7.1+cu126
-- Datasets 3.6.0
-- Tokenizers 0.21.1
+- Transformers 4.56.1
+- Pytorch 2.8.0+cu128
+- Datasets 4.0.0
+- Tokenizers 0.22.0
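Note that the auto-generated card replaces the hand-written one, so the "How to use" section survives only on the removed side of the hunk above. Loading the model is unchanged; a minimal sketch reusing the repo id and pipeline call from the removed section (the retrained weights will produce different scores than the old example output):

```python
from transformers import pipeline

# Repo id as it appears in the removed "How to use" section above.
classifier = pipeline(
    "text-classification",
    model="CIRCL/vulnerability-severity-classification-chinese-macbert-base",
)

# A Chinese vulnerability description (abridged from the removed example);
# the classifier returns a severity label with a confidence score.
description = "TOTOLINK A3600R存在缓冲区溢出漏洞,攻击者可利用该漏洞在系统上执行任意代码或者导致拒绝服务。"
print(classifier(description))
# Output shape per the old card: [{'label': 'High', 'score': 0.96...}];
# exact scores differ for the retrained weights.
```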
emissions.csv CHANGED
@@ -1,2 +1,2 @@
 timestamp,project_name,run_id,experiment_id,duration,emissions,emissions_rate,cpu_power,gpu_power,ram_power,cpu_energy,gpu_energy,ram_energy,energy_consumed,country_name,country_iso_code,region,cloud_provider,cloud_region,os,python_version,codecarbon_version,cpu_count,cpu_model,gpu_count,gpu_model,longitude,latitude,ram_total_size,tracking_mode,on_cloud,pue
-2025-07-16T08:11:24,codecarbon,fdc7d841-f907-4035-9903-95cdd678e97f,5b0fa12a-3dd7-45bb-9766-cc326314d9f1,5853.097029601224,0.1108801330521463,1.894383990072016e-05,42.5,397.9241498104493,94.34468364715576,0.06905461438023941,0.8310216200945035,0.15328660956491294,1.0533628440396559,Luxembourg,LUX,luxembourg,,,Linux-6.8.0-60-generic-x86_64-with-glibc2.39,3.12.3,2.8.4,64,AMD EPYC 9124 16-Core Processor,2,2 x NVIDIA L40S,6.1294,49.6113,251.58582305908203,machine,N,1.0
+2025-09-23T07:11:38,codecarbon,1f17e608-4fa9-4987-90dd-91fd7affc58e,5b0fa12a-3dd7-45bb-9766-cc326314d9f1,3959.345899205655,0.07726378551046184,1.951428025673708e-05,42.5,400.0415744241799,94.34468507766725,0.04671087107853902,0.5836095666095105,0.10368662567193301,0.7340070633599824,Luxembourg,LUX,luxembourg,,,Linux-6.8.0-71-generic-x86_64-with-glibc2.39,3.12.3,2.8.4,64,AMD EPYC 9124 16-Core Processor,2,2 x NVIDIA L40S,6.1294,49.6113,251.5858268737793,machine,N,1.0
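Both rows follow codecarbon's CSV schema (the codecarbon_version column reads 2.8.4), and the figures are internally consistent: for the new run, emissions_rate = emissions / duration = 0.07726 kg / 3959.3 s ≈ 1.951e-05 kg/s, as recorded. A minimal sketch of how such a row is typically produced, assuming the fine-tuning run is wrapped in codecarbon's `EmissionsTracker` (the `train` function is a placeholder):

```python
from codecarbon import EmissionsTracker

# Appends a row to emissions.csv in output_dir when the tracker stops;
# project_name matches the value seen in the CSV rows above.
tracker = EmissionsTracker(project_name="codecarbon", output_dir=".")
tracker.start()
try:
    train()  # placeholder for the actual fine-tuning loop
finally:
    emissions_kg = tracker.stop()  # returns total emissions in kg CO2eq
    print(f"{emissions_kg:.4f} kg CO2eq")
```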
model.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:09bdd0c4b2f1aa5da816128034c1769e76cf0451ef930d3a9af704cd04b6716c
+oid sha256:9b5b737ef2520b7b509dd4d553e95d2652ac77c8600701104c281c6f2a708fe7
 size 409103316
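The pointer swap shows that only the weights changed: the oid (the SHA-256 digest of the file tracked by Git LFS) is new, while the size stays at 409103316 bytes, consistent with retraining the same architecture. A minimal sketch for verifying a downloaded copy against the pointer, assuming `model.safetensors` is the local path:

```python
import hashlib

# Digest taken from the new LFS pointer above.
EXPECTED_OID = "9b5b737ef2520b7b509dd4d553e95d2652ac77c8600701104c281c6f2a708fe7"

# Hash the file in 1 MiB chunks to avoid loading ~409 MB into memory at once.
h = hashlib.sha256()
with open("model.safetensors", "rb") as f:
    for chunk in iter(lambda: f.read(1 << 20), b""):
        h.update(chunk)
assert h.hexdigest() == EXPECTED_OID, "checksum mismatch"
```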