distilbert-base-uncased-logline-v3
This model is a fine-tuned version of distilbert-base-uncased on the AIT Log Data Set V2.0 [1] (https://zenodo.org/records/5789064). It achieves the following results on the evaluation set:
- Loss: 0.0022
- Accuracy: 0.9995
- F1: 0.9994
Model description
This model performs text classification on log files for network intrusion detection. The Python package that runs this model is available at https://github.com/Isaacwilliam4/INSyT. As listed on the dataset's page, the model was trained on the following logs: Apache access and error logs, authentication logs, DNS logs, VPN logs, audit logs, Suricata logs, network traffic packet captures, Horde logs, Exim logs, syslog, and system monitoring logs.
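For readers who want to try the model without the INSyT package, the following is a minimal sketch that loads it through the generic Transformers `pipeline` API. The repository ID `isaacwilliam4/insyt` is taken from this page's model tree and is assumed to be the Hub location of the weights. The helper names `build_classifier` and `classify_lines` are hypothetical and are not part of INSyT. Depending on the model configuration, the returned label may be a label name or a generic `LABEL_<n>` string.

```python
from typing import Dict, List

# Assumption: Hub repository ID as shown in this page's model tree.
MODEL_ID = "isaacwilliam4/insyt"


def build_classifier(model_id: str = MODEL_ID):
    """Load a text-classification pipeline (downloads weights on first use)."""
    # Imported lazily so this module can be loaded without transformers installed.
    from transformers import pipeline

    return pipeline("text-classification", model=model_id)


def classify_lines(classifier, lines: List[str]) -> List[Dict[str, object]]:
    """Classify each log line and pair it with its top label and score."""
    # truncation=True clips lines that exceed the model's maximum input length.
    predictions = classifier(lines, truncation=True)
    return [
        {"line": line, "label": pred["label"], "score": pred["score"]}
        for line, pred in zip(lines, predictions)
    ]
```

A typical call would be `classify_lines(build_classifier(), ["<one log line>"])`, which returns one dictionary per input line.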
Labels
| Label | Label Name |
|---|---|
| 0 | attacker:dnsteal:dnsteal-dropped |
| 1 | attacker:dnsteal:dnsteal-received |
| 2 | attacker:dnsteal:exfiltration-service |
| 3 | attacker_change_user:escalate |
| 4 | attacker_change_user:escalate:escalated_command:escalated_sudo_command |
| 5 | attacker_http:dirb:foothold |
| 6 | attacker_http:foothold:service_scan |
| 7 | attacker_http:foothold:webshell_cmd |
| 8 | attacker_http:foothold:webshell_upload |
| 9 | attacker_http:foothold:wpscan |
| 10 | attacker_vpn:escalate |
| 11 | attacker_vpn:foothold |
| 12 | benign |
| 13 | crack_passwords:escalate |
| 14 | dirb:foothold |
| 15 | dns_scan:foothold |
| 16 | escalate:escalated_command:escalated_sudo_command |
| 17 | escalate:escalated_command:escalated_sudo_command:escalated_sudo_session |
| 18 | escalate:webshell_cmd |
| 19 | foothold:network_scan |
| 20 | foothold:service_scan |
| 21 | foothold:traceroute |
| 22 | foothold:wpscan |
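
The table above can be turned into a small lookup for post-processing predictions. This is a sketch only: the helper `describe_label` is hypothetical, splitting names on `:` into tags is an assumption about the naming scheme, and treating every label other than `benign` as attack-related is an interpretation of the table rather than something the source states.

```python
# Label names copied from the table above; the list index is the label ID.
LABEL_NAMES = [
    "attacker:dnsteal:dnsteal-dropped",
    "attacker:dnsteal:dnsteal-received",
    "attacker:dnsteal:exfiltration-service",
    "attacker_change_user:escalate",
    "attacker_change_user:escalate:escalated_command:escalated_sudo_command",
    "attacker_http:dirb:foothold",
    "attacker_http:foothold:service_scan",
    "attacker_http:foothold:webshell_cmd",
    "attacker_http:foothold:webshell_upload",
    "attacker_http:foothold:wpscan",
    "attacker_vpn:escalate",
    "attacker_vpn:foothold",
    "benign",
    "crack_passwords:escalate",
    "dirb:foothold",
    "dns_scan:foothold",
    "escalate:escalated_command:escalated_sudo_command",
    "escalate:escalated_command:escalated_sudo_command:escalated_sudo_session",
    "escalate:webshell_cmd",
    "foothold:network_scan",
    "foothold:service_scan",
    "foothold:traceroute",
    "foothold:wpscan",
]


def describe_label(label_id: int) -> dict:
    """Return the name, colon-separated tags, and attack flag for a label ID."""
    name = LABEL_NAMES[label_id]
    return {
        "id": label_id,
        "name": name,
        "tags": name.split(":"),         # assumption: ':' separates tags
        "is_attack": name != "benign",   # assumption: non-benign means attack
    }
```

For example, `describe_label(7)` yields the tags `attacker_http`, `foothold`, and `webshell_cmd`.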
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 32
- eval_batch_size: 32
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 3
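To illustrate what the linear scheduler implies for the learning rate, here is a small sketch of linear decay from 2e-05 to zero. It uses the 18822 total steps reported in the training results table and assumes zero warmup steps, since the hyperparameter list does not mention warmup. The function `linear_lr` is a hypothetical helper, not code from the training run.

```python
def linear_lr(step: int,
              base_lr: float = 2e-5,
              total_steps: int = 18822,
              warmup_steps: int = 0) -> float:
    """Learning rate at a given optimizer step under linear warmup and decay."""
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Linear decay from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    # Learning rate at the end of each epoch (6274 steps per epoch).
    for step in (0, 6274, 12548, 18822):
        print(step, linear_lr(step))
```

Under these assumptions the learning rate falls to two thirds of its initial value after the first epoch, one third after the second, and zero at the final step.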
Training results
| Training Loss | Epoch | Step | Validation Loss | Accuracy | F1 |
|---|---|---|---|---|---|
| 0.0435 | 1.0 | 6274 | 0.0120 | 0.9965 | 0.9965 |
| 0.0059 | 2.0 | 12548 | 0.0032 | 0.9993 | 0.9992 |
| 0.0023 | 3.0 | 18822 | 0.0022 | 0.9995 | 0.9994 |
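
The step counts in this table give a rough bound on the training set size. The sketch below assumes a single device, no gradient accumulation, and that the last partial batch of each epoch is kept; none of these are stated in the source, so the result is an estimate rather than a reported figure. The helper `train_size_bounds` is hypothetical.

```python
def train_size_bounds(steps_per_epoch: int, batch_size: int) -> tuple:
    """Smallest and largest sample counts that fit the given steps per epoch."""
    # The final batch holds between 1 and batch_size samples.
    lower = (steps_per_epoch - 1) * batch_size + 1
    upper = steps_per_epoch * batch_size
    return lower, upper


# 6274 steps per epoch with train_batch_size 32.
print(train_size_bounds(6274, 32))  # (200737, 200768)
```

Under those assumptions the training split holds roughly 200,000 log lines.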
Test results
| Test Loss | Test Accuracy | Test F1 |
|---|---|---|
| 0.0020 | 0.9994 | 0.9994 |
Five-Fold Cross-Validation Mean Test Confusion Matrix
Framework versions
- Transformers 4.38.2
- Pytorch 2.0.0+cu117
- Datasets 2.18.0
- Tokenizers 0.15.1
Citations
[1] M. Landauer, F. Skopik, M. Frank, W. Hotwagner, M. Wurzenberger, and A. Rauber, “AIT Log Data Set V2.0”. Zenodo, Feb. 24, 2022. doi: 10.5281/zenodo.5789064.
Model tree for isaacwilliam4/insyt
Base model
distilbert/distilbert-base-uncased