---
library_name: peft
license: mit
base_model: microsoft/phi-2
tags:
- base_model:adapter:microsoft/phi-2
- lora
- transformers
metrics:
- accuracy
model-index:
- name: phi2-lora-malicious-classifier
results: []
---
# phi2-lora-malicious-classifier
This model is a LoRA adapter fine-tuned from [microsoft/phi-2](https://huggingface.co/microsoft/phi-2) on an unknown dataset to classify prompts into three classes: `jailbreaking`, `prompt injection`, and `unharmful`.
It achieves the following results on the evaluation set:
- Loss: 0.4627
- Accuracy: 0.8476
- Precision Weighted: 0.8428
- Recall Weighted: 0.8476
- F1 Weighted: 0.8440
- Mcc: 0.7515
- Balanced Accuracy: 0.7994
- Per class (rates rounded to four decimals):

| Class            |  TP |  FP |  FN |   TN |    FNR |    FPR | Specificity |
|:-----------------|----:|----:|----:|-----:|-------:|-------:|------------:|
| jailbreaking     | 259 |  97 | 130 | 1443 | 0.3342 | 0.0630 |      0.9370 |
| prompt injection | 434 |  97 | 136 | 1262 | 0.2386 | 0.0714 |      0.9286 |
| unharmful        | 942 | 100 |  28 |  859 | 0.0289 | 0.1043 |      0.8957 |
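The aggregate metrics follow directly from the per-class confusion counts. As a sanity check, here is a minimal sketch that recomputes the reported accuracy, balanced accuracy, and per-class rates from the one-vs-rest definitions, using only the counts listed above:

```python
# Per-class confusion counts copied from the evaluation results above.
counts = {
    "jailbreaking":     {"TP": 259, "FP": 97,  "FN": 130, "TN": 1443},
    "prompt injection": {"TP": 434, "FP": 97,  "FN": 136, "TN": 1262},
    "unharmful":        {"TP": 942, "FP": 100, "FN": 28,  "TN": 859},
}

def rates(c):
    """Standard one-vs-rest rates from TP/FP/FN/TN."""
    return {
        "FNR": c["FN"] / (c["FN"] + c["TP"]),          # miss rate
        "FPR": c["FP"] / (c["FP"] + c["TN"]),          # false-alarm rate
        "Specificity": c["TN"] / (c["TN"] + c["FP"]),  # true-negative rate
        "Recall": c["TP"] / (c["TP"] + c["FN"]),       # sensitivity
    }

per_class = {name: rates(c) for name, c in counts.items()}

# In a single-label multi-class setup, each class's TP+FP+FN+TN equals the
# total sample count, and the TPs sum to the number of correct predictions.
n_samples = sum(counts["jailbreaking"][k] for k in ("TP", "FP", "FN", "TN"))
accuracy = sum(c["TP"] for c in counts.values()) / n_samples

# Balanced accuracy is the unweighted mean of per-class recall.
balanced_accuracy = sum(r["Recall"] for r in per_class.values()) / len(per_class)

print(round(accuracy, 4))                          # 0.8476
print(round(balanced_accuracy, 4))                 # 0.7994
print(round(per_class["jailbreaking"]["FNR"], 4))  # 0.3342
```

Running this reproduces the headline numbers, confirming the counts and rates are internally consistent.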
## Model description
This is a LoRA (PEFT) adapter on top of [microsoft/phi-2](https://huggingface.co/microsoft/phi-2) for three-way prompt classification: `jailbreaking`, `prompt injection`, and `unharmful`. No further details about the model were provided.
## Intended uses & limitations
More information needed. Note from the evaluation results that the false-negative rate on the `jailbreaking` class is ~33% (roughly one in three jailbreak prompts is missed), so this adapter should not be relied on as a sole safety filter.
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: adamw_torch (betas=(0.9, 0.999), epsilon=1e-08, no additional optimizer arguments)
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 10
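The hyperparameters above map onto a `transformers.TrainingArguments` configuration roughly as follows. This is a sketch, not the original training script: the output directory name is a placeholder, and the argument names assume a recent `transformers` release.

```python
from transformers import TrainingArguments

# Sketch of the configuration implied by the hyperparameters above.
# "phi2-lora-malicious-classifier" as output_dir is an assumption.
args = TrainingArguments(
    output_dir="phi2-lora-malicious-classifier",
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_ratio=0.1,
    num_train_epochs=10,
)
```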
### Training results
| Training Loss | Epoch | Step | Validation Loss | Accuracy | Precision Weighted | Recall Weighted | F1 Weighted | Mcc | Balanced Accuracy | Per Class |
|:-------------:|:-----:|:-----:|:---------------:|:--------:|:------------------:|:---------------:|:-----------:|:------:|:-----------------:|:-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------:|
| 0.7931 | 1.0 | 1107 | 0.7796 | 0.6910 | 0.6758 | 0.6910 | 0.6750 | 0.4860 | 0.6086 | {'jailbreaking': {'TP': 148, 'FP': 141, 'FN': 241, 'TN': 1399, 'FNR': 0.6195372750642674, 'FPR': 0.09155844155844156, 'Specificity': 0.9084415584415585}, 'prompt injection': {'TP': 309, 'FP': 145, 'FN': 261, 'TN': 1214, 'FNR': 0.45789473684210524, 'FPR': 0.10669610007358352, 'Specificity': 0.8933038999264165}, 'unharmful': {'TP': 876, 'FP': 310, 'FN': 94, 'TN': 649, 'FNR': 0.09690721649484536, 'FPR': 0.3232533889468196, 'Specificity': 0.6767466110531803}} |
| 0.5884 | 2.0 | 2214 | 0.5420 | 0.8015 | 0.7916 | 0.8015 | 0.7899 | 0.6742 | 0.7269 | {'jailbreaking': {'TP': 188, 'FP': 94, 'FN': 201, 'TN': 1446, 'FNR': 0.5167095115681234, 'FPR': 0.06103896103896104, 'Specificity': 0.938961038961039}, 'prompt injection': {'TP': 411, 'FP': 97, 'FN': 159, 'TN': 1262, 'FNR': 0.2789473684210526, 'FPR': 0.07137601177336277, 'Specificity': 0.9286239882266373}, 'unharmful': {'TP': 947, 'FP': 192, 'FN': 23, 'TN': 767, 'FNR': 0.023711340206185566, 'FPR': 0.20020855057351408, 'Specificity': 0.799791449426486}} |
| 0.462 | 3.0 | 3321 | 0.5065 | 0.8186 | 0.8101 | 0.8186 | 0.8101 | 0.7026 | 0.7531 | {'jailbreaking': {'TP': 208, 'FP': 90, 'FN': 181, 'TN': 1450, 'FNR': 0.4652956298200514, 'FPR': 0.05844155844155844, 'Specificity': 0.9415584415584416}, 'prompt injection': {'TP': 430, 'FP': 103, 'FN': 140, 'TN': 1256, 'FNR': 0.24561403508771928, 'FPR': 0.07579102281089035, 'Specificity': 0.9242089771891097}, 'unharmful': {'TP': 941, 'FP': 157, 'FN': 29, 'TN': 802, 'FNR': 0.029896907216494847, 'FPR': 0.16371220020855057, 'Specificity': 0.8362877997914494}} |
| 0.4606 | 4.0 | 4428 | 0.4729 | 0.8305 | 0.8248 | 0.8305 | 0.8267 | 0.7233 | 0.7784 | {'jailbreaking': {'TP': 242, 'FP': 105, 'FN': 147, 'TN': 1435, 'FNR': 0.37789203084832906, 'FPR': 0.06818181818181818, 'Specificity': 0.9318181818181818}, 'prompt injection': {'TP': 430, 'FP': 119, 'FN': 140, 'TN': 1240, 'FNR': 0.24561403508771928, 'FPR': 0.0875643855776306, 'Specificity': 0.9124356144223694}, 'unharmful': {'TP': 930, 'FP': 103, 'FN': 40, 'TN': 856, 'FNR': 0.041237113402061855, 'FPR': 0.10740354535974973, 'Specificity': 0.8925964546402503}} |
| 0.4217 | 5.0 | 5535 | 0.4688 | 0.8351 | 0.8295 | 0.8351 | 0.8311 | 0.7309 | 0.7831 | {'jailbreaking': {'TP': 245, 'FP': 104, 'FN': 144, 'TN': 1436, 'FNR': 0.37017994858611825, 'FPR': 0.06753246753246753, 'Specificity': 0.9324675324675324}, 'prompt injection': {'TP': 430, 'FP': 106, 'FN': 140, 'TN': 1253, 'FNR': 0.24561403508771928, 'FPR': 0.07799852832965416, 'Specificity': 0.9220014716703459}, 'unharmful': {'TP': 936, 'FP': 108, 'FN': 34, 'TN': 851, 'FNR': 0.03505154639175258, 'FPR': 0.11261730969760167, 'Specificity': 0.8873826903023984}} |
| 0.445 | 6.0 | 6642 | 0.4465 | 0.8434 | 0.8390 | 0.8434 | 0.8402 | 0.7449 | 0.7953 | {'jailbreaking': {'TP': 259, 'FP': 107, 'FN': 130, 'TN': 1433, 'FNR': 0.3341902313624679, 'FPR': 0.06948051948051948, 'Specificity': 0.9305194805194805}, 'prompt injection': {'TP': 428, 'FP': 98, 'FN': 142, 'TN': 1261, 'FNR': 0.24912280701754386, 'FPR': 0.07211184694628403, 'Specificity': 0.927888153053716}, 'unharmful': {'TP': 940, 'FP': 97, 'FN': 30, 'TN': 862, 'FNR': 0.030927835051546393, 'FPR': 0.10114702815432743, 'Specificity': 0.8988529718456726}} |
| 0.3633 | 7.0 | 7749 | 0.4604 | 0.8460 | 0.8409 | 0.8460 | 0.8422 | 0.7487 | 0.7968 | {'jailbreaking': {'TP': 253, 'FP': 90, 'FN': 136, 'TN': 1450, 'FNR': 0.3496143958868895, 'FPR': 0.05844155844155844, 'Specificity': 0.9415584415584416}, 'prompt injection': {'TP': 440, 'FP': 106, 'FN': 130, 'TN': 1253, 'FNR': 0.22807017543859648, 'FPR': 0.07799852832965416, 'Specificity': 0.9220014716703459}, 'unharmful': {'TP': 939, 'FP': 101, 'FN': 31, 'TN': 858, 'FNR': 0.031958762886597936, 'FPR': 0.10531803962460896, 'Specificity': 0.894681960375391}} |
| 0.3249 | 8.0 | 8856 | 0.4670 | 0.8450 | 0.8408 | 0.8450 | 0.8417 | 0.7475 | 0.7976 | {'jailbreaking': {'TP': 262, 'FP': 106, 'FN': 127, 'TN': 1434, 'FNR': 0.3264781491002571, 'FPR': 0.06883116883116883, 'Specificity': 0.9311688311688312}, 'prompt injection': {'TP': 427, 'FP': 93, 'FN': 143, 'TN': 1266, 'FNR': 0.25087719298245614, 'FPR': 0.0684326710816777, 'Specificity': 0.9315673289183223}, 'unharmful': {'TP': 941, 'FP': 100, 'FN': 29, 'TN': 859, 'FNR': 0.029896907216494847, 'FPR': 0.10427528675703858, 'Specificity': 0.8957247132429614}} |
| 0.3672 | 9.0 | 9963 | 0.4610 | 0.8471 | 0.8421 | 0.8471 | 0.8434 | 0.7505 | 0.7980 | {'jailbreaking': {'TP': 255, 'FP': 96, 'FN': 134, 'TN': 1444, 'FNR': 0.3444730077120823, 'FPR': 0.06233766233766234, 'Specificity': 0.9376623376623376}, 'prompt injection': {'TP': 438, 'FP': 99, 'FN': 132, 'TN': 1260, 'FNR': 0.23157894736842105, 'FPR': 0.0728476821192053, 'Specificity': 0.9271523178807947}, 'unharmful': {'TP': 941, 'FP': 100, 'FN': 29, 'TN': 859, 'FNR': 0.029896907216494847, 'FPR': 0.10427528675703858, 'Specificity': 0.8957247132429614}} |
| 0.4548 | 10.0 | 11070 | 0.4627 | 0.8476 | 0.8428 | 0.8476 | 0.8440 | 0.7515 | 0.7994 | {'jailbreaking': {'TP': 259, 'FP': 97, 'FN': 130, 'TN': 1443, 'FNR': 0.3341902313624679, 'FPR': 0.06298701298701298, 'Specificity': 0.937012987012987}, 'prompt injection': {'TP': 434, 'FP': 97, 'FN': 136, 'TN': 1262, 'FNR': 0.23859649122807017, 'FPR': 0.07137601177336277, 'Specificity': 0.9286239882266373}, 'unharmful': {'TP': 942, 'FP': 100, 'FN': 28, 'TN': 859, 'FNR': 0.0288659793814433, 'FPR': 0.10427528675703858, 'Specificity': 0.8957247132429614}} |
### Framework versions
- PEFT 0.17.1
- Transformers 4.53.3
- Pytorch 2.6.0+cu124
- Datasets 4.3.0
- Tokenizers 0.21.4