---
library_name: peft
license: apache-2.0
base_model: TinyLlama/TinyLlama_v1.1
tags:
- base_model:adapter:TinyLlama/TinyLlama_v1.1
- lora
- transformers
metrics:
- accuracy
model-index:
- name: tinyllama-lora-malicious-classifier
  results: []
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# tinyllama-lora-malicious-classifier

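This model is a LoRA adapter (trained with PEFT) for [TinyLlama/TinyLlama_v1.1](https://huggingface.co/TinyLlama/TinyLlama_v1.1), fine-tuned on an unknown dataset to classify inputs into three classes: `jailbreaking`, `prompt injection`, and `unharmful`.
It achieves the following results on the evaluation set (these values match the epoch 13 row of the training results table):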
- Loss: 0.4833
- Accuracy: 0.8289
- Precision Weighted: 0.8220
- Recall Weighted: 0.8289
- F1 Weighted: 0.8239
- MCC: 0.7203
- Balanced Accuracy: 0.7724
- Macro FNR: 0.2276
- Macro FPR: 0.0897
- Macro Specificity: 0.9103
- Per Class:

| Class            | TP  | FP  | FN  | TN   | FNR                 | FPR                 | Specificity        |
|:----------------:|:---:|:---:|:---:|:----:|:-------------------:|:-------------------:|:------------------:|
| jailbreaking     | 228 | 103 | 161 | 1437 | 0.4138817480719794  | 0.06688311688311688 | 0.9331168831168831 |
| prompt injection | 439 | 112 | 131 | 1247 | 0.22982456140350876 | 0.08241353936718175 | 0.9175864606328182 |
| unharmful        | 932 | 115 | 38  | 844  | 0.03917525773195876 | 0.11991657977059438 | 0.8800834202294057 |
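
The macro-averaged figures above can be re-derived from the per-class confusion counts. The following is a minimal sketch, not the evaluation code used for this model: it assumes the usual one-vs-rest definitions (FNR = FN / (TP + FN), FPR = FP / (FP + TN), specificity = 1 - FPR) and an unweighted mean over the three classes. The helper name `rates` is hypothetical.

```python
# Confusion counts copied from the "Per Class" entry above.
per_class = {
    "jailbreaking": {"TP": 228, "FP": 103, "FN": 161, "TN": 1437},
    "prompt injection": {"TP": 439, "FP": 112, "FN": 131, "TN": 1247},
    "unharmful": {"TP": 932, "FP": 115, "FN": 38, "TN": 844},
}


def rates(c):
    """One-vs-rest error rates for a single class (assumed definitions)."""
    fnr = c["FN"] / (c["TP"] + c["FN"])  # share of true members that were missed
    fpr = c["FP"] / (c["FP"] + c["TN"])  # share of non-members wrongly flagged
    return {"FNR": fnr, "FPR": fpr, "Specificity": 1.0 - fpr}


per_class_rates = {name: rates(c) for name, c in per_class.items()}
n_classes = len(per_class_rates)

# Macro averages: unweighted mean over classes.
macro_fnr = sum(r["FNR"] for r in per_class_rates.values()) / n_classes
macro_fpr = sum(r["FPR"] for r in per_class_rates.values()) / n_classes
macro_specificity = 1.0 - macro_fpr

# Balanced accuracy is the mean per-class recall, i.e. 1 - macro FNR.
balanced_accuracy = 1.0 - macro_fnr

# Overall accuracy: correctly classified examples over all examples.
total = sum(c["TP"] + c["FN"] for c in per_class.values())
accuracy = sum(c["TP"] for c in per_class.values()) / total

print(round(macro_fnr, 4), round(macro_fpr, 4), round(balanced_accuracy, 4))
```

Under these assumptions the derived values round to the reported ones, and the counts imply an evaluation set of 1,929 examples (389 `jailbreaking`, 570 `prompt injection`, 970 `unharmful`). The `jailbreaking` class has by far the highest miss rate, which the aggregate accuracy hides.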

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 15
- mixed_precision_training: Native AMP
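
To make the `linear` scheduler and the 0.1 warmup ratio concrete, here is a small plain-Python sketch of the learning-rate curve these settings imply. It is an illustration rather than the Trainer's own code: it assumes the total step count equals the final step in the training results table (16605), that warmup steps are the warmup ratio times the total, rounded up, and that the rate decays linearly to zero after warmup. The function name `lr_at` is hypothetical.

```python
import math

BASE_LR = 2e-5       # learning_rate from the list above
WARMUP_RATIO = 0.1   # lr_scheduler_warmup_ratio from the list above
TOTAL_STEPS = 16605  # assumption: last step in the training results table

# Assumption: warmup length is the ratio times total steps, rounded up.
warmup_steps = math.ceil(TOTAL_STEPS * WARMUP_RATIO)


def lr_at(step):
    """Learning rate at a given optimizer step under linear warmup and decay."""
    if step < warmup_steps:
        # Linear ramp from 0 up to BASE_LR during warmup.
        return BASE_LR * step / max(1, warmup_steps)
    # Linear decay from BASE_LR down to 0 at the final step.
    remaining = max(0, TOTAL_STEPS - step)
    return BASE_LR * remaining / max(1, TOTAL_STEPS - warmup_steps)


for step in (0, warmup_steps // 2, warmup_steps, TOTAL_STEPS // 2, TOTAL_STEPS):
    print(step, lr_at(step))
```

With these assumptions the peak rate of 2e-05 is reached at the end of warmup, roughly halfway through the second epoch, and the rate is already well below the peak by the later epochs where validation loss plateaus.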

### Training results

| Training Loss | Epoch | Step  | Validation Loss | Accuracy | Precision Weighted | Recall Weighted | F1 Weighted | MCC    | Balanced Accuracy | Macro FNR | Macro FPR | Macro Specificity | Per Class |
|:-------------:|:-----:|:-----:|:---------------:|:--------:|:------------------:|:---------------:|:-----------:|:------:|:-----------------:|:---------:|:---------:|:-----------------:|:------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------:|
| 1.0323        | 1.0   | 1107  | 1.0327          | 0.5702   | 0.5595             | 0.5702          | 0.5636      | 0.2924 | 0.5112            | 0.4888    | 0.2364    | 0.7636            | {'jailbreaking': {'TP': 113, 'FP': 209, 'FN': 276, 'TN': 1331, 'FNR': 0.7095115681233933, 'FPR': 0.1357142857142857, 'Specificity': 0.8642857142857143}, 'prompt injection': {'TP': 312, 'FP': 238, 'FN': 258, 'TN': 1121, 'FNR': 0.45263157894736844, 'FPR': 0.1751287711552612, 'Specificity': 0.8248712288447387}, 'unharmful': {'TP': 675, 'FP': 382, 'FN': 295, 'TN': 577, 'FNR': 0.30412371134020616, 'FPR': 0.3983315954118874, 'Specificity': 0.6016684045881127}}     |
| 0.7496        | 2.0   | 2214  | 0.7073          | 0.7252   | 0.7112             | 0.7252          | 0.7144      | 0.5463 | 0.6529            | 0.3471    | 0.1502    | 0.8498            | {'jailbreaking': {'TP': 154, 'FP': 132, 'FN': 235, 'TN': 1408, 'FNR': 0.6041131105398457, 'FPR': 0.08571428571428572, 'Specificity': 0.9142857142857143}, 'prompt injection': {'TP': 386, 'FP': 163, 'FN': 184, 'TN': 1196, 'FNR': 0.32280701754385965, 'FPR': 0.1199411331861663, 'Specificity': 0.8800588668138337}, 'unharmful': {'TP': 859, 'FP': 235, 'FN': 111, 'TN': 724, 'FNR': 0.11443298969072165, 'FPR': 0.24504692387904067, 'Specificity': 0.7549530761209593}}   |
| 0.5695        | 3.0   | 3321  | 0.6096          | 0.7766   | 0.7636             | 0.7766          | 0.7654      | 0.6323 | 0.7031            | 0.2969    | 0.1215    | 0.8785            | {'jailbreaking': {'TP': 171, 'FP': 106, 'FN': 218, 'TN': 1434, 'FNR': 0.5604113110539846, 'FPR': 0.06883116883116883, 'Specificity': 0.9311688311688312}, 'prompt injection': {'TP': 417, 'FP': 141, 'FN': 153, 'TN': 1218, 'FNR': 0.26842105263157895, 'FPR': 0.10375275938189846, 'Specificity': 0.8962472406181016}, 'unharmful': {'TP': 910, 'FP': 184, 'FN': 60, 'TN': 775, 'FNR': 0.061855670103092786, 'FPR': 0.19186652763295098, 'Specificity': 0.808133472367049}}   |
| 0.5059        | 4.0   | 4428  | 0.5686          | 0.7926   | 0.7817             | 0.7926          | 0.7843      | 0.6596 | 0.7245            | 0.2755    | 0.1107    | 0.8893            | {'jailbreaking': {'TP': 191, 'FP': 114, 'FN': 198, 'TN': 1426, 'FNR': 0.5089974293059126, 'FPR': 0.07402597402597402, 'Specificity': 0.925974025974026}, 'prompt injection': {'TP': 419, 'FP': 131, 'FN': 151, 'TN': 1228, 'FNR': 0.2649122807017544, 'FPR': 0.0963944076526858, 'Specificity': 0.9036055923473142}, 'unharmful': {'TP': 919, 'FP': 155, 'FN': 51, 'TN': 804, 'FNR': 0.05257731958762887, 'FPR': 0.1616266944734098, 'Specificity': 0.8383733055265902}}       |
| 0.4583        | 5.0   | 5535  | 0.5413          | 0.8056   | 0.7953             | 0.8056          | 0.7977      | 0.6812 | 0.7389            | 0.2611    | 0.1032    | 0.8968            | {'jailbreaking': {'TP': 199, 'FP': 109, 'FN': 190, 'TN': 1431, 'FNR': 0.4884318766066838, 'FPR': 0.07077922077922078, 'Specificity': 0.9292207792207792}, 'prompt injection': {'TP': 426, 'FP': 126, 'FN': 144, 'TN': 1233, 'FNR': 0.25263157894736843, 'FPR': 0.09271523178807947, 'Specificity': 0.9072847682119205}, 'unharmful': {'TP': 929, 'FP': 140, 'FN': 41, 'TN': 819, 'FNR': 0.042268041237113405, 'FPR': 0.145985401459854, 'Specificity': 0.8540145985401459}}    |
| 0.4763        | 6.0   | 6642  | 0.5243          | 0.8113   | 0.8028             | 0.8113          | 0.8052      | 0.6910 | 0.7498            | 0.2502    | 0.0994    | 0.9006            | {'jailbreaking': {'TP': 212, 'FP': 113, 'FN': 177, 'TN': 1427, 'FNR': 0.455012853470437, 'FPR': 0.07337662337662337, 'Specificity': 0.9266233766233766}, 'prompt injection': {'TP': 428, 'FP': 120, 'FN': 142, 'TN': 1239, 'FNR': 0.24912280701754386, 'FPR': 0.08830022075055188, 'Specificity': 0.9116997792494481}, 'unharmful': {'TP': 925, 'FP': 131, 'FN': 45, 'TN': 828, 'FNR': 0.04639175257731959, 'FPR': 0.13660062565172054, 'Specificity': 0.8633993743482794}}    |
| 0.4283        | 7.0   | 7749  | 0.5095          | 0.8170   | 0.8083             | 0.8170          | 0.8104      | 0.7003 | 0.7546            | 0.2454    | 0.0969    | 0.9031            | {'jailbreaking': {'TP': 213, 'FP': 105, 'FN': 176, 'TN': 1435, 'FNR': 0.4524421593830334, 'FPR': 0.06818181818181818, 'Specificity': 0.9318181818181818}, 'prompt injection': {'TP': 430, 'FP': 118, 'FN': 140, 'TN': 1241, 'FNR': 0.24561403508771928, 'FPR': 0.08682855040470934, 'Specificity': 0.9131714495952906}, 'unharmful': {'TP': 933, 'FP': 130, 'FN': 37, 'TN': 829, 'FNR': 0.03814432989690722, 'FPR': 0.13555787278415016, 'Specificity': 0.8644421272158499}}   |
| 0.4119        | 8.0   | 8856  | 0.5033          | 0.8191   | 0.8116             | 0.8191          | 0.8135      | 0.7041 | 0.7592            | 0.2408    | 0.0951    | 0.9049            | {'jailbreaking': {'TP': 223, 'FP': 118, 'FN': 166, 'TN': 1422, 'FNR': 0.4267352185089974, 'FPR': 0.07662337662337662, 'Specificity': 0.9233766233766234}, 'prompt injection': {'TP': 422, 'FP': 105, 'FN': 148, 'TN': 1254, 'FNR': 0.2596491228070175, 'FPR': 0.0772626931567329, 'Specificity': 0.9227373068432672}, 'unharmful': {'TP': 935, 'FP': 126, 'FN': 35, 'TN': 833, 'FNR': 0.03608247422680412, 'FPR': 0.13138686131386862, 'Specificity': 0.8686131386861314}}     |
| 0.412         | 9.0   | 9963  | 0.4955          | 0.8253   | 0.8178             | 0.8253          | 0.8199      | 0.7143 | 0.7671            | 0.2329    | 0.0916    | 0.9084            | {'jailbreaking': {'TP': 222, 'FP': 103, 'FN': 167, 'TN': 1437, 'FNR': 0.42930591259640105, 'FPR': 0.06688311688311688, 'Specificity': 0.9331168831168831}, 'prompt injection': {'TP': 440, 'FP': 118, 'FN': 130, 'TN': 1241, 'FNR': 0.22807017543859648, 'FPR': 0.08682855040470934, 'Specificity': 0.9131714495952906}, 'unharmful': {'TP': 930, 'FP': 116, 'FN': 40, 'TN': 843, 'FNR': 0.041237113402061855, 'FPR': 0.12095933263816476, 'Specificity': 0.8790406673618353}} |
| 0.496         | 10.0  | 11070 | 0.4926          | 0.8289   | 0.8214             | 0.8289          | 0.8232      | 0.7202 | 0.7703            | 0.2297    | 0.0903    | 0.9097            | {'jailbreaking': {'TP': 224, 'FP': 98, 'FN': 165, 'TN': 1442, 'FNR': 0.4241645244215938, 'FPR': 0.06363636363636363, 'Specificity': 0.9363636363636364}, 'prompt injection': {'TP': 439, 'FP': 113, 'FN': 131, 'TN': 1246, 'FNR': 0.22982456140350876, 'FPR': 0.08314937454010302, 'Specificity': 0.9168506254598969}, 'unharmful': {'TP': 936, 'FP': 119, 'FN': 34, 'TN': 840, 'FNR': 0.03505154639175258, 'FPR': 0.12408759124087591, 'Specificity': 0.8759124087591241}}    |
| 0.428         | 11.0  | 12177 | 0.4890          | 0.8258   | 0.8191             | 0.8258          | 0.8207      | 0.7154 | 0.7677            | 0.2323    | 0.0913    | 0.9087            | {'jailbreaking': {'TP': 230, 'FP': 117, 'FN': 159, 'TN': 1423, 'FNR': 0.4087403598971722, 'FPR': 0.07597402597402597, 'Specificity': 0.924025974025974}, 'prompt injection': {'TP': 424, 'FP': 99, 'FN': 146, 'TN': 1260, 'FNR': 0.256140350877193, 'FPR': 0.0728476821192053, 'Specificity': 0.9271523178807947}, 'unharmful': {'TP': 939, 'FP': 120, 'FN': 31, 'TN': 839, 'FNR': 0.031958762886597936, 'FPR': 0.1251303441084463, 'Specificity': 0.8748696558915537}}        |
| 0.4103        | 12.0  | 13284 | 0.4866          | 0.8269   | 0.8191             | 0.8269          | 0.8206      | 0.7166 | 0.7669            | 0.2331    | 0.0922    | 0.9078            | {'jailbreaking': {'TP': 221, 'FP': 99, 'FN': 168, 'TN': 1441, 'FNR': 0.4318766066838046, 'FPR': 0.06428571428571428, 'Specificity': 0.9357142857142857}, 'prompt injection': {'TP': 437, 'FP': 107, 'FN': 133, 'TN': 1252, 'FNR': 0.23333333333333334, 'FPR': 0.07873436350257543, 'Specificity': 0.9212656364974245}, 'unharmful': {'TP': 937, 'FP': 128, 'FN': 33, 'TN': 831, 'FNR': 0.03402061855670103, 'FPR': 0.1334723670490094, 'Specificity': 0.8665276329509907}}     |
| 0.4009        | 13.0  | 14391 | 0.4833          | 0.8289   | 0.8220             | 0.8289          | 0.8239      | 0.7203 | 0.7724            | 0.2276    | 0.0897    | 0.9103            | {'jailbreaking': {'TP': 228, 'FP': 103, 'FN': 161, 'TN': 1437, 'FNR': 0.4138817480719794, 'FPR': 0.06688311688311688, 'Specificity': 0.9331168831168831}, 'prompt injection': {'TP': 439, 'FP': 112, 'FN': 131, 'TN': 1247, 'FNR': 0.22982456140350876, 'FPR': 0.08241353936718175, 'Specificity': 0.9175864606328182}, 'unharmful': {'TP': 932, 'FP': 115, 'FN': 38, 'TN': 844, 'FNR': 0.03917525773195876, 'FPR': 0.11991657977059438, 'Specificity': 0.8800834202294057}}   |
| 0.4242        | 14.0  | 15498 | 0.4834          | 0.8284   | 0.8211             | 0.8284          | 0.8228      | 0.7193 | 0.7700            | 0.2300    | 0.0906    | 0.9094            | {'jailbreaking': {'TP': 226, 'FP': 105, 'FN': 163, 'TN': 1435, 'FNR': 0.4190231362467866, 'FPR': 0.06818181818181818, 'Specificity': 0.9318181818181818}, 'prompt injection': {'TP': 435, 'FP': 104, 'FN': 135, 'TN': 1255, 'FNR': 0.23684210526315788, 'FPR': 0.07652685798381163, 'Specificity': 0.9234731420161884}, 'unharmful': {'TP': 937, 'FP': 122, 'FN': 33, 'TN': 837, 'FNR': 0.03402061855670103, 'FPR': 0.12721584984358708, 'Specificity': 0.872784150156413}}    |
| 0.3859        | 15.0  | 16605 | 0.4829          | 0.8269   | 0.8197             | 0.8269          | 0.8215      | 0.7168 | 0.7692            | 0.2308    | 0.0912    | 0.9088            | {'jailbreaking': {'TP': 226, 'FP': 106, 'FN': 163, 'TN': 1434, 'FNR': 0.4190231362467866, 'FPR': 0.06883116883116883, 'Specificity': 0.9311688311688312}, 'prompt injection': {'TP': 436, 'FP': 107, 'FN': 134, 'TN': 1252, 'FNR': 0.23508771929824562, 'FPR': 0.07873436350257543, 'Specificity': 0.9212656364974245}, 'unharmful': {'TP': 933, 'FP': 121, 'FN': 37, 'TN': 838, 'FNR': 0.03814432989690722, 'FPR': 0.1261730969760167, 'Specificity': 0.8738269030239834}}    |
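
The per-epoch rows can also be compared programmatically. The sketch below copies validation loss, weighted F1, and MCC from the table and finds the best epoch under each criterion. The card does not state which criterion was used to pick the reported checkpoint, so this only shows what each choice would select; the variable names are illustrative.

```python
# (epoch, validation_loss, f1_weighted, mcc), copied from the table above.
rows = [
    (1, 1.0327, 0.5636, 0.2924),
    (2, 0.7073, 0.7144, 0.5463),
    (3, 0.6096, 0.7654, 0.6323),
    (4, 0.5686, 0.7843, 0.6596),
    (5, 0.5413, 0.7977, 0.6812),
    (6, 0.5243, 0.8052, 0.6910),
    (7, 0.5095, 0.8104, 0.7003),
    (8, 0.5033, 0.8135, 0.7041),
    (9, 0.4955, 0.8199, 0.7143),
    (10, 0.4926, 0.8232, 0.7202),
    (11, 0.4890, 0.8207, 0.7154),
    (12, 0.4866, 0.8206, 0.7166),
    (13, 0.4833, 0.8239, 0.7203),
    (14, 0.4834, 0.8228, 0.7193),
    (15, 0.4829, 0.8215, 0.7168),
]

# Lower is better for loss; higher is better for F1 and MCC.
best_by_loss = min(rows, key=lambda r: r[1])[0]
best_by_f1 = max(rows, key=lambda r: r[2])[0]
best_by_mcc = max(rows, key=lambda r: r[3])[0]

print("lowest validation loss: epoch", best_by_loss)
print("highest weighted F1:    epoch", best_by_f1)
print("highest MCC:            epoch", best_by_mcc)
```

Validation loss is lowest at epoch 15, while weighted F1 and MCC both peak at epoch 13. The differences after epoch 10 are small (within about 0.003 on each metric), so the model has effectively converged by then.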


### Framework versions

- PEFT 0.17.1
- Transformers 4.53.3
- Pytorch 2.6.0+cu124
- Datasets 4.3.0
- Tokenizers 0.21.4