Commit 9962a2a (verified), committed by harun27
1 Parent(s): 50a9011

End of training

Files changed (2)
  1. README.md +13 -60
  2. model.safetensors +1 -1
README.md CHANGED
@@ -16,8 +16,8 @@ should probably proofread and complete it, then remove this comment. -->
16
 
17
  This model is a fine-tuned version of [answerdotai/ModernBERT-large](https://huggingface.co/answerdotai/ModernBERT-large) on an unknown dataset.
18
  It achieves the following results on the evaluation set:
19
- - Loss: 1.0671
20
- - Classification Report: {'0': {'precision': 0.9478584729981379, 'recall': 0.9591708542713567, 'f1-score': 0.9534811114580081, 'support': 1592.0}, '1': {'precision': 0.7302904564315352, 'recall': 0.676923076923077, 'f1-score': 0.7025948103792415, 'support': 260.0}, 'accuracy': 0.9195464362850972, 'macro avg': {'precision': 0.8390744647148365, 'recall': 0.8180469655972169, 'f1-score': 0.8280379609186248, 'support': 1852.0}, 'weighted avg': {'precision': 0.9173143670006667, 'recall': 0.9195464362850972, 'f1-score': 0.9182594925160646, 'support': 1852.0}}
21
 
22
  ## Model description
23
 
@@ -37,71 +37,24 @@ More information needed
37
 
38
  The following hyperparameters were used during training:
39
  - learning_rate: 5e-06
40
- - train_batch_size: 256
41
- - eval_batch_size: 256
42
  - seed: 42
43
  - distributed_type: multi-GPU
44
- - num_devices: 3
45
- - total_train_batch_size: 768
46
- - total_eval_batch_size: 768
47
  - optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
48
  - lr_scheduler_type: linear
49
- - num_epochs: 50
50
 
51
  ### Training results
52
 
53
- | Training Loss | Epoch | Step | Validation Loss | Classification Report |
54
- |:-------------:|:-----:|:----:|:---------------:|:--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------:|
55
- | No log | 1.0 | 9 | 0.4658 | {'0': {'precision': 0.9748237663645518, 'recall': 0.6080402010050251, 'f1-score': 0.7489361702127659, 'support': 1592.0}, '1': {'precision': 0.27357392316647267, 'recall': 0.9038461538461539, 'f1-score': 0.42001787310098304, 'support': 260.0}, 'accuracy': 0.6495680345572354, 'macro avg': {'precision': 0.6241988447655122, 'recall': 0.7559431774255895, 'f1-score': 0.5844770216568744, 'support': 1852.0}, 'weighted avg': {'precision': 0.8763761641877157, 'recall': 0.6495680345572354, 'f1-score': 0.7027597354130556, 'support': 1852.0}} |
56
- | No log | 2.0 | 18 | 0.4287 | {'0': {'precision': 0.9786324786324786, 'recall': 0.7192211055276382, 'f1-score': 0.8291093410572049, 'support': 1592.0}, '1': {'precision': 0.34457478005865105, 'recall': 0.9038461538461539, 'f1-score': 0.4989384288747346, 'support': 260.0}, 'accuracy': 0.7451403887688985, 'macro avg': {'precision': 0.6616036293455648, 'recall': 0.811533629686896, 'f1-score': 0.6640238849659698, 'support': 1852.0}, 'weighted avg': {'precision': 0.8896178989190903, 'recall': 0.7451403887688985, 'f1-score': 0.782757053169817, 'support': 1852.0}} |
57
- | No log | 3.0 | 27 | 0.3809 | {'0': {'precision': 0.9813311688311688, 'recall': 0.7594221105527639, 'f1-score': 0.8562322946175638, 'support': 1592.0}, '1': {'precision': 0.38225806451612904, 'recall': 0.9115384615384615, 'f1-score': 0.5386363636363637, 'support': 260.0}, 'accuracy': 0.7807775377969762, 'macro avg': {'precision': 0.6817946166736489, 'recall': 0.8354802860456128, 'f1-score': 0.6974343291269638, 'support': 1852.0}, 'weighted avg': {'precision': 0.8972280332361848, 'recall': 0.7807775377969762, 'f1-score': 0.8116453928599439, 'support': 1852.0}} |
58
- | No log | 4.0 | 36 | 0.3512 | {'0': {'precision': 0.9744168547780286, 'recall': 0.8134422110552764, 'f1-score': 0.8866826429305033, 'support': 1592.0}, '1': {'precision': 0.4321223709369025, 'recall': 0.8692307692307693, 'f1-score': 0.5772669220945083, 'support': 260.0}, 'accuracy': 0.8212742980561555, 'macro avg': {'precision': 0.7032696128574656, 'recall': 0.8413364901430228, 'f1-score': 0.7319747825125058, 'support': 1852.0}, 'weighted avg': {'precision': 0.8982847998111317, 'recall': 0.8212742980561555, 'f1-score': 0.8432441508044997, 'support': 1852.0}} |
59
- | No log | 5.0 | 45 | 0.3344 | {'0': {'precision': 0.9857482185273159, 'recall': 0.782035175879397, 'f1-score': 0.87215411558669, 'support': 1592.0}, '1': {'precision': 0.41086587436332767, 'recall': 0.9307692307692308, 'f1-score': 0.5700824499411072, 'support': 260.0}, 'accuracy': 0.8029157667386609, 'macro avg': {'precision': 0.6983070464453218, 'recall': 0.8564022033243139, 'f1-score': 0.7211182827638987, 'support': 1852.0}, 'weighted avg': {'precision': 0.9050411939686567, 'recall': 0.8029157667386609, 'f1-score': 0.8297466463275909, 'support': 1852.0}} |
60
- | No log | 6.0 | 54 | 0.3259 | {'0': {'precision': 0.9739069111424542, 'recall': 0.867462311557789, 'f1-score': 0.9176079734219269, 'support': 1592.0}, '1': {'precision': 0.5138248847926268, 'recall': 0.8576923076923076, 'f1-score': 0.6426512968299711, 'support': 260.0}, 'accuracy': 0.8660907127429806, 'macro avg': {'precision': 0.7438658979675405, 'recall': 0.8625773096250483, 'f1-score': 0.780129635125949, 'support': 1852.0}, 'weighted avg': {'precision': 0.9093165618708802, 'recall': 0.8660907127429806, 'f1-score': 0.8790071440947624, 'support': 1852.0}} |
61
- | No log | 7.0 | 63 | 0.3234 | {'0': {'precision': 0.9715672676837726, 'recall': 0.8800251256281407, 'f1-score': 0.9235332893869479, 'support': 1592.0}, '1': {'precision': 0.5341463414634147, 'recall': 0.8423076923076923, 'f1-score': 0.6537313432835821, 'support': 260.0}, 'accuracy': 0.8747300215982722, 'macro avg': {'precision': 0.7528568045735936, 'recall': 0.8611664089679165, 'f1-score': 0.788632316335265, 'support': 1852.0}, 'weighted avg': {'precision': 0.91015828236126, 'recall': 0.8747300215982722, 'f1-score': 0.8856561263270801, 'support': 1852.0}} |
62
- | No log | 8.0 | 72 | 0.3283 | {'0': {'precision': 0.9882445141065831, 'recall': 0.7920854271356784, 'f1-score': 0.8793584379358438, 'support': 1592.0}, '1': {'precision': 0.4253472222222222, 'recall': 0.9423076923076923, 'f1-score': 0.5861244019138756, 'support': 260.0}, 'accuracy': 0.8131749460043196, 'macro avg': {'precision': 0.7067958681644027, 'recall': 0.8671965597216853, 'f1-score': 0.7327414199248596, 'support': 1852.0}, 'weighted avg': {'precision': 0.9092200562826448, 'recall': 0.8131749460043196, 'f1-score': 0.8381916726195847, 'support': 1852.0}} |
63
- | No log | 9.0 | 81 | 0.4118 | {'0': {'precision': 0.9485570890840652, 'recall': 0.949748743718593, 'f1-score': 0.9491525423728814, 'support': 1592.0}, '1': {'precision': 0.689922480620155, 'recall': 0.6846153846153846, 'f1-score': 0.6872586872586872, 'support': 260.0}, 'accuracy': 0.9125269978401728, 'macro avg': {'precision': 0.8192397848521101, 'recall': 0.8171820641669888, 'f1-score': 0.8182056148157844, 'support': 1852.0}, 'weighted avg': {'precision': 0.9122476948072744, 'recall': 0.9125269978401728, 'f1-score': 0.9123855864713207, 'support': 1852.0}} |
64
- | No log | 10.0 | 90 | 0.3219 | {'0': {'precision': 0.9709141274238227, 'recall': 0.8806532663316583, 'f1-score': 0.9235836627140975, 'support': 1592.0}, '1': {'precision': 0.5343137254901961, 'recall': 0.8384615384615385, 'f1-score': 0.6526946107784432, 'support': 260.0}, 'accuracy': 0.8747300215982722, 'macro avg': {'precision': 0.7526139264570093, 'recall': 0.8595574023965984, 'f1-score': 0.7881391367462703, 'support': 1852.0}, 'weighted avg': {'precision': 0.9096203344957757, 'recall': 0.8747300215982722, 'f1-score': 0.8855538822047724, 'support': 1852.0}} |
65
- | No log | 11.0 | 99 | 0.3105 | {'0': {'precision': 0.9749826268241835, 'recall': 0.8812814070351759, 'f1-score': 0.9257670735730782, 'support': 1592.0}, '1': {'precision': 0.5423728813559322, 'recall': 0.8615384615384616, 'f1-score': 0.6656760772659732, 'support': 260.0}, 'accuracy': 0.8785097192224622, 'macro avg': {'precision': 0.7586777540900578, 'recall': 0.8714099342868187, 'f1-score': 0.7957215754195257, 'support': 1852.0}, 'weighted avg': {'precision': 0.9142490772444073, 'recall': 0.8785097192224622, 'f1-score': 0.8892532187999425, 'support': 1852.0}} |
66
- | No log | 12.0 | 108 | 0.3133 | {'0': {'precision': 0.9754901960784313, 'recall': 0.875, 'f1-score': 0.9225165562913907, 'support': 1592.0}, '1': {'precision': 0.5306603773584906, 'recall': 0.8653846153846154, 'f1-score': 0.6578947368421053, 'support': 260.0}, 'accuracy': 0.8736501079913607, 'macro avg': {'precision': 0.7530752867184609, 'recall': 0.8701923076923077, 'f1-score': 0.790205646566748, 'support': 1852.0}, 'weighted avg': {'precision': 0.9130410854590013, 'recall': 0.8736501079913607, 'f1-score': 0.8853666248352274, 'support': 1852.0}} |
67
- | No log | 13.0 | 117 | 0.3423 | {'0': {'precision': 0.9603896103896103, 'recall': 0.9290201005025126, 'f1-score': 0.9444444444444444, 'support': 1592.0}, '1': {'precision': 0.6378205128205128, 'recall': 0.7653846153846153, 'f1-score': 0.6958041958041958, 'support': 260.0}, 'accuracy': 0.906047516198704, 'macro avg': {'precision': 0.7991050616050616, 'recall': 0.8472023579435639, 'f1-score': 0.8201243201243201, 'support': 1852.0}, 'weighted avg': {'precision': 0.9151045318971884, 'recall': 0.906047516198704, 'f1-score': 0.9095381460392259, 'support': 1852.0}} |
68
- | No log | 14.0 | 126 | 0.5012 | {'0': {'precision': 0.9386650631389056, 'recall': 0.9805276381909548, 'f1-score': 0.9591397849462365, 'support': 1592.0}, '1': {'precision': 0.8359788359788359, 'recall': 0.6076923076923076, 'f1-score': 0.7037861915367484, 'support': 260.0}, 'accuracy': 0.9281857451403888, 'macro avg': {'precision': 0.8873219495588708, 'recall': 0.7941099729416312, 'f1-score': 0.8314629882414925, 'support': 1852.0}, 'weighted avg': {'precision': 0.924249070125073, 'recall': 0.9281857451403888, 'f1-score': 0.9232910083336735, 'support': 1852.0}} |
69
- | No log | 15.0 | 135 | 0.3743 | {'0': {'precision': 0.9571611253196931, 'recall': 0.9403266331658291, 'f1-score': 0.9486692015209125, 'support': 1592.0}, '1': {'precision': 0.6701388888888888, 'recall': 0.7423076923076923, 'f1-score': 0.7043795620437956, 'support': 260.0}, 'accuracy': 0.9125269978401728, 'macro avg': {'precision': 0.8136500071042909, 'recall': 0.8413171627367607, 'f1-score': 0.826524381782354, 'support': 1852.0}, 'weighted avg': {'precision': 0.9168664269006817, 'recall': 0.9125269978401728, 'f1-score': 0.9143736797800646, 'support': 1852.0}} |
70
- | No log | 16.0 | 144 | 0.3255 | {'0': {'precision': 0.9732510288065843, 'recall': 0.8913316582914573, 'f1-score': 0.9304918032786885, 'support': 1592.0}, '1': {'precision': 0.5609137055837563, 'recall': 0.85, 'f1-score': 0.6758409785932722, 'support': 260.0}, 'accuracy': 0.8855291576673866, 'macro avg': {'precision': 0.7670823671951703, 'recall': 0.8706658291457287, 'f1-score': 0.8031663909359803, 'support': 1852.0}, 'weighted avg': {'precision': 0.9153634996284336, 'recall': 0.8855291576673866, 'f1-score': 0.8947416875021181, 'support': 1852.0}} |
71
- | No log | 17.0 | 153 | 0.3262 | {'0': {'precision': 0.9744499645138396, 'recall': 0.8624371859296482, 'f1-score': 0.915028323892036, 'support': 1592.0}, '1': {'precision': 0.5056433408577878, 'recall': 0.8615384615384616, 'f1-score': 0.6372688477951636, 'support': 260.0}, 'accuracy': 0.8623110151187905, 'macro avg': {'precision': 0.7400466526858137, 'recall': 0.8619878237340549, 'f1-score': 0.7761485858435998, 'support': 1852.0}, 'weighted avg': {'precision': 0.9086347797673097, 'recall': 0.8623110151187905, 'f1-score': 0.8760340129929071, 'support': 1852.0}} |
72
- | No log | 18.0 | 162 | 0.4057 | {'0': {'precision': 0.9568576947842885, 'recall': 0.9334170854271356, 'f1-score': 0.9449920508744039, 'support': 1592.0}, '1': {'precision': 0.6454849498327759, 'recall': 0.7423076923076923, 'f1-score': 0.6905187835420393, 'support': 260.0}, 'accuracy': 0.9065874730021598, 'macro avg': {'precision': 0.8011713223085322, 'recall': 0.8378623888674139, 'f1-score': 0.8177554172082215, 'support': 1852.0}, 'weighted avg': {'precision': 0.9131444584520026, 'recall': 0.9065874730021598, 'f1-score': 0.9092668621560372, 'support': 1852.0}} |
73
- | No log | 19.0 | 171 | 0.3325 | {'0': {'precision': 0.9760056457304164, 'recall': 0.8687185929648241, 'f1-score': 0.9192422731804586, 'support': 1592.0}, '1': {'precision': 0.5195402298850574, 'recall': 0.8692307692307693, 'f1-score': 0.6503597122302158, 'support': 260.0}, 'accuracy': 0.8687904967602592, 'macro avg': {'precision': 0.7477729378077369, 'recall': 0.8689746810977967, 'f1-score': 0.7848009927053372, 'support': 1852.0}, 'weighted avg': {'precision': 0.9119230279551499, 'recall': 0.8687904967602592, 'f1-score': 0.8814941814703814, 'support': 1852.0}} |
74
- | No log | 20.0 | 180 | 0.3462 | {'0': {'precision': 0.983957219251337, 'recall': 0.8090452261306532, 'f1-score': 0.8879696656325405, 'support': 1592.0}, '1': {'precision': 0.44014732965009207, 'recall': 0.9192307692307692, 'f1-score': 0.5952677459526775, 'support': 260.0}, 'accuracy': 0.8245140388768899, 'macro avg': {'precision': 0.7120522744507145, 'recall': 0.8641379976807112, 'f1-score': 0.741618705792609, 'support': 1852.0}, 'weighted avg': {'precision': 0.9076124183353954, 'recall': 0.8245140388768899, 'f1-score': 0.8468776034744605, 'support': 1852.0}} |
75
- | No log | 21.0 | 189 | 0.3389 | {'0': {'precision': 0.9761904761904762, 'recall': 0.8756281407035176, 'f1-score': 0.9231788079470199, 'support': 1592.0}, '1': {'precision': 0.5330188679245284, 'recall': 0.8692307692307693, 'f1-score': 0.6608187134502924, 'support': 260.0}, 'accuracy': 0.8747300215982722, 'macro avg': {'precision': 0.7546046720575023, 'recall': 0.8724294549671434, 'f1-score': 0.7919987606986562, 'support': 1852.0}, 'weighted avg': {'precision': 0.9139741596952567, 'recall': 0.8747300215982722, 'f1-score': 0.8863463972725334, 'support': 1852.0}} |
76
- | No log | 22.0 | 198 | 0.5140 | {'0': {'precision': 0.9451294697903823, 'recall': 0.9629396984924623, 'f1-score': 0.9539514623522091, 'support': 1592.0}, '1': {'precision': 0.7434782608695653, 'recall': 0.6576923076923077, 'f1-score': 0.6979591836734694, 'support': 260.0}, 'accuracy': 0.9200863930885529, 'macro avg': {'precision': 0.8443038653299737, 'recall': 0.810316003092385, 'f1-score': 0.8259553230128392, 'support': 1852.0}, 'weighted avg': {'precision': 0.9168199048230969, 'recall': 0.9200863930885529, 'f1-score': 0.918013021500982, 'support': 1852.0}} |
77
- | No log | 23.0 | 207 | 0.4316 | {'0': {'precision': 0.9589480436177037, 'recall': 0.9390703517587939, 'f1-score': 0.948905109489051, 'support': 1592.0}, '1': {'precision': 0.6689419795221843, 'recall': 0.7538461538461538, 'f1-score': 0.7088607594936709, 'support': 260.0}, 'accuracy': 0.9130669546436285, 'macro avg': {'precision': 0.813945011569944, 'recall': 0.8464582528024739, 'f1-score': 0.828882934491361, 'support': 1852.0}, 'weighted avg': {'precision': 0.9182344493062377, 'recall': 0.9130669546436285, 'f1-score': 0.9152055787121617, 'support': 1852.0}} |
78
- | No log | 24.0 | 216 | 0.4008 | {'0': {'precision': 0.9622395833333334, 'recall': 0.928391959798995, 'f1-score': 0.9450127877237852, 'support': 1592.0}, '1': {'precision': 0.6392405063291139, 'recall': 0.7769230769230769, 'f1-score': 0.7013888888888888, 'support': 260.0}, 'accuracy': 0.9071274298056156, 'macro avg': {'precision': 0.8007400448312236, 'recall': 0.852657518361036, 'f1-score': 0.823200838306337, 'support': 1852.0}, 'weighted avg': {'precision': 0.9168941405573631, 'recall': 0.9071274298056156, 'f1-score': 0.9108107284921042, 'support': 1852.0}} |
79
- | No log | 25.0 | 225 | 0.5363 | {'0': {'precision': 0.9544592030360531, 'recall': 0.9478643216080402, 'f1-score': 0.9511503309171131, 'support': 1592.0}, '1': {'precision': 0.6937269372693727, 'recall': 0.7230769230769231, 'f1-score': 0.7080979284369114, 'support': 260.0}, 'accuracy': 0.9163066954643628, 'macro avg': {'precision': 0.8240930701527129, 'recall': 0.8354706223424817, 'f1-score': 0.8296241296770123, 'support': 1852.0}, 'weighted avg': {'precision': 0.9178553212329554, 'recall': 0.9163066954643628, 'f1-score': 0.9170285033550978, 'support': 1852.0}} |
80
- | No log | 26.0 | 234 | 0.4987 | {'0': {'precision': 0.9601029601029601, 'recall': 0.9371859296482412, 'f1-score': 0.9485060394151303, 'support': 1592.0}, '1': {'precision': 0.6644295302013423, 'recall': 0.7615384615384615, 'f1-score': 0.7096774193548387, 'support': 260.0}, 'accuracy': 0.9125269978401728, 'macro avg': {'precision': 0.8122662451521512, 'recall': 0.8493621955933514, 'f1-score': 0.8290917293849845, 'support': 1852.0}, 'weighted avg': {'precision': 0.9185937312830786, 'recall': 0.9125269978401728, 'f1-score': 0.9149771834671412, 'support': 1852.0}} |
81
- | No log | 27.0 | 243 | 0.5779 | {'0': {'precision': 0.9502831969792322, 'recall': 0.9484924623115578, 'f1-score': 0.949386985224772, 'support': 1592.0}, '1': {'precision': 0.688212927756654, 'recall': 0.6961538461538461, 'f1-score': 0.6921606118546845, 'support': 260.0}, 'accuracy': 0.9130669546436285, 'macro avg': {'precision': 0.8192480623679431, 'recall': 0.822323154232702, 'f1-score': 0.8207737985397283, 'support': 1852.0}, 'weighted avg': {'precision': 0.9134914745181792, 'recall': 0.9130669546436285, 'f1-score': 0.9132752913391226, 'support': 1852.0}} |
82
- | No log | 28.0 | 252 | 0.7377 | {'0': {'precision': 0.9432799013563502, 'recall': 0.9610552763819096, 'f1-score': 0.9520846297448662, 'support': 1592.0}, '1': {'precision': 0.7304347826086957, 'recall': 0.6461538461538462, 'f1-score': 0.6857142857142857, 'support': 260.0}, 'accuracy': 0.9168466522678186, 'macro avg': {'precision': 0.8368573419825229, 'recall': 0.8036045612678779, 'f1-score': 0.818899457729576, 'support': 1852.0}, 'weighted avg': {'precision': 0.9133988371693144, 'recall': 0.9168466522678186, 'f1-score': 0.9146892250753462, 'support': 1852.0}} |
83
- | No log | 29.0 | 261 | 0.5538 | {'0': {'precision': 0.9559386973180076, 'recall': 0.9403266331658291, 'f1-score': 0.948068397720076, 'support': 1592.0}, '1': {'precision': 0.6678321678321678, 'recall': 0.7346153846153847, 'f1-score': 0.6996336996336996, 'support': 260.0}, 'accuracy': 0.9114470842332614, 'macro avg': {'precision': 0.8118854325750877, 'recall': 0.8374710088906069, 'f1-score': 0.8238510486768877, 'support': 1852.0}, 'weighted avg': {'precision': 0.9154917763318746, 'recall': 0.9114470842332614, 'f1-score': 0.9131909563040621, 'support': 1852.0}} |
84
- | No log | 30.0 | 270 | 0.7011 | {'0': {'precision': 0.9472377405338299, 'recall': 0.9585427135678392, 'f1-score': 0.9528566968467063, 'support': 1592.0}, '1': {'precision': 0.7261410788381742, 'recall': 0.6730769230769231, 'f1-score': 0.6986027944111777, 'support': 260.0}, 'accuracy': 0.9184665226781857, 'macro avg': {'precision': 0.836689409686002, 'recall': 0.8158098183223812, 'f1-score': 0.825729745628942, 'support': 1852.0}, 'weighted avg': {'precision': 0.9161982523908113, 'recall': 0.9184665226781857, 'f1-score': 0.917162304496146, 'support': 1852.0}} |
85
- | No log | 31.0 | 279 | 0.5648 | {'0': {'precision': 0.9532351057014734, 'recall': 0.9346733668341709, 'f1-score': 0.9438629876308278, 'support': 1592.0}, '1': {'precision': 0.6426116838487973, 'recall': 0.7192307692307692, 'f1-score': 0.6787658802177858, 'support': 260.0}, 'accuracy': 0.9044276457883369, 'macro avg': {'precision': 0.7979233947751354, 'recall': 0.82695206803247, 'f1-score': 0.8113144339243068, 'support': 1852.0}, 'weighted avg': {'precision': 0.9096270659165405, 'recall': 0.9044276457883369, 'f1-score': 0.9066463310825605, 'support': 1852.0}} |
86
- | No log | 32.0 | 288 | 0.7122 | {'0': {'precision': 0.9472049689440993, 'recall': 0.9579145728643216, 'f1-score': 0.9525296689569019, 'support': 1592.0}, '1': {'precision': 0.7231404958677686, 'recall': 0.6730769230769231, 'f1-score': 0.6972111553784861, 'support': 260.0}, 'accuracy': 0.91792656587473, 'macro avg': {'precision': 0.8351727324059339, 'recall': 0.8154957479706224, 'f1-score': 0.824870412167694, 'support': 1852.0}, 'weighted avg': {'precision': 0.9157488334150249, 'recall': 0.91792656587473, 'f1-score': 0.9166858171586362, 'support': 1852.0}} |
87
- | No log | 33.0 | 297 | 0.6401 | {'0': {'precision': 0.9548919949174078, 'recall': 0.9440954773869347, 'f1-score': 0.9494630448515476, 'support': 1592.0}, '1': {'precision': 0.6798561151079137, 'recall': 0.7269230769230769, 'f1-score': 0.7026022304832714, 'support': 260.0}, 'accuracy': 0.9136069114470843, 'macro avg': {'precision': 0.8173740550126607, 'recall': 0.8355092771550058, 'f1-score': 0.8260326376674095, 'support': 1852.0}, 'weighted avg': {'precision': 0.9162800463480404, 'recall': 0.9136069114470843, 'f1-score': 0.9148065590331071, 'support': 1852.0}} |
88
- | No log | 34.0 | 306 | 0.6557 | {'0': {'precision': 0.9558823529411765, 'recall': 0.9390703517587939, 'f1-score': 0.9474017743979721, 'support': 1592.0}, '1': {'precision': 0.6631944444444444, 'recall': 0.7346153846153847, 'f1-score': 0.6970802919708029, 'support': 260.0}, 'accuracy': 0.9103671706263499, 'macro avg': {'precision': 0.8095383986928104, 'recall': 0.8368428681870893, 'f1-score': 0.8222410331843875, 'support': 1852.0}, 'weighted avg': {'precision': 0.9147922577958468, 'recall': 0.9103671706263499, 'f1-score': 0.9122594496511772, 'support': 1852.0}} |
89
- | No log | 35.0 | 315 | 0.7642 | {'0': {'precision': 0.947401377582968, 'recall': 0.9503768844221105, 'f1-score': 0.9488867983693948, 'support': 1592.0}, '1': {'precision': 0.6901960784313725, 'recall': 0.676923076923077, 'f1-score': 0.683495145631068, 'support': 260.0}, 'accuracy': 0.911987041036717, 'macro avg': {'precision': 0.8187987280071702, 'recall': 0.8136499806725938, 'f1-score': 0.8161909720002314, 'support': 1852.0}, 'weighted avg': {'precision': 0.9112926422809082, 'recall': 0.911987041036717, 'f1-score': 0.911628790965526, 'support': 1852.0}} |
90
- | No log | 36.0 | 324 | 0.9518 | {'0': {'precision': 0.9418604651162791, 'recall': 0.9667085427135679, 'f1-score': 0.9541227526348419, 'support': 1592.0}, '1': {'precision': 0.7568807339449541, 'recall': 0.6346153846153846, 'f1-score': 0.6903765690376569, 'support': 260.0}, 'accuracy': 0.9200863930885529, 'macro avg': {'precision': 0.8493705995306167, 'recall': 0.8006619636644763, 'f1-score': 0.8222496608362494, 'support': 1852.0}, 'weighted avg': {'precision': 0.9158913883859635, 'recall': 0.9200863930885529, 'f1-score': 0.9170957506179584, 'support': 1852.0}} |
91
- | No log | 37.0 | 333 | 0.8619 | {'0': {'precision': 0.9507174048658765, 'recall': 0.957286432160804, 'f1-score': 0.9539906103286385, 'support': 1592.0}, '1': {'precision': 0.7269076305220884, 'recall': 0.6961538461538461, 'f1-score': 0.7111984282907662, 'support': 260.0}, 'accuracy': 0.9206263498920086, 'macro avg': {'precision': 0.8388125176939825, 'recall': 0.8267201391573251, 'f1-score': 0.8325945193097024, 'support': 1852.0}, 'weighted avg': {'precision': 0.9192970261783037, 'recall': 0.9206263498920086, 'f1-score': 0.919905314794164, 'support': 1852.0}} |
92
- | No log | 38.0 | 342 | 1.0214 | {'0': {'precision': 0.9417892156862745, 'recall': 0.9654522613065326, 'f1-score': 0.9534739454094293, 'support': 1592.0}, '1': {'precision': 0.75, 'recall': 0.6346153846153846, 'f1-score': 0.6875, 'support': 260.0}, 'accuracy': 0.9190064794816415, 'macro avg': {'precision': 0.8458946078431373, 'recall': 0.8000338229609586, 'f1-score': 0.8204869727047146, 'support': 1852.0}, 'weighted avg': {'precision': 0.9148641638080718, 'recall': 0.9190064794816415, 'f1-score': 0.9161341906543259, 'support': 1852.0}} |
93
- | No log | 39.0 | 351 | 0.9624 | {'0': {'precision': 0.9462631253860407, 'recall': 0.9623115577889447, 'f1-score': 0.9542198691996263, 'support': 1592.0}, '1': {'precision': 0.7424892703862661, 'recall': 0.6653846153846154, 'f1-score': 0.7018255578093306, 'support': 260.0}, 'accuracy': 0.9206263498920086, 'macro avg': {'precision': 0.8443761978861535, 'recall': 0.8138480865867801, 'f1-score': 0.8280227135044784, 'support': 1852.0}, 'weighted avg': {'precision': 0.9176555647489234, 'recall': 0.9206263498920086, 'f1-score': 0.9187865425465611, 'support': 1852.0}} |
94
- | No log | 40.0 | 360 | 0.9834 | {'0': {'precision': 0.9467492260061919, 'recall': 0.960427135678392, 'f1-score': 0.9535391331462426, 'support': 1592.0}, '1': {'precision': 0.7341772151898734, 'recall': 0.6692307692307692, 'f1-score': 0.7002012072434608, 'support': 260.0}, 'accuracy': 0.9195464362850972, 'macro avg': {'precision': 0.8404632205980327, 'recall': 0.8148289524545806, 'f1-score': 0.8268701701948518, 'support': 1852.0}, 'weighted avg': {'precision': 0.9169065031054129, 'recall': 0.9195464362850972, 'f1-score': 0.9179733336134547, 'support': 1852.0}} |
95
- | No log | 41.0 | 369 | 0.9953 | {'0': {'precision': 0.9483509645301804, 'recall': 0.957286432160804, 'f1-score': 0.9527977492966552, 'support': 1592.0}, '1': {'precision': 0.7224489795918367, 'recall': 0.6807692307692308, 'f1-score': 0.700990099009901, 'support': 260.0}, 'accuracy': 0.9184665226781857, 'macro avg': {'precision': 0.8353999720610086, 'recall': 0.8190278314650175, 'f1-score': 0.826893924153278, 'support': 1852.0}, 'weighted avg': {'precision': 0.9166368629729617, 'recall': 0.9184665226781857, 'f1-score': 0.9174467832736768, 'support': 1852.0}} |
96
- | No log | 42.0 | 378 | 0.9888 | {'0': {'precision': 0.9500624219725343, 'recall': 0.9560301507537688, 'f1-score': 0.9530369442705072, 'support': 1592.0}, '1': {'precision': 0.72, 'recall': 0.6923076923076923, 'f1-score': 0.7058823529411765, 'support': 260.0}, 'accuracy': 0.9190064794816415, 'macro avg': {'precision': 0.8350312109862672, 'recall': 0.8241689215307306, 'f1-score': 0.8294596486058419, 'support': 1852.0}, 'weighted avg': {'precision': 0.9177642417820058, 'recall': 0.9190064794816415, 'f1-score': 0.9183392154661736, 'support': 1852.0}} |
97
- | No log | 43.0 | 387 | 0.9745 | {'0': {'precision': 0.9505941213258287, 'recall': 0.9547738693467337, 'f1-score': 0.9526794108429959, 'support': 1592.0}, '1': {'precision': 0.7154150197628458, 'recall': 0.6961538461538461, 'f1-score': 0.7056530214424951, 'support': 260.0}, 'accuracy': 0.9184665226781857, 'macro avg': {'precision': 0.8330045705443372, 'recall': 0.82546385775029, 'f1-score': 0.8291662161427455, 'support': 1852.0}, 'weighted avg': {'precision': 0.9175776167867491, 'recall': 0.9184665226781857, 'f1-score': 0.9179996801496211, 'support': 1852.0}} |
98
- | No log | 44.0 | 396 | 0.9803 | {'0': {'precision': 0.950625, 'recall': 0.9554020100502513, 'f1-score': 0.9530075187969925, 'support': 1592.0}, '1': {'precision': 0.7182539682539683, 'recall': 0.6961538461538461, 'f1-score': 0.70703125, 'support': 260.0}, 'accuracy': 0.9190064794816415, 'macro avg': {'precision': 0.8344394841269842, 'recall': 0.8257779281020488, 'f1-score': 0.8300193843984962, 'support': 1852.0}, 'weighted avg': {'precision': 0.9180027169255032, 'recall': 0.9190064794816415, 'f1-score': 0.918475213242339, 'support': 1852.0}} |
99
- | No log | 45.0 | 405 | 0.9750 | {'0': {'precision': 0.9523809523809523, 'recall': 0.9547738693467337, 'f1-score': 0.9535759096612296, 'support': 1592.0}, '1': {'precision': 0.71875, 'recall': 0.7076923076923077, 'f1-score': 0.7131782945736435, 'support': 260.0}, 'accuracy': 0.9200863930885529, 'macro avg': {'precision': 0.8355654761904762, 'recall': 0.8312330885195207, 'f1-score': 0.8333771021174365, 'support': 1852.0}, 'weighted avg': {'precision': 0.9195817905996092, 'recall': 0.9200863930885529, 'f1-score': 0.9198267844329507, 'support': 1852.0}} |
100
- | No log | 46.0 | 414 | 1.0068 | {'0': {'precision': 0.949438202247191, 'recall': 0.9554020100502513, 'f1-score': 0.9524107701941139, 'support': 1592.0}, '1': {'precision': 0.716, 'recall': 0.6884615384615385, 'f1-score': 0.7019607843137254, 'support': 260.0}, 'accuracy': 0.91792656587473, 'macro avg': {'precision': 0.8327191011235955, 'recall': 0.8219317742558949, 'f1-score': 0.8271857772539197, 'support': 1852.0}, 'weighted avg': {'precision': 0.9166661004198317, 'recall': 0.91792656587473, 'f1-score': 0.9172504050057224, 'support': 1852.0}} |
101
- | No log | 47.0 | 423 | 1.0762 | {'0': {'precision': 0.9473031618102914, 'recall': 0.9597989949748744, 'f1-score': 0.9535101404056162, 'support': 1592.0}, '1': {'precision': 0.7322175732217573, 'recall': 0.6730769230769231, 'f1-score': 0.7014028056112225, 'support': 260.0}, 'accuracy': 0.9195464362850972, 'macro avg': {'precision': 0.8397603675160243, 'recall': 0.8164379590258988, 'f1-score': 0.8274564730084193, 'support': 1852.0}, 'weighted avg': {'precision': 0.9171075608205405, 'recall': 0.9195464362850972, 'f1-score': 0.9181171020435523, 'support': 1852.0}} |
102
- | No log | 48.0 | 432 | 1.0468 | {'0': {'precision': 0.947172156619018, 'recall': 0.957286432160804, 'f1-score': 0.9522024367385192, 'support': 1592.0}, '1': {'precision': 0.720164609053498, 'recall': 0.6730769230769231, 'f1-score': 0.6958250497017893, 'support': 260.0}, 'accuracy': 0.9173866090712743, 'macro avg': {'precision': 0.833668382836258, 'recall': 0.8151816776188636, 'f1-score': 0.8240137432201542, 'support': 1852.0}, 'weighted avg': {'precision': 0.9153028464856297, 'recall': 0.9173866090712743, 'f1-score': 0.9162099309990215, 'support': 1852.0}} |
103
- | No log | 49.0 | 441 | 1.0710 | {'0': {'precision': 0.9473031618102914, 'recall': 0.9597989949748744, 'f1-score': 0.9535101404056162, 'support': 1592.0}, '1': {'precision': 0.7322175732217573, 'recall': 0.6730769230769231, 'f1-score': 0.7014028056112225, 'support': 260.0}, 'accuracy': 0.9195464362850972, 'macro avg': {'precision': 0.8397603675160243, 'recall': 0.8164379590258988, 'f1-score': 0.8274564730084193, 'support': 1852.0}, 'weighted avg': {'precision': 0.9171075608205405, 'recall': 0.9195464362850972, 'f1-score': 0.9181171020435523, 'support': 1852.0}} |
104
- | No log | 50.0 | 450 | 1.0671 | {'0': {'precision': 0.9478584729981379, 'recall': 0.9591708542713567, 'f1-score': 0.9534811114580081, 'support': 1592.0}, '1': {'precision': 0.7302904564315352, 'recall': 0.676923076923077, 'f1-score': 0.7025948103792415, 'support': 260.0}, 'accuracy': 0.9195464362850972, 'macro avg': {'precision': 0.8390744647148365, 'recall': 0.8180469655972169, 'f1-score': 0.8280379609186248, 'support': 1852.0}, 'weighted avg': {'precision': 0.9173143670006667, 'recall': 0.9195464362850972, 'f1-score': 0.9182594925160646, 'support': 1852.0}} |
105
 
106
 
107
  ### Framework versions
 
16
 
17
  This model is a fine-tuned version of [answerdotai/ModernBERT-large](https://huggingface.co/answerdotai/ModernBERT-large) on an unknown dataset.
18
  It achieves the following results on the evaluation set:
19
+ - Loss: 0.2373
20
+ - Classification Report: {'0': {'precision': 0.9255643685173887, 'recall': 0.9528894472361809, 'f1-score': 0.9390281646549056, 'support': 1592.0}, '1': {'precision': 0.647887323943662, 'recall': 0.5307692307692308, 'f1-score': 0.5835095137420718, 'support': 260.0}, 'accuracy': 0.8936285097192225, 'macro avg': {'precision': 0.7867258462305253, 'recall': 0.7418293390027058, 'f1-score': 0.7612688391984888, 'support': 1852.0}, 'weighted avg': {'precision': 0.8865816300783126, 'recall': 0.8936285097192225, 'f1-score': 0.8891173389328015, 'support': 1852.0}}
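The averaged rows of this report follow from the per-class rows. A minimal Python sketch, assuming the report uses the usual scikit-learn `classification_report(..., output_dict=True)` layout; the variable names are illustrative, and only the F1 and support values from the line above are copied in:

```python
# Per-class F1 and support, copied from the evaluation report above.
per_class = {
    "0": {"f1": 0.9390281646549056, "support": 1592},
    "1": {"f1": 0.5835095137420718, "support": 260},
}

total = sum(c["support"] for c in per_class.values())

# Macro average: unweighted mean over classes.
macro_f1 = sum(c["f1"] for c in per_class.values()) / len(per_class)

# Weighted average: mean over classes, weighted by support.
weighted_f1 = sum(c["f1"] * c["support"] for c in per_class.values()) / total

print(f"macro F1    = {macro_f1:.4f}")     # 0.7613
print(f"weighted F1 = {weighted_f1:.4f}")  # 0.8891
```

The gap between the two figures reflects the class imbalance (1592 vs. 260 examples): the weighted average is dominated by class 0, so the macro average and the class-1 row say more about minority-class performance.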
21
 
22
  ## Model description
23
 
 
37
 
38
  The following hyperparameters were used during training:
39
  - learning_rate: 5e-06
40
+ - train_batch_size: 128
41
+ - eval_batch_size: 128
42
  - seed: 42
43
  - distributed_type: multi-GPU
44
+ - num_devices: 2
45
+ - total_train_batch_size: 256
46
+ - total_eval_batch_size: 256
47
  - optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
48
  - lr_scheduler_type: linear
49
+ - num_epochs: 3
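The batch-size entries above are related: under data-parallel training, the total batch size is the per-device size times the number of devices. A small sketch of that arithmetic, assuming no gradient accumulation (none is listed) and that the last partial batch of an epoch is kept; both helper names are hypothetical:

```python
def total_batch_size(per_device: int, num_devices: int) -> int:
    """Examples consumed per optimizer step under data parallelism."""
    return per_device * num_devices


def dataset_size_range(steps_per_epoch: int, total_bs: int) -> tuple[int, int]:
    """Smallest and largest dataset sizes that yield this many steps per epoch,
    assuming steps = ceil(n / total_bs)."""
    return (steps_per_epoch - 1) * total_bs + 1, steps_per_epoch * total_bs


new_bs = total_batch_size(128, 2)  # this revision
old_bs = total_batch_size(256, 3)  # previous revision
print(new_bs, old_bs)  # 256 768

# Steps per epoch are taken from the training logs: 25 (new) and 9 (old).
print(dataset_size_range(25, new_bs))  # (6145, 6400)
print(dataset_size_range(9, old_bs))   # (6145, 6912)
```

Under these assumptions both logs are consistent with a training set of roughly 6,145 to 6,400 examples. This is an inference from the step counts, not a figure stated in the model card.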
50
 
51
  ### Training results
52
 
53
+ | Training Loss | Epoch | Step | Validation Loss | Classification Report |
54
+ |:-------------:|:-----:|:----:|:---------------:|:-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------:|
55
+ | No log | 1.0 | 25 | 0.2794 | {'0': {'precision': 0.8953152111046848, 'recall': 0.9723618090452262, 'f1-score': 0.9322493224932249, 'support': 1592.0}, '1': {'precision': 0.6422764227642277, 'recall': 0.3038461538461538, 'f1-score': 0.412532637075718, 'support': 260.0}, 'accuracy': 0.8785097192224622, 'macro avg': {'precision': 0.7687958169344562, 'recall': 0.63810398144569, 'f1-score': 0.6723909797844715, 'support': 1852.0}, 'weighted avg': {'precision': 0.8597914071260029, 'recall': 0.8785097192224622, 'f1-score': 0.8592869368514583, 'support': 1852.0}} |
56
+ | No log | 2.0 | 50 | 0.2479 | {'0': {'precision': 0.921580547112462, 'recall': 0.9522613065326633, 'f1-score': 0.9366697559468644, 'support': 1592.0}, '1': {'precision': 0.6328502415458938, 'recall': 0.5038461538461538, 'f1-score': 0.5610278372591007, 'support': 260.0}, 'accuracy': 0.8893088552915767, 'macro avg': {'precision': 0.7772153943291779, 'recall': 0.7280537301894086, 'f1-score': 0.7488487966029825, 'support': 1852.0}, 'weighted avg': {'precision': 0.8810460549702872, 'recall': 0.8893088552915767, 'f1-score': 0.8839338494356234, 'support': 1852.0}} |
57
+ | No log | 3.0 | 75 | 0.2373 | {'0': {'precision': 0.9255643685173887, 'recall': 0.9528894472361809, 'f1-score': 0.9390281646549056, 'support': 1592.0}, '1': {'precision': 0.647887323943662, 'recall': 0.5307692307692308, 'f1-score': 0.5835095137420718, 'support': 260.0}, 'accuracy': 0.8936285097192225, 'macro avg': {'precision': 0.7867258462305253, 'recall': 0.7418293390027058, 'f1-score': 0.7612688391984888, 'support': 1852.0}, 'weighted avg': {'precision': 0.8865816300783126, 'recall': 0.8936285097192225, 'f1-score': 0.8891173389328015, 'support': 1852.0}} |
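The validation-loss columns of the two logs can be compared directly. A minimal sketch, with the losses transcribed from the removed 50-epoch table and the added 3-epoch table; the helper name `best_epoch` is illustrative:

```python
def best_epoch(val_losses):
    """Return (1-based epoch, loss) for the lowest validation loss."""
    idx = min(range(len(val_losses)), key=val_losses.__getitem__)
    return idx + 1, val_losses[idx]


# Validation loss per epoch from the removed (50-epoch) table.
old_run = [
    0.4658, 0.4287, 0.3809, 0.3512, 0.3344, 0.3259, 0.3234, 0.3283, 0.4118, 0.3219,
    0.3105, 0.3133, 0.3423, 0.5012, 0.3743, 0.3255, 0.3262, 0.4057, 0.3325, 0.3462,
    0.3389, 0.5140, 0.4316, 0.4008, 0.5363, 0.4987, 0.5779, 0.7377, 0.5538, 0.7011,
    0.5648, 0.7122, 0.6401, 0.6557, 0.7642, 0.9518, 0.8619, 1.0214, 0.9624, 0.9834,
    0.9953, 0.9888, 0.9745, 0.9803, 0.9750, 1.0068, 1.0762, 1.0468, 1.0710, 1.0671,
]
# Validation loss per epoch from the added (3-epoch) table.
new_run = [0.2794, 0.2479, 0.2373]

print(best_epoch(old_run))  # (11, 0.3105)
print(best_epoch(new_run))  # (3, 0.2373)
```

In the removed log the loss bottoms out at epoch 11 and climbs to 1.0671 by epoch 50, while the class-1 F1 stays near 0.70 in the later epochs. The 3-epoch run ends at a lower loss (0.2373) but a lower class-1 F1 (about 0.58), so which checkpoint is preferable depends on the metric that matters for the task.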
58
 
59
 
60
  ### Framework versions
model.safetensors CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:0163c56dfdbedf306d475bd43378e4fafe2e935de32b0292e973bf3731672682
3
  size 1583351632
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:756bd6fd8d333d77839efe73798150e98ab25c2a64256d0aec391ead6f040ee0
3
  size 1583351632