End of training
- README.md +68 -0
- model.safetensors +1 -1
- vocab.json +99 -0
README.md
ADDED
@@ -0,0 +1,68 @@
+---
+library_name: transformers
+tags:
+- generated_from_trainer
+model-index:
+- name: char-text-reversal
+  results: []
+---
+
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+
+# char-text-reversal
+
+This model is a fine-tuned version of [](https://huggingface.co/) on an unknown dataset.
+It achieves the following results on the evaluation set:
+- Loss: 4.5670
+- Char Accuracy: 0.0018
+- Sequence Accuracy: 0.0
+- Edit Distance: 104.6
+
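The card reports Char Accuracy, Sequence Accuracy, and Edit Distance without defining them. The sketch below shows one plausible reading in plain Python. The definitions (position-wise character matches over total target length, exact-match rate, mean Levenshtein distance) and the names `edit_distance` and `evaluate` are assumptions for illustration, not taken from the training script.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via the standard row-by-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def evaluate(preds: list[str], targets: list[str]) -> dict[str, float]:
    """Assumed metric definitions; expects a non-empty list of non-empty targets."""
    total_chars = sum(len(t) for t in targets)
    matched = sum(p == t
                  for pred, tgt in zip(preds, targets)
                  for p, t in zip(pred, tgt))
    exact = sum(pred == tgt for pred, tgt in zip(preds, targets))
    dist = sum(edit_distance(pred, tgt) for pred, tgt in zip(preds, targets))
    n = len(targets)
    return {
        "char_accuracy": matched / total_chars,
        "sequence_accuracy": exact / n,
        "edit_distance": dist / n,
    }
```

Under these assumed definitions, a Sequence Accuracy of 0.0 means no prediction matched its target exactly, and an Edit Distance of 104.6 is an average number of single-character edits per example.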
+## Model description
+
+More information needed
+
+## Intended uses & limitations
+
+More information needed
+
+## Training and evaluation data
+
+More information needed
+
+## Training procedure
+
+### Training hyperparameters
+
+The following hyperparameters were used during training:
+- learning_rate: 0.0001
+- train_batch_size: 256
+- eval_batch_size: 256
+- seed: 42
+- optimizer: Use adamw_torch_fused with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
+- lr_scheduler_type: linear
+- num_epochs: 10
+
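The hyperparameters list a linear scheduler with a base learning rate of 0.0001, and the results table in this README shows 10 optimizer steps in total. The plain-Python sketch below shows what a linear decay looks like over that run. The function name `linear_lr` is hypothetical, and `warmup_steps=0` is an assumption, since the card lists no warmup.

```python
def linear_lr(step: int,
              base_lr: float = 1e-4,
              total_steps: int = 10,
              warmup_steps: int = 0) -> float:
    """Linear warmup (if any) followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Learning rate before each of the 10 steps, plus the final value of 0.0
schedule = [linear_lr(s) for s in range(11)]
```

With only 10 steps, the rate falls by a tenth of its starting value after every update, so the later epochs train at a small fraction of 0.0001.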
+### Training results
+
+| Training Loss | Epoch | Step | Validation Loss | Char Accuracy | Sequence Accuracy | Edit Distance |
+|:-------------:|:-----:|:----:|:---------------:|:-------------:|:-----------------:|:-------------:|
+| 4.5902        | 1.0   | 1    | 4.5933          | 0.0009        | 0.0               | 107.2         |
+| 4.5797        | 2.0   | 2    | 4.5881          | 0.0018        | 0.0               | 106.8         |
+| 4.5737        | 3.0   | 3    | 4.5834          | 0.0018        | 0.0               | 106.5         |
+| 4.5672        | 4.0   | 4    | 4.5793          | 0.0018        | 0.0               | 105.9         |
+| 4.5625        | 5.0   | 5    | 4.5758          | 0.0018        | 0.0               | 105.5         |
+| 4.5624        | 6.0   | 6    | 4.5729          | 0.0018        | 0.0               | 105.3         |
+| 4.554         | 7.0   | 7    | 4.5706          | 0.0018        | 0.0               | 105.0         |
+| 4.554         | 8.0   | 8    | 4.5688          | 0.0018        | 0.0               | 104.9         |
+| 4.5497        | 9.0   | 9    | 4.5676          | 0.0018        | 0.0               | 104.8         |
+| 4.5493        | 10.0  | 10   | 4.5670          | 0.0018        | 0.0               | 104.6         |
+
+
+### Framework versions
+
+- Transformers 4.55.4
+- Pytorch 2.8.0+cu128
+- Datasets 4.0.0
+- Tokenizers 0.21.4
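One way to read the validation losses above is to compare them with the cross-entropy of a uniform guess. The sketch below does this in Python under two assumptions: the reported loss is a mean per-token cross-entropy in nats, and the output vocabulary is exactly the 97 entries of the `vocab.json` added in this commit. Neither is confirmed by the README.

```python
import math

vocab_size = 97          # entries in vocab.json (95 printable ASCII + <PAD> + <EOS>)
final_eval_loss = 4.5670  # validation loss at step 10, from the table above

# Cross-entropy of a model that assigns equal probability to every token
uniform_loss = math.log(vocab_size)
gap = uniform_loss - final_eval_loss

print(f"uniform baseline: {uniform_loss:.4f}")
print(f"final eval loss:  {final_eval_loss:.4f}")
print(f"gap:              {gap:.4f}")
```

If those assumptions hold, the final loss sits within about 0.01 nats of the uniform baseline, which would be consistent with the near-zero accuracies after only 10 optimizer steps.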
model.safetensors
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:55babd4bd75aefc2caaf16336f3966817c2ddb711196e50721700904a03dfcc9
 size 1121284
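The `model.safetensors` entry is a Git LFS pointer: three `key value` lines giving the spec version, the SHA-256 of the real file, and its size in bytes. The Python sketch below shows how downloaded weights could be checked against such a pointer. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical.

```python
import hashlib


def parse_lfs_pointer(text: str) -> dict:
    """Split a Git LFS pointer file into its version, hash, and size fields."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(data: bytes, pointer: dict) -> bool:
    """True when the byte length and SHA-256 digest both match the pointer."""
    return (len(data) == pointer["size"]
            and hashlib.sha256(data).hexdigest() == pointer["digest"])


pointer = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:55babd4bd75aefc2caaf16336f3966817c2ddb711196e50721700904a03dfcc9\n"
    "size 1121284\n"
)
```

To check a local copy, read it with `pathlib.Path("model.safetensors").read_bytes()` and pass the bytes to `matches_pointer`. The size is unchanged between the two versions (1121284 bytes), which fits a commit that updates weight values but not the architecture.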
vocab.json
ADDED
@@ -0,0 +1,99 @@
+{
+" ": 0,
+"!": 1,
+"\"": 2,
+"#": 3,
+"$": 4,
+"%": 5,
+"&": 6,
+"'": 7,
+"(": 8,
+")": 9,
+"*": 10,
+"+": 11,
+",": 12,
+"-": 13,
+".": 14,
+"/": 15,
+"0": 16,
+"1": 17,
+"2": 18,
+"3": 19,
+"4": 20,
+"5": 21,
+"6": 22,
+"7": 23,
+"8": 24,
+"9": 25,
+":": 26,
+";": 27,
+"<": 28,
+"=": 29,
+">": 30,
+"?": 31,
+"@": 32,
+"A": 33,
+"B": 34,
+"C": 35,
+"D": 36,
+"E": 37,
+"F": 38,
+"G": 39,
+"H": 40,
+"I": 41,
+"J": 42,
+"K": 43,
+"L": 44,
+"M": 45,
+"N": 46,
+"O": 47,
+"P": 48,
+"Q": 49,
+"R": 50,
+"S": 51,
+"T": 52,
+"U": 53,
+"V": 54,
+"W": 55,
+"X": 56,
+"Y": 57,
+"Z": 58,
+"[": 59,
+"\\": 60,
+"]": 61,
+"^": 62,
+"_": 63,
+"`": 64,
+"a": 65,
+"b": 66,
+"c": 67,
+"d": 68,
+"e": 69,
+"f": 70,
+"g": 71,
+"h": 72,
+"i": 73,
+"j": 74,
+"k": 75,
+"l": 76,
+"m": 77,
+"n": 78,
+"o": 79,
+"p": 80,
+"q": 81,
+"r": 82,
+"s": 83,
+"t": 84,
+"u": 85,
+"v": 86,
+"w": 87,
+"x": 88,
+"y": 89,
+"z": 90,
+"{": 91,
+"|": 92,
+"}": 93,
+"~": 94,
+"<PAD>": 95,
+"<EOS>": 96
+}