Instructions to use Sunbird/sunflower_language_classification with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use Sunbird/sunflower_language_classification with Transformers:
    # Use a pipeline as a high-level helper
    from transformers import pipeline

    pipe = pipeline("text-classification", model="Sunbird/sunflower_language_classification")

    # Load model directly
    from transformers import AutoTokenizer, AutoModelForSequenceClassification

    tokenizer = AutoTokenizer.from_pretrained("Sunbird/sunflower_language_classification")
    model = AutoModelForSequenceClassification.from_pretrained("Sunbird/sunflower_language_classification")

- Notebooks
- Google Colab
- Kaggle
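The pipeline snippet above stops once `pipe` is constructed. A minimal sketch of reading a prediction follows, assuming the usual text-classification pipeline output (a list of dicts with `label` and `score` keys, with labels drawn from `id2label` in config.json). The helper names `top_language` and `classify` are hypothetical, and the `transformers` import is deferred so the pure-Python helper can be tried without downloading the model.

```python
def top_language(predictions):
    """Return the (label, score) pair with the highest score."""
    best = max(predictions, key=lambda p: p["score"])
    return best["label"], best["score"]


def classify(text, model_id="Sunbird/sunflower_language_classification"):
    """Classify one string; downloads the model on first use."""
    from transformers import pipeline  # deferred: heavy import, needs network

    pipe = pipeline("text-classification", model=model_id)
    # top_k=None asks the pipeline for a score per label (assumed behaviour).
    return top_language(pipe(text, top_k=None))


if __name__ == "__main__":
    # Illustrative scores only, not real model output.
    fake = [{"label": "lug", "score": 0.7}, {"label": "nyn", "score": 0.2}]
    print(top_language(fake))
```

Calling `classify("...")` with a real sentence should return one of the 68 language codes listed in config.json, together with its score.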
End of training
- README.md +127 -0
- config.json +174 -0
- model.safetensors +3 -0
- tokenizer.json +0 -0
- tokenizer_config.json +114 -0
- training_args.bin +3 -0
README.md
ADDED
@@ -0,0 +1,127 @@
+---
+library_name: transformers
+license: apache-2.0
+base_model: google/t5-efficient-tiny
+tags:
+- generated_from_trainer
+metrics:
+- accuracy
+- precision
+- recall
+- f1
+model-index:
+- name: sunflower_language_classification_v1
+  results: []
+---
+
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+
+# sunflower_language_classification_v1
+
+This model is a fine-tuned version of [google/t5-efficient-tiny](https://huggingface.co/google/t5-efficient-tiny) on the None dataset.
+It achieves the following results on the evaluation set:
+- Loss: 0.7212
+- Accuracy: 0.8297
+- Precision: 0.8471
+- Recall: 0.8297
+- F1: 0.8191
+
+## Model description
+
+More information needed
+
+## Intended uses & limitations
+
+More information needed
+
+## Training and evaluation data
+
+More information needed
+
+## Training procedure
+
+### Training hyperparameters
+
+The following hyperparameters were used during training:
+- learning_rate: 0.001
+- train_batch_size: 64
+- eval_batch_size: 64
+- seed: 42
+- optimizer: Use OptimizerNames.ADAMW_TORCH_FUSED with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
+- lr_scheduler_type: linear
+- lr_scheduler_warmup_steps: 10
+- training_steps: 30000
+
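The hyperparameters above combine a peak learning rate of 0.001, 10 warmup steps, a linear scheduler, and 30000 training steps. As a rough sketch of what that implies, the function below reproduces the usual linear-warmup-then-linear-decay rule; the function name `linear_lr` is hypothetical, and the exact rule used in this run is assumed rather than confirmed by the card.

```python
def linear_lr(step, base_lr=1e-3, warmup_steps=10, total_steps=30_000):
    """Learning rate at an optimizer step: linear warmup, then linear decay to 0."""
    if step < warmup_steps:
        # Ramp from 0 up to base_lr over the warmup steps.
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr (at the end of warmup) to 0 (at total_steps).
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    for step in (0, 5, 10, 15_005, 30_000):
        print(step, linear_lr(step))
```

Under these assumptions the rate peaks at step 10 and has roughly halved by the midpoint of training.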
+### Training results
+
+| Training Loss | Epoch | Step | Validation Loss | Accuracy | Precision | Recall | F1 |
+|:-------------:|:------:|:-----:|:---------------:|:--------:|:---------:|:------:|:------:|
+| 2.3995 | 0.0167 | 500 | 2.0015 | 0.5145 | 0.4412 | 0.5145 | 0.4517 |
+| 1.3282 | 0.0334 | 1000 | 1.6467 | 0.5688 | 0.4908 | 0.5688 | 0.5080 |
+| 1.1086 | 0.0502 | 1500 | 1.5051 | 0.6304 | 0.5784 | 0.6304 | 0.5766 |
+| 0.9882 | 0.0669 | 2000 | 1.4518 | 0.6268 | 0.6374 | 0.6268 | 0.5891 |
+| 0.9187 | 0.0836 | 2500 | 1.3470 | 0.6522 | 0.6245 | 0.6522 | 0.6093 |
+| 0.8546 | 0.1003 | 3000 | 1.3747 | 0.6159 | 0.5871 | 0.6159 | 0.5760 |
+| 0.8214 | 0.1170 | 3500 | 1.2708 | 0.6703 | 0.6316 | 0.6703 | 0.6323 |
+| 0.7843 | 0.1338 | 4000 | 1.1659 | 0.6848 | 0.6639 | 0.6848 | 0.6461 |
+| 0.7470 | 0.1505 | 4500 | 1.1969 | 0.6848 | 0.6534 | 0.6848 | 0.6491 |
+| 0.7299 | 0.1672 | 5000 | 1.0592 | 0.7101 | 0.7030 | 0.7101 | 0.6748 |
+| 0.7041 | 0.1839 | 5500 | 1.0536 | 0.6848 | 0.6728 | 0.6848 | 0.6534 |
+| 0.6755 | 0.2006 | 6000 | 1.0265 | 0.7138 | 0.7298 | 0.7138 | 0.6852 |
+| 0.6683 | 0.2174 | 6500 | 1.0049 | 0.7428 | 0.7403 | 0.7428 | 0.7089 |
+| 0.6573 | 0.2341 | 7000 | 1.0702 | 0.7029 | 0.7052 | 0.7029 | 0.6764 |
+| 0.6372 | 0.2508 | 7500 | 1.0260 | 0.7210 | 0.7143 | 0.7210 | 0.6998 |
+| 0.6173 | 0.2675 | 8000 | 0.9654 | 0.7428 | 0.7492 | 0.7428 | 0.7141 |
+| 0.6009 | 0.2842 | 8500 | 1.0185 | 0.7464 | 0.7504 | 0.7464 | 0.7167 |
+| 0.5924 | 0.3010 | 9000 | 1.0028 | 0.7283 | 0.7652 | 0.7283 | 0.7052 |
+| 0.5916 | 0.3177 | 9500 | 0.9581 | 0.7174 | 0.7217 | 0.7174 | 0.6893 |
+| 0.5806 | 0.3344 | 10000 | 1.0011 | 0.7355 | 0.7618 | 0.7355 | 0.7149 |
+| 0.5672 | 0.3511 | 10500 | 0.8978 | 0.7572 | 0.7429 | 0.7572 | 0.7307 |
+| 0.5580 | 0.3678 | 11000 | 0.9525 | 0.7210 | 0.7308 | 0.7210 | 0.7013 |
+| 0.5520 | 0.3846 | 11500 | 0.8647 | 0.7645 | 0.7695 | 0.7645 | 0.7391 |
+| 0.5552 | 0.4013 | 12000 | 0.8977 | 0.7536 | 0.7698 | 0.7536 | 0.7358 |
+| 0.5341 | 0.4180 | 12500 | 0.8526 | 0.7536 | 0.7625 | 0.7536 | 0.7305 |
+| 0.5284 | 0.4347 | 13000 | 0.8496 | 0.7464 | 0.7310 | 0.7464 | 0.7166 |
+| 0.5322 | 0.4514 | 13500 | 0.7672 | 0.8007 | 0.8006 | 0.8007 | 0.7827 |
+| 0.5229 | 0.4681 | 14000 | 0.8253 | 0.7754 | 0.7698 | 0.7754 | 0.7515 |
+| 0.5007 | 0.4849 | 14500 | 0.8496 | 0.7826 | 0.7649 | 0.7826 | 0.7547 |
+| 0.5109 | 0.5016 | 15000 | 0.7700 | 0.7754 | 0.7767 | 0.7754 | 0.7518 |
+| 0.4989 | 0.5183 | 15500 | 0.8338 | 0.7645 | 0.7741 | 0.7645 | 0.7419 |
+| 0.4991 | 0.5350 | 16000 | 0.7927 | 0.7754 | 0.7928 | 0.7754 | 0.7625 |
+| 0.4977 | 0.5517 | 16500 | 0.7859 | 0.7790 | 0.7670 | 0.7790 | 0.7551 |
+| 0.4854 | 0.5685 | 17000 | 0.7915 | 0.7862 | 0.7907 | 0.7862 | 0.7630 |
+| 0.4826 | 0.5852 | 17500 | 0.7628 | 0.8043 | 0.7964 | 0.8043 | 0.7846 |
+| 0.4765 | 0.6019 | 18000 | 0.7632 | 0.7971 | 0.8008 | 0.7971 | 0.7791 |
+| 0.4641 | 0.6186 | 18500 | 0.7722 | 0.7935 | 0.7660 | 0.7935 | 0.7670 |
+| 0.4783 | 0.6353 | 19000 | 0.7046 | 0.7899 | 0.8111 | 0.7899 | 0.7773 |
+| 0.4745 | 0.6521 | 19500 | 0.7342 | 0.7899 | 0.8044 | 0.7899 | 0.7726 |
+| 0.4555 | 0.6688 | 20000 | 0.7116 | 0.7862 | 0.7853 | 0.7862 | 0.7662 |
+| 0.4530 | 0.6855 | 20500 | 0.7385 | 0.7754 | 0.7658 | 0.7754 | 0.7557 |
+| 0.4565 | 0.7022 | 21000 | 0.7651 | 0.7899 | 0.8132 | 0.7899 | 0.7770 |
+| 0.4555 | 0.7189 | 21500 | 0.7902 | 0.7681 | 0.7812 | 0.7681 | 0.7569 |
+| 0.4485 | 0.7357 | 22000 | 0.7613 | 0.7862 | 0.7962 | 0.7862 | 0.7686 |
+| 0.4518 | 0.7524 | 22500 | 0.7544 | 0.7862 | 0.7944 | 0.7862 | 0.7676 |
+| 0.4508 | 0.7691 | 23000 | 0.7296 | 0.8043 | 0.8110 | 0.8043 | 0.7907 |
+| 0.4418 | 0.7858 | 23500 | 0.7293 | 0.8261 | 0.8527 | 0.8261 | 0.8137 |
+| 0.4365 | 0.8025 | 24000 | 0.7370 | 0.8043 | 0.8217 | 0.8043 | 0.7928 |
+| 0.4353 | 0.8193 | 24500 | 0.7100 | 0.8188 | 0.8274 | 0.8188 | 0.8049 |
+| 0.4240 | 0.8360 | 25000 | 0.7273 | 0.7862 | 0.7857 | 0.7862 | 0.7697 |
+| 0.4205 | 0.8527 | 25500 | 0.7297 | 0.8225 | 0.8351 | 0.8225 | 0.8059 |
+| 0.4316 | 0.8694 | 26000 | 0.7204 | 0.8116 | 0.8066 | 0.8116 | 0.7911 |
+| 0.4176 | 0.8861 | 26500 | 0.7340 | 0.8080 | 0.8184 | 0.8080 | 0.7922 |
+| 0.4240 | 0.9029 | 27000 | 0.7298 | 0.8116 | 0.8223 | 0.8116 | 0.7964 |
+| 0.4149 | 0.9196 | 27500 | 0.7410 | 0.8188 | 0.8185 | 0.8188 | 0.8023 |
+| 0.4159 | 0.9363 | 28000 | 0.7303 | 0.8152 | 0.8388 | 0.8152 | 0.8069 |
+| 0.4068 | 0.9530 | 28500 | 0.7220 | 0.8043 | 0.8209 | 0.8043 | 0.7955 |
+| 0.4135 | 0.9697 | 29000 | 0.7313 | 0.8188 | 0.8238 | 0.8188 | 0.8055 |
+| 0.4130 | 0.9865 | 29500 | 0.7221 | 0.8225 | 0.8320 | 0.8225 | 0.8095 |
+| 0.4213 | 1.0032 | 30000 | 0.7212 | 0.8297 | 0.8471 | 0.8297 | 0.8191 |
+
+
+### Framework versions
+
+- Transformers 5.8.0
+- Pytorch 2.11.0+cu130
+- Datasets 4.8.5
+- Tokenizers 0.22.2
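In the training results table of this README, the Recall column equals the Accuracy column in every row. The card does not say how the per-class metrics were averaged, but this pattern is what support-weighted averaging produces, because weighted recall reduces algebraically to accuracy. The sketch below illustrates that identity on made-up labels; the function name `weighted_metrics` and the toy data are hypothetical and are not taken from the evaluation set.

```python
from collections import Counter


def weighted_metrics(y_true, y_pred):
    """Accuracy plus support-weighted precision, recall and F1."""
    n = len(y_true)
    support = Counter(y_true)      # true examples per class
    predicted = Counter(y_pred)    # predictions per class
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    precision = recall = f1 = 0.0
    for label, count in support.items():
        p = correct[label] / predicted[label] if predicted[label] else 0.0
        r = correct[label] / count
        f = 2 * p * r / (p + r) if p + r else 0.0
        weight = count / n         # class weight = share of true examples
        precision += weight * p
        recall += weight * r
        f1 += weight * f

    accuracy = sum(correct.values()) / n
    return accuracy, precision, recall, f1


if __name__ == "__main__":
    # Toy labels for illustration only.
    y_true = ["lug", "lug", "lug", "nyn", "nyn", "eng"]
    y_pred = ["lug", "lug", "nyn", "nyn", "lug", "eng"]
    print(weighted_metrics(y_true, y_pred))
```

Weighted F1 can fall below both weighted precision and weighted recall, which matches the final row (F1 0.8191 against 0.8471 and 0.8297).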
config.json
ADDED
@@ -0,0 +1,174 @@
+{
+  "architectures": [
+    "T5ForSequenceClassification"
+  ],
+  "classifier_dropout": 0.0,
+  "d_ff": 1024,
+  "d_kv": 64,
+  "d_model": 256,
+  "decoder_start_token_id": 0,
+  "dense_act_fn": "relu",
+  "dropout_rate": 0.1,
+  "dtype": "float32",
+  "eos_token_id": 1,
+  "feed_forward_proj": "relu",
+  "id2label": {
+    "0": "ach",
+    "1": "adh",
+    "2": "afr",
+    "3": "aka",
+    "4": "alz",
+    "5": "amh",
+    "6": "bam",
+    "7": "bem",
+    "8": "ber",
+    "9": "bfa",
+    "10": "cgg",
+    "11": "dag",
+    "12": "dga",
+    "13": "din",
+    "14": "eng",
+    "15": "ewe",
+    "16": "fra",
+    "17": "ful",
+    "18": "gwr",
+    "19": "hau",
+    "20": "ibo",
+    "21": "ikx",
+    "22": "kab",
+    "23": "kau",
+    "24": "kdi",
+    "25": "kdj",
+    "26": "keo",
+    "27": "kik",
+    "28": "kin",
+    "29": "koo",
+    "30": "kpz",
+    "31": "laj",
+    "32": "led",
+    "33": "lgg",
+    "34": "lin",
+    "35": "lsm",
+    "36": "luc",
+    "37": "lug",
+    "38": "luo",
+    "39": "luy",
+    "40": "mhi",
+    "41": "mlg",
+    "42": "myx",
+    "43": "nbl",
+    "44": "nuj",
+    "45": "nya",
+    "46": "nyn",
+    "47": "nyo",
+    "48": "orm",
+    "49": "pcm",
+    "50": "pok",
+    "51": "rub",
+    "52": "ruc",
+    "53": "run",
+    "54": "rwm",
+    "55": "sna",
+    "56": "som",
+    "57": "sot",
+    "58": "swa",
+    "59": "teo",
+    "60": "tlj",
+    "61": "tsn",
+    "62": "ttj",
+    "63": "wol",
+    "64": "xho",
+    "65": "xog",
+    "66": "yor",
+    "67": "zul"
+  },
+  "initializer_factor": 1.0,
+  "is_decoder": false,
+  "is_encoder_decoder": true,
+  "is_gated_act": false,
+  "label2id": {
+    "ach": 0,
+    "adh": 1,
+    "afr": 2,
+    "aka": 3,
+    "alz": 4,
+    "amh": 5,
+    "bam": 6,
+    "bem": 7,
+    "ber": 8,
+    "bfa": 9,
+    "cgg": 10,
+    "dag": 11,
+    "dga": 12,
+    "din": 13,
+    "eng": 14,
+    "ewe": 15,
+    "fra": 16,
+    "ful": 17,
+    "gwr": 18,
+    "hau": 19,
+    "ibo": 20,
+    "ikx": 21,
+    "kab": 22,
+    "kau": 23,
+    "kdi": 24,
+    "kdj": 25,
+    "keo": 26,
+    "kik": 27,
+    "kin": 28,
+    "koo": 29,
+    "kpz": 30,
+    "laj": 31,
+    "led": 32,
+    "lgg": 33,
+    "lin": 34,
+    "lsm": 35,
+    "luc": 36,
+    "lug": 37,
+    "luo": 38,
+    "luy": 39,
+    "mhi": 40,
+    "mlg": 41,
+    "myx": 42,
+    "nbl": 43,
+    "nuj": 44,
+    "nya": 45,
+    "nyn": 46,
+    "nyo": 47,
+    "orm": 48,
+    "pcm": 49,
+    "pok": 50,
+    "rub": 51,
+    "ruc": 52,
+    "run": 53,
+    "rwm": 54,
+    "sna": 55,
+    "som": 56,
+    "sot": 57,
+    "swa": 58,
+    "teo": 59,
+    "tlj": 60,
+    "tsn": 61,
+    "ttj": 62,
+    "wol": 63,
+    "xho": 64,
+    "xog": 65,
+    "yor": 66,
+    "zul": 67
+  },
+  "layer_norm_epsilon": 1e-06,
+  "model_type": "t5",
+  "n_positions": 512,
+  "num_decoder_layers": 4,
+  "num_heads": 4,
+  "num_layers": 4,
+  "pad_token_id": 0,
+  "problem_type": "single_label_classification",
+  "relative_attention_max_distance": 128,
+  "relative_attention_num_buckets": 32,
+  "scale_decoder_outputs": true,
+  "tie_word_embeddings": true,
+  "transformers_version": "5.8.0",
+  "use_cache": false,
+  "vocab_size": 32128
+}
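The `id2label` map in config.json stores class indices as JSON strings, while a classifier's output is a vector of logits indexed by integer. A minimal sketch of turning logits into a language code follows; it uses only the first four entries of the map and a made-up logit vector, and the helper names `softmax` and `decode` are hypothetical.

```python
import json
import math

# First four id2label entries copied from config.json; the full map has 68.
CONFIG_FRAGMENT = '{"id2label": {"0": "ach", "1": "adh", "2": "afr", "3": "aka"}}'

# JSON object keys are strings, so convert them to integer class indices.
ID2LABEL = {int(k): v for k, v in json.loads(CONFIG_FRAGMENT)["id2label"].items()}


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode(logits, id2label):
    """Return (label, probability) for the highest-scoring class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return id2label[best], probs[best]


if __name__ == "__main__":
    # Made-up logits for illustration only.
    print(decode([0.1, 2.0, -1.0, 0.3], ID2LABEL))
```

With the real model the logit vector would have 68 entries, one per label in the full map.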
model.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:2f2e5888ff7741e36c1f60cf0894267db00e3f69abbc7f540bc45918531dd55e
+size 62627616
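The LFS pointer gives only the checkpoint size, but together with config.json it allows a rough sanity check. The sketch below counts parameters from the config values, assuming the standard T5 layout (bias-free linear projections, weight-only layer norms, one relative attention bias table per stack, shared input embeddings) and a two-layer classification head with biases. These layout details are assumptions about the architecture, not something this commit states, and the function name `t5_classifier_params` is hypothetical.

```python
# Values copied from config.json in this commit; num_labels is the id2label size.
CFG = dict(vocab_size=32128, d_model=256, d_kv=64, d_ff=1024, num_heads=4,
           num_layers=4, num_decoder_layers=4,
           relative_attention_num_buckets=32, num_labels=68)
SAFETENSORS_BYTES = 62_627_616  # "size" field of the LFS pointer
BYTES_PER_PARAM = 4             # "dtype": "float32"


def t5_classifier_params(c):
    """Estimate the parameter count under the assumed T5 layout."""
    d = c["d_model"]
    inner = c["num_heads"] * c["d_kv"]
    attention = 4 * d * inner            # q, k, v, o projections, no biases
    feed_forward = 2 * d * c["d_ff"]     # wi and wo (non-gated ReLU)
    rel_bias = c["relative_attention_num_buckets"] * c["num_heads"]

    embedding = c["vocab_size"] * d      # shared by encoder and decoder
    # Encoder block: self-attention + feed-forward, each with one layer norm.
    encoder = c["num_layers"] * (attention + feed_forward + 2 * d) + rel_bias + d
    # Decoder block adds cross-attention and a third layer norm.
    decoder = (c["num_decoder_layers"] * (2 * attention + feed_forward + 3 * d)
               + rel_bias + d)
    # Head: dense d -> d, then projection d -> num_labels, both with biases.
    head = (d * d + d) + (d * c["num_labels"] + c["num_labels"])
    return embedding + encoder + decoder + head


if __name__ == "__main__":
    params = t5_classifier_params(CFG)
    print(params, SAFETENSORS_BYTES - params * BYTES_PER_PARAM)
```

Under these assumptions the estimate is about 15.65 million parameters, and the few kilobytes left over are plausibly the safetensors header.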
tokenizer.json
ADDED
The diff for this file is too large to render.
tokenizer_config.json
ADDED
@@ -0,0 +1,114 @@
+{
+  "backend": "tokenizers",
+  "eos_token": "</s>",
+  "extra_ids": 100,
+  "extra_special_tokens": [
+    "<extra_id_0>",
+    "<extra_id_1>",
+    "<extra_id_2>",
+    "<extra_id_3>",
+    "<extra_id_4>",
+    "<extra_id_5>",
+    "<extra_id_6>",
+    "<extra_id_7>",
+    "<extra_id_8>",
+    "<extra_id_9>",
+    "<extra_id_10>",
+    "<extra_id_11>",
+    "<extra_id_12>",
+    "<extra_id_13>",
+    "<extra_id_14>",
+    "<extra_id_15>",
+    "<extra_id_16>",
+    "<extra_id_17>",
+    "<extra_id_18>",
+    "<extra_id_19>",
+    "<extra_id_20>",
+    "<extra_id_21>",
+    "<extra_id_22>",
+    "<extra_id_23>",
+    "<extra_id_24>",
+    "<extra_id_25>",
+    "<extra_id_26>",
+    "<extra_id_27>",
+    "<extra_id_28>",
+    "<extra_id_29>",
+    "<extra_id_30>",
+    "<extra_id_31>",
+    "<extra_id_32>",
+    "<extra_id_33>",
+    "<extra_id_34>",
+    "<extra_id_35>",
+    "<extra_id_36>",
+    "<extra_id_37>",
+    "<extra_id_38>",
+    "<extra_id_39>",
+    "<extra_id_40>",
+    "<extra_id_41>",
+    "<extra_id_42>",
+    "<extra_id_43>",
+    "<extra_id_44>",
+    "<extra_id_45>",
+    "<extra_id_46>",
+    "<extra_id_47>",
+    "<extra_id_48>",
+    "<extra_id_49>",
+    "<extra_id_50>",
+    "<extra_id_51>",
+    "<extra_id_52>",
+    "<extra_id_53>",
+    "<extra_id_54>",
+    "<extra_id_55>",
+    "<extra_id_56>",
+    "<extra_id_57>",
+    "<extra_id_58>",
+    "<extra_id_59>",
+    "<extra_id_60>",
+    "<extra_id_61>",
+    "<extra_id_62>",
+    "<extra_id_63>",
+    "<extra_id_64>",
+    "<extra_id_65>",
+    "<extra_id_66>",
+    "<extra_id_67>",
+    "<extra_id_68>",
+    "<extra_id_69>",
+    "<extra_id_70>",
+    "<extra_id_71>",
+    "<extra_id_72>",
+    "<extra_id_73>",
+    "<extra_id_74>",
+    "<extra_id_75>",
+    "<extra_id_76>",
+    "<extra_id_77>",
+    "<extra_id_78>",
+    "<extra_id_79>",
+    "<extra_id_80>",
+    "<extra_id_81>",
+    "<extra_id_82>",
+    "<extra_id_83>",
+    "<extra_id_84>",
+    "<extra_id_85>",
+    "<extra_id_86>",
+    "<extra_id_87>",
+    "<extra_id_88>",
+    "<extra_id_89>",
+    "<extra_id_90>",
+    "<extra_id_91>",
+    "<extra_id_92>",
+    "<extra_id_93>",
+    "<extra_id_94>",
+    "<extra_id_95>",
+    "<extra_id_96>",
+    "<extra_id_97>",
+    "<extra_id_98>",
+    "<extra_id_99>"
+  ],
+  "is_local": false,
+  "local_files_only": false,
+  "model_max_length": 1000000000000000019884624838656,
+  "pad_token": "<pad>",
+  "sp_model_kwargs": {},
+  "tokenizer_class": "T5Tokenizer",
+  "unk_token": "<unk>"
+}
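The `model_max_length` value in tokenizer_config.json is exactly what `int(1e30)` evaluates to in Python, which suggests that no explicit length limit was configured and that the tokenizer will not truncate on its own. A small sketch of guarding against that follows; the helper name `effective_max_length` is hypothetical, and the fallback of 512 is an assumption borrowed from `n_positions` in config.json rather than a documented limit of this model.

```python
# Same number as "model_max_length" in tokenizer_config.json.
VERY_LARGE_INTEGER = int(1e30)


def effective_max_length(model_max_length, fallback=512):
    """Return a usable truncation length.

    `fallback` is an assumed cap (n_positions in config.json), used when the
    tokenizer config carries the "no limit" placeholder.
    """
    if model_max_length >= VERY_LARGE_INTEGER:
        return fallback
    return model_max_length


if __name__ == "__main__":
    print(effective_max_length(1000000000000000019884624838656))
    print(effective_max_length(128))
```

The returned value could then be passed as `max_length` together with `truncation=True` when calling the tokenizer.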
training_args.bin
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:a4e3aa797123ac167307658f133886f118137de20d3071fe677d42112b37086d
+size 5329