edloginovad committed · verified
Commit 37ffc75 · 1 Parent(s): 42bfbab

Upload PyTorch model
README.md CHANGED
@@ -1,174 +1,72 @@
  ---
- license: other
- base_model: DedalusHealthCare/tinybert-mlm-en
- datasets:
- - DedalusHealthCare/ner_demo_en
- task_categories:
- - token-classification
- task_ids:
- - named-entity-recognition
  language:
  - en
  tags:
- - token-classification
- - ner
- - named-entity-recognition
- - en
- - disorder_finding
- library_name: transformers
- pipeline_tag: token-classification
  ---
-
- # TinyBERT for Demo NER (English)
-
- ## Model Description
-
- This model is a TinyBERT model fine-tuned for Named Entity Recognition (NER) of DISORDER_FINDING entities in English medical texts.
-
- It was fine-tuned from the [DedalusHealthCare/tinybert-mlm-en](https://huggingface.co/DedalusHealthCare/tinybert-mlm-en) masked language model using the [DedalusHealthCare/ner_demo_en](https://huggingface.co/datasets/DedalusHealthCare/ner_demo_en) dataset.
-
- **Base Model**: [DedalusHealthCare/tinybert-mlm-en](https://huggingface.co/DedalusHealthCare/tinybert-mlm-en)
-
- **Training Dataset**: [DedalusHealthCare/ner_demo_en](https://huggingface.co/datasets/DedalusHealthCare/ner_demo_en)
-
- **Task**: Token Classification (Named Entity Recognition)
-
- **Language**: English (en)
-
- **Entities**: DISORDER_FINDING
-
- **Model Format**: PYTORCH
-
- **Please use `max` as the aggregation strategy in the NER pipeline (see example below).**
-
- ## Training Details
-
- - **Training epochs**: 1
- - **Learning rate**: 5e-05
- - **Training batch size**: 32
- - **Evaluation batch size**: 32
- - **Max sequence length**: 256
- - **Warmup ratio**: 0.1
- - **Weight decay**: 0.01
- - **FP16**: True
- - **Gradient accumulation steps**: 2
- - **Save steps**: 50000
- - **Evaluation steps**: 50000
- - **Evaluation strategy**: steps
- - **Random seed**: 1
- - **Label all tokens**: True
- - **Balanced training**: False
- - **Chunk mode**: sliding_window
- - **Stride**: 16
- - **Max training samples**: None
- - **Max evaluation samples**: None
- - **Early stopping patience**: 0
- - **Early stopping threshold**: 0.0
-
- ### Build Information
- - **Git Commit**: [9583c80](https://github.com/Dedalus-clinalytix/prod/commit/9583c80da9b9567b72c69d953854871a9badc139)
-
- ## Use Case Configuration
-
- - **Use case name**: demo
- - **Language**: English (en)
- - **Target entities**: DISORDER_FINDING
- - **Text processing max length**: N/A
- - **Entity labeling scheme**: N/A
-
- ## Usage
-
- ### Using the Transformers Pipeline
-
- ```python
- from transformers import pipeline
-
- # Load the model
- ner_pipeline = pipeline(
-     "ner",
-     model="DedalusHealthCare/tinybert-ner-demo-en",
-     tokenizer="DedalusHealthCare/tinybert-ner-demo-en",
-     aggregation_strategy="max"
- )
-
- # Example text
- text = "The patient has diabetes and high blood pressure."
-
- # Get predictions
- entities = ner_pipeline(text)
- print(entities)
- ```
-
- ### Using AutoModel and AutoTokenizer
-
- ```python
- from transformers import AutoTokenizer, AutoModelForTokenClassification
- import torch
-
- # Load model and tokenizer
- model_name = "DedalusHealthCare/tinybert-ner-demo-en"
- tokenizer = AutoTokenizer.from_pretrained(model_name)
- model = AutoModelForTokenClassification.from_pretrained(model_name)
-
- # Tokenize text
- text = "The patient has diabetes and high blood pressure."
- tokens = tokenizer(text, return_tensors="pt", truncation=True, padding=True)
-
- # Get predictions
- with torch.no_grad():
-     outputs = model(**tokens)
- predictions = torch.nn.functional.softmax(outputs.logits, dim=-1)
-
- # Map predicted class ids to labels
- predicted_token_class_ids = predictions.argmax(-1)
- labels = [model.config.id2label[id.item()] for id in predicted_token_class_ids[0]]
- ```
-
- ## Model Architecture
-
- This model is based on the TinyBERT architecture with a token classification head for Named Entity Recognition.
-
- ## Intended Use
-
- This model is intended for:
- - Named Entity Recognition in English medical texts
- - Identification of DISORDER_FINDING entities
- - Medical text processing and analysis
- - Research and development in medical NLP
-
- ## Limitations
-
- - Trained specifically for English medical texts
- - Performance may vary on texts from different medical domains
- - May not generalize well to non-medical texts
- - Requires careful evaluation on new datasets
-
- ## Ethical Considerations
-
- - This model is trained on medical data and should be used responsibly
- - Outputs should be validated by medical professionals
- - Patient privacy and data protection regulations must be followed
- - The model may reflect biases present in the training data
-
- ## Citation
-
- If you use this model, please cite:
-
- ```bibtex
- @misc{demo_en_ner_model,
-   title = {TinyBERT for Demo NER (English)},
-   author = {DH Healthcare GmbH},
-   year = {2025},
-   publisher = {Hugging Face},
-   url = {https://huggingface.co/DedalusHealthCare/tinybert-ner-demo-en}
- }
- ```
-
- ## License
-
- This model is proprietary and owned by DH Healthcare GmbH. All rights reserved.
-
- ## Contact
-
- For questions or support, please contact DH Healthcare GmbH.
  ---
+ library_name: transformers
  language:
  - en
+ license: other
+ base_model: DedalusHealthCare/tinybert-mlm-en
  tags:
+ - generated_from_trainer
+ datasets:
+ - demo-en
+ model-index:
+ - name: tinybert-clinalytix_1774257349
+   results: []
  ---
+
+ <!-- This model card has been generated automatically according to the information the Trainer had access to. You
+ should probably proofread and complete it, then remove this comment. -->
+
+ # tinybert-clinalytix_1774257349
+
+ This model is a fine-tuned version of [DedalusHealthCare/tinybert-mlm-en](https://huggingface.co/DedalusHealthCare/tinybert-mlm-en) on the demo-en dataset.
+ It achieves the following results on the evaluation set:
+ - Loss: 1.2915
+ - Disorder Finding Precision: 0.0667
+ - Disorder Finding Recall: 0.4
+ - Disorder Finding F1: 0.1143
+ - Disorder Finding Number: 5
+ - Overall Precision: 0.0667
+ - Overall Recall: 0.4
+ - Overall F1: 0.1143
+ - Overall Accuracy: 0.2222
+
+ ## Model description
+
+ More information needed
+
+ ## Intended uses & limitations
+
+ More information needed
+
+ ## Training and evaluation data
+
+ More information needed
+
+ ## Training procedure
+
+ ### Training hyperparameters
+
+ The following hyperparameters were used during training:
+ - learning_rate: 5e-05
+ - train_batch_size: 32
+ - eval_batch_size: 32
+ - seed: 1
+ - gradient_accumulation_steps: 2
+ - total_train_batch_size: 64
+ - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+ - lr_scheduler_type: linear
+ - lr_scheduler_warmup_ratio: 0.1
+ - num_epochs: 1
+ - mixed_precision_training: Native AMP
+ - label_smoothing_factor: 0.1
+
+ ### Training results
+
+ ### Framework versions
+
+ - Transformers 4.45.1
+ - Pytorch 2.6.0+cu124
+ - Datasets 4.5.0
+ - Tokenizers 0.20.3
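The evaluation metrics in the new card are internally consistent: the overall F1 is the harmonic mean of the reported precision and recall. A quick sanity check, using the rounded values exactly as they appear above:

```python
# Precision and recall as reported in the model card.
precision = 0.0667
recall = 0.4

# F1 is the harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)
print(round(f1, 4))  # 0.1143, matching the reported Overall F1
```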
all_results.json ADDED
@@ -0,0 +1,22 @@
+ {
+     "epoch": 1.0,
+     "eval_DISORDER_FINDING_f1": 0.1142857142857143,
+     "eval_DISORDER_FINDING_number": 5,
+     "eval_DISORDER_FINDING_precision": 0.06666666666666667,
+     "eval_DISORDER_FINDING_recall": 0.4,
+     "eval_loss": 1.2915407419204712,
+     "eval_overall_accuracy": 0.2222222222222222,
+     "eval_overall_f1": 0.1142857142857143,
+     "eval_overall_precision": 0.06666666666666667,
+     "eval_overall_recall": 0.4,
+     "eval_runtime": 0.1761,
+     "eval_samples": 3,
+     "eval_samples_per_second": 17.036,
+     "eval_steps_per_second": 5.679,
+     "total_flos": 6768861120.0,
+     "train_loss": 0.6272501945495605,
+     "train_runtime": 0.752,
+     "train_samples": 15,
+     "train_samples_per_second": 19.947,
+     "train_steps_per_second": 1.33
+ }
checkpoint-1/added_tokens.json ADDED
@@ -0,0 +1,8 @@
+ {
+     "<HCW>": 28997,
+     "<HOSPITAL>": 29001,
+     "<ID>": 28998,
+     "<PATIENT>": 28999,
+     "<PHONE>": 29000,
+     "<VENDOR>": 28996
+ }
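The six added tokens occupy the contiguous id range 28996–29001 on top of the base vocabulary, which is why `vocab_size` in the config below is 29002 (highest id plus one). A small consistency check:

```python
# Added-token ids copied from checkpoint-1/added_tokens.json above.
added_tokens = {
    "<VENDOR>": 28996,
    "<HCW>": 28997,
    "<ID>": 28998,
    "<PATIENT>": 28999,
    "<PHONE>": 29000,
    "<HOSPITAL>": 29001,
}

# vocab_size must cover every token id, so it is the max id plus one.
vocab_size = max(added_tokens.values()) + 1
print(vocab_size)  # 29002, matching config.json
```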
checkpoint-1/config.json ADDED
@@ -0,0 +1,38 @@
+ {
+     "_name_or_path": "DedalusHealthCare/tinybert-mlm-en",
+     "architectures": [
+         "BertForTokenClassification"
+     ],
+     "attention_probs_dropout_prob": 0.1,
+     "classifier_dropout": null,
+     "finetuning_task": "ner",
+     "hidden_act": "gelu",
+     "hidden_dropout_prob": 0.1,
+     "hidden_size": 312,
+     "id2label": {
+         "0": "B-DISORDER_FINDING",
+         "1": "I-DISORDER_FINDING",
+         "2": "O"
+     },
+     "initializer_range": 0.02,
+     "intermediate_size": 312,
+     "label2id": {
+         "B-DISORDER_FINDING": 0,
+         "I-DISORDER_FINDING": 1,
+         "O": 2
+     },
+     "layer_norm_eps": 1e-12,
+     "max_position_embeddings": 512,
+     "model_type": "bert",
+     "num_attention_heads": 12,
+     "num_hidden_layers": 4,
+     "pad_token_id": 0,
+     "position_embedding_type": "absolute",
+     "pre_trained": "",
+     "torch_dtype": "float32",
+     "training": "",
+     "transformers_version": "4.45.1",
+     "type_vocab_size": 2,
+     "use_cache": true,
+     "vocab_size": 29002
+ }
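The `id2label` mapping above is what the token-classification head uses to turn per-token argmax ids into BIO tags. A minimal sketch with a hypothetical prediction sequence (the ids are illustrative, not model output):

```python
# Label mapping copied from checkpoint-1/config.json above.
id2label = {0: "B-DISORDER_FINDING", 1: "I-DISORDER_FINDING", 2: "O"}

# Hypothetical argmax output for a four-token input.
pred_ids = [2, 0, 1, 2]

# Decode class ids into BIO tags.
tags = [id2label[i] for i in pred_ids]
print(tags)  # ['O', 'B-DISORDER_FINDING', 'I-DISORDER_FINDING', 'O']
```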
checkpoint-1/model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:339d6ee169f5da7bb201e6f7760634b8742e3c597dc4bb80ea3098553b6ceffd
+ size 46245276
checkpoint-1/optimizer.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:cdd91f5ea1aa6cc2cc7776d3a2e69c19bdcf8db81e7867e9ab8d20b5887ed348
+ size 92529274
checkpoint-1/rng_state.pth ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:8a94e039ef0bf81207f4a58934eca2a9edc72deeb862f2c741f0a6bb4ea9a124
+ size 13990
checkpoint-1/scheduler.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:cde8705bb323f5fe23427d88f7466e6183420613d541a4cbb3a30eb1a9de7c0c
+ size 1064
checkpoint-1/special_tokens_map.json ADDED
@@ -0,0 +1,37 @@
+ {
+     "cls_token": {
+         "content": "[CLS]",
+         "lstrip": false,
+         "normalized": false,
+         "rstrip": false,
+         "single_word": false
+     },
+     "mask_token": {
+         "content": "[MASK]",
+         "lstrip": false,
+         "normalized": false,
+         "rstrip": false,
+         "single_word": false
+     },
+     "pad_token": {
+         "content": "[PAD]",
+         "lstrip": false,
+         "normalized": false,
+         "rstrip": false,
+         "single_word": false
+     },
+     "sep_token": {
+         "content": "[SEP]",
+         "lstrip": false,
+         "normalized": false,
+         "rstrip": false,
+         "single_word": false
+     },
+     "unk_token": {
+         "content": "[UNK]",
+         "lstrip": false,
+         "normalized": false,
+         "rstrip": false,
+         "single_word": false
+     }
+ }
checkpoint-1/tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
checkpoint-1/tokenizer_config.json ADDED
@@ -0,0 +1,110 @@
+ {
+     "added_tokens_decoder": {
+         "0": {
+             "content": "[PAD]",
+             "lstrip": false,
+             "normalized": false,
+             "rstrip": false,
+             "single_word": false,
+             "special": true
+         },
+         "100": {
+             "content": "[UNK]",
+             "lstrip": false,
+             "normalized": false,
+             "rstrip": false,
+             "single_word": false,
+             "special": true
+         },
+         "101": {
+             "content": "[CLS]",
+             "lstrip": false,
+             "normalized": false,
+             "rstrip": false,
+             "single_word": false,
+             "special": true
+         },
+         "102": {
+             "content": "[SEP]",
+             "lstrip": false,
+             "normalized": false,
+             "rstrip": false,
+             "single_word": false,
+             "special": true
+         },
+         "103": {
+             "content": "[MASK]",
+             "lstrip": false,
+             "normalized": false,
+             "rstrip": false,
+             "single_word": false,
+             "special": true
+         },
+         "28996": {
+             "content": "<VENDOR>",
+             "lstrip": false,
+             "normalized": true,
+             "rstrip": false,
+             "single_word": false,
+             "special": false
+         },
+         "28997": {
+             "content": "<HCW>",
+             "lstrip": false,
+             "normalized": true,
+             "rstrip": false,
+             "single_word": false,
+             "special": false
+         },
+         "28998": {
+             "content": "<ID>",
+             "lstrip": false,
+             "normalized": true,
+             "rstrip": false,
+             "single_word": false,
+             "special": false
+         },
+         "28999": {
+             "content": "<PATIENT>",
+             "lstrip": false,
+             "normalized": true,
+             "rstrip": false,
+             "single_word": false,
+             "special": false
+         },
+         "29000": {
+             "content": "<PHONE>",
+             "lstrip": false,
+             "normalized": true,
+             "rstrip": false,
+             "single_word": false,
+             "special": false
+         },
+         "29001": {
+             "content": "<HOSPITAL>",
+             "lstrip": false,
+             "normalized": true,
+             "rstrip": false,
+             "single_word": false,
+             "special": false
+         }
+     },
+     "clean_up_tokenization_spaces": true,
+     "cls_token": "[CLS]",
+     "do_lower_case": true,
+     "mask_token": "[MASK]",
+     "max_length": 256,
+     "model_max_length": 1000000000000000019884624838656,
+     "pad_to_multiple_of": null,
+     "pad_token": "[PAD]",
+     "pad_token_type_id": 0,
+     "padding_side": "right",
+     "sep_token": "[SEP]",
+     "stride": 0,
+     "strip_accents": true,
+     "tokenize_chinese_chars": true,
+     "tokenizer_class": "BertTokenizer",
+     "truncation_side": "right",
+     "truncation_strategy": "longest_first",
+     "unk_token": "[UNK]"
+ }
checkpoint-1/trainer_state.json ADDED
@@ -0,0 +1,40 @@
+ {
+     "best_metric": null,
+     "best_model_checkpoint": null,
+     "epoch": 1.0,
+     "eval_steps": 50000,
+     "global_step": 1,
+     "is_hyper_param_search": false,
+     "is_local_process_zero": true,
+     "is_world_process_zero": true,
+     "log_history": [
+         {
+             "epoch": 1.0,
+             "grad_norm": 3.052371025085449,
+             "learning_rate": 0.0,
+             "loss": 0.6273,
+             "step": 1
+         }
+     ],
+     "logging_steps": 10,
+     "max_steps": 1,
+     "num_input_tokens_seen": 0,
+     "num_train_epochs": 1,
+     "save_steps": 50000,
+     "stateful_callbacks": {
+         "TrainerControl": {
+             "args": {
+                 "should_epoch_stop": false,
+                 "should_evaluate": false,
+                 "should_log": false,
+                 "should_save": true,
+                 "should_training_stop": true
+             },
+             "attributes": {}
+         }
+     },
+     "total_flos": 6768861120.0,
+     "train_batch_size": 32,
+     "trial_name": null,
+     "trial_params": null
+ }
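The `max_steps` of 1 in this trainer state follows from the run's size: 15 training samples with a per-device batch size of 32 and gradient accumulation of 2 produce a single optimizer step per epoch. A sketch of that arithmetic, assuming single-process training:

```python
import math

train_samples = 15          # train_samples in this commit's results
per_device_batch_size = 32  # train_batch_size above
grad_accum_steps = 2        # gradient_accumulation_steps

# One partial batch covers all 15 samples; accumulation still
# yields a single optimizer step (partial groups count as one).
batches_per_epoch = math.ceil(train_samples / per_device_batch_size)
optimizer_steps = math.ceil(batches_per_epoch / grad_accum_steps)
print(optimizer_steps)  # 1, matching max_steps and global_step
```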
checkpoint-1/training_args.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:e6928829f04912d587f1e09fe2070d2f7f0a822bd021df29fa70f4cb8712e39b
+ size 5368
checkpoint-1/vocab.txt ADDED
The diff for this file is too large to render. See raw diff
 
eval_results.json ADDED
@@ -0,0 +1,16 @@
+ {
+     "epoch": 1.0,
+     "eval_DISORDER_FINDING_f1": 0.1142857142857143,
+     "eval_DISORDER_FINDING_number": 5,
+     "eval_DISORDER_FINDING_precision": 0.06666666666666667,
+     "eval_DISORDER_FINDING_recall": 0.4,
+     "eval_loss": 1.2915407419204712,
+     "eval_overall_accuracy": 0.2222222222222222,
+     "eval_overall_f1": 0.1142857142857143,
+     "eval_overall_precision": 0.06666666666666667,
+     "eval_overall_recall": 0.4,
+     "eval_runtime": 0.1761,
+     "eval_samples": 3,
+     "eval_samples_per_second": 17.036,
+     "eval_steps_per_second": 5.679
+ }
model_info.json CHANGED
@@ -1,16 +1,16 @@
  {
-     "model_version": 1765536980,
+     "model_version": 1774259037,
      "model_name": "tinybert-demo-en",
      "model_type": "tinybert",
      "model_platform": "pytorch",
      "model_architecture": "TinyBERT",
      "model_description": "Retrieve named entities from text.",
-     "model_date": "2025-12-12T10:56:20.815542+00:00",
-     "clinalytix_version": "25.12.0",
+     "model_date": "2026-03-23T09:43:57.190049+00:00",
+     "clinalytix_version": "26.03.0",
      "model_objective": "RECOGNITION",
      "use_case": "demo",
      "build_number": "10",
-     "revision_number": "675d1d8cf4dfbc20c4faa7d9294265b9b1af2790",
+     "revision_number": "7a69bd200ca16eb3f14e380484a5fb61afc70893",
      "language_code": "en",
      "language_codes_multilingual": null,
      "target": null,
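The bumped `model_version` appears to be the Unix epoch timestamp of the new `model_date` (an observation from the two fields in this diff, not documented behavior):

```python
from datetime import datetime, timezone

# model_version from the updated model_info.json
version = 1774259037
stamp = datetime.fromtimestamp(version, tz=timezone.utc)
print(stamp.isoformat())  # 2026-03-23T09:43:57+00:00, matching model_date
```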
runs/Mar23_09-15-49_ip-10-246-1-57/events.out.tfevents.1774257352.ip-10-246-1-57.34461.0 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:f72e10771bb6a31ce91b397d88579da445bf3c1dde931fc84a23a9c0d54ad0cd
+ size 5946
runs/Mar23_09-15-49_ip-10-246-1-57/events.out.tfevents.1774257353.ip-10-246-1-57.34461.1 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:e50e89b45af71e579f24cde84d9c37d30abca0a8c11d1af98e6e300003125425
+ size 846
tokenizer.json CHANGED
@@ -4,9 +4,18 @@
      "direction": "Right",
      "max_length": 256,
      "strategy": "LongestFirst",
-     "stride": 16
+     "stride": 0
    },
-   "padding": null,
+   "padding": {
+     "strategy": {
+       "Fixed": 256
+     },
+     "direction": "Right",
+     "pad_to_multiple_of": null,
+     "pad_id": 0,
+     "pad_type_id": 0,
+     "pad_token": "[PAD]"
+   },
    "added_tokens": [
      {
        "id": 0,
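The tokenizer.json change above replaces `"padding": null` with a fixed-length strategy: every encoded sequence is right-padded with `pad_id` 0 up to 256 ids. A minimal sketch of that behavior in plain Python (illustrative token ids, not the `tokenizers` API):

```python
def pad_fixed(ids, length=256, pad_id=0):
    """Right-pad a list of token ids to a fixed length."""
    return ids + [pad_id] * (length - len(ids))

# Hypothetical three-token encoding, e.g. [CLS] <word> [SEP].
padded = pad_fixed([101, 7592, 102])
print(len(padded))  # 256
```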
train_results.json ADDED
@@ -0,0 +1,9 @@
+ {
+     "epoch": 1.0,
+     "total_flos": 6768861120.0,
+     "train_loss": 0.6272501945495605,
+     "train_runtime": 0.752,
+     "train_samples": 15,
+     "train_samples_per_second": 19.947,
+     "train_steps_per_second": 1.33
+ }
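The throughput fields in train_results.json follow directly from the sample count and runtime (the run took a single optimizer step). A quick consistency check:

```python
train_samples = 15
train_runtime = 0.752  # seconds, from train_results.json above

samples_per_second = train_samples / train_runtime  # ~19.947
steps_per_second = 1 / train_runtime                # ~1.33 (one step total)
```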
trainer_state.json ADDED
@@ -0,0 +1,49 @@
+ {
+     "best_metric": null,
+     "best_model_checkpoint": null,
+     "epoch": 1.0,
+     "eval_steps": 50000,
+     "global_step": 1,
+     "is_hyper_param_search": false,
+     "is_local_process_zero": true,
+     "is_world_process_zero": true,
+     "log_history": [
+         {
+             "epoch": 1.0,
+             "grad_norm": 3.052371025085449,
+             "learning_rate": 0.0,
+             "loss": 0.6273,
+             "step": 1
+         },
+         {
+             "epoch": 1.0,
+             "step": 1,
+             "total_flos": 6768861120.0,
+             "train_loss": 0.6272501945495605,
+             "train_runtime": 0.752,
+             "train_samples_per_second": 19.947,
+             "train_steps_per_second": 1.33
+         }
+     ],
+     "logging_steps": 10,
+     "max_steps": 1,
+     "num_input_tokens_seen": 0,
+     "num_train_epochs": 1,
+     "save_steps": 50000,
+     "stateful_callbacks": {
+         "TrainerControl": {
+             "args": {
+                 "should_epoch_stop": false,
+                 "should_evaluate": false,
+                 "should_log": false,
+                 "should_save": true,
+                 "should_training_stop": true
+             },
+             "attributes": {}
+         }
+     },
+     "total_flos": 6768861120.0,
+     "train_batch_size": 32,
+     "trial_name": null,
+     "trial_params": null
+ }
training_args.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:e6928829f04912d587f1e09fe2070d2f7f0a822bd021df29fa70f4cb8712e39b
+ size 5368