edloginovad committed (verified)
Commit 5333b07 · 1 Parent(s): cecbf37

Upload PyTorch model
README.md CHANGED
@@ -1,245 +1,75 @@
  ---
- license: other
- base_model: DedalusHealthCare/tinybert-mlm-de
- datasets:
- - DedalusHealthCare/ner_demo_de
- task_categories:
- - token-classification
- task_ids:
- - named-entity-recognition
  language:
  - de
  tags:
- - token-classification
- - ner
- - named-entity-recognition
- - de
- - disorder_finding
- library_name: transformers
- pipeline_tag: token-classification
  ---

- # TinyBERT for Demo NER (German)
-
- ## Model Description
-
- This model is a fine-tuned TinyBERT model for Named Entity Recognition (NER) of DISORDER_FINDING entities in German medical texts.
-
- It was fine-tuned from the [DedalusHealthCare/tinybert-mlm-de](https://huggingface.co/DedalusHealthCare/tinybert-mlm-de) masked language model using the [DedalusHealthCare/ner_demo_de](https://huggingface.co/datasets/DedalusHealthCare/ner_demo_de) dataset.
-
- **Base Model**: [DedalusHealthCare/tinybert-mlm-de](https://huggingface.co/DedalusHealthCare/tinybert-mlm-de)
-
- **Training Dataset**: [DedalusHealthCare/ner_demo_de](https://huggingface.co/datasets/DedalusHealthCare/ner_demo_de)
-
- **Task**: Token Classification (Named Entity Recognition)
-
- **Language**: German (de)
-
- **Entities**: DISORDER_FINDING
-
- **Model Format**: PYTORCH+ONNX
-
- **Please use `max` as aggregation strategy in the NER pipeline (see example below)**.
-
- ## Training Details
-
- - **Training epochs**: 1
- - **Learning rate**: N/A
- - **Training batch size**: 32
- - **Evaluation batch size**: 32
- - **Max sequence length**: 256
- - **Warmup steps**: N/A
- - **FP16**: False
- - **Gradient accumulation steps**: 2
- - **Evaluation accumulation steps**: 2
- - **Save steps**: 15000
- - **Evaluation steps**: 10000
- - **Evaluation strategy**: steps
- - **Random seed**: 33
- - **Label all tokens**: True
- - **Balanced training**: False
- - **Chunk mode**: sliding_window
- - **Stride**: 16
- - **Max training samples**: None
- - **Max evaluation samples**: 10000
- - **Early stopping patience**: 0
- - **Early stopping threshold**: 0.0
-
- ## Use Case Configuration
-
- - **Use case name**: demo
- - **Language**: German (de)
- - **Target entities**: DISORDER_FINDING
- - **Text processing max length**: N/A
- - **Entity labeling scheme**: N/A
-
- ## Usage
-
- ### Using Transformers Pipeline
-
- ```python
- from transformers import pipeline
-
- # Load the model
- ner_pipeline = pipeline(
-     "ner",
-     model="DedalusHealthCare/tinybert-ner-demo-de",
-     tokenizer="DedalusHealthCare/tinybert-ner-demo-de",
-     aggregation_strategy="max"
- )
-
- # Example text
- text = "Der Patient hat Diabetes und Bluthochdruck."
-
- # Get predictions
- entities = ner_pipeline(text)
- print(entities)
- ```
-
- ### Using AutoModel and AutoTokenizer
-
- ```python
- from transformers import AutoTokenizer, AutoModelForTokenClassification
- import torch
-
- # Load model and tokenizer
- model_name = "DedalusHealthCare/tinybert-ner-demo-de"
- tokenizer = AutoTokenizer.from_pretrained(model_name)
- model = AutoModelForTokenClassification.from_pretrained(model_name)
-
- # Tokenize text
- text = "Der Patient hat Diabetes und Bluthochdruck."
- tokens = tokenizer(text, return_tensors="pt", truncation=True, padding=True)
-
- # Get predictions
- with torch.no_grad():
-     outputs = model(**tokens)
-     predictions = torch.nn.functional.softmax(outputs.logits, dim=-1)
-
- # Get labels
- predicted_token_class_ids = predictions.argmax(-1)
- labels = [model.config.id2label[id.item()] for id in predicted_token_class_ids[0]]
- ```
-
- ### Using ONNX Runtime (Optimized Inference)
-
- ```python
- from optimum.onnxruntime import ORTModelForTokenClassification
- from transformers import AutoTokenizer, pipeline
- import torch
-
- # Load ONNX model for faster inference
- model_name = "DedalusHealthCare/tinybert-ner-demo-de"
- onnx_model = ORTModelForTokenClassification.from_pretrained(model_name)
- tokenizer = AutoTokenizer.from_pretrained(model_name)
-
- # Create pipeline with ONNX model (recommended)
- ner_pipeline = pipeline(
-     "ner",
-     model=onnx_model,
-     tokenizer=tokenizer,
-     aggregation_strategy="max"
- )
-
- # Example text
- text = "Der Patient hat Diabetes und Bluthochdruck."
- entities = ner_pipeline(text)
- print(entities)
-
- # Direct model usage
- inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True)
- with torch.no_grad():
-     outputs = onnx_model(**inputs)
-     predictions = torch.nn.functional.softmax(outputs.logits, dim=-1)
-
- predicted_token_class_ids = predictions.argmax(-1)
- token_labels = [onnx_model.config.id2label[id.item()] for id in predicted_token_class_ids[0]]
- ```
-
- ### Performance Comparison
-
- - **PyTorch**: Standard format, suitable for training and research
- - **ONNX**: Optimized for inference, typically 2-4x faster than PyTorch
- - **Recommendation**: Use ONNX for production inference, PyTorch for research
-
- ## Model Architecture
-
- This model is based on the TinyBERT architecture with a token classification head for Named Entity Recognition.
-
- ## Intended Use
-
- This model is intended for:
- - Named Entity Recognition in German medical texts
- - Identification of DISORDER_FINDING entities
- - Medical text processing and analysis
- - Research and development in medical NLP
-
- ## Limitations
-
- - Trained specifically for German medical texts
- - Performance may vary on texts from different medical domains
- - May not generalize well to non-medical texts
- - Requires careful evaluation on new datasets
-
- ## Ethical Considerations
-
- - This model is trained on medical data and should be used responsibly
- - Outputs should be validated by medical professionals
- - Patient privacy and data protection regulations must be followed
- - The model may have biases present in the training data
-
- ## Model Performance
-
- This model has been evaluated on the **goldset from ner_disorderfinding_de_goldset** using
- IO evaluation (sklearn, token level, lenient) with the following results:
-
- ### Overall Performance
-
- | Metric | Score |
- |--------|-------|
- | Precision (Macro) | 0.423825 |
- | Recall (Macro) | 0.467183 |
- | F1-Score (Macro) | 0.435170 |
- | Precision (Weighted) | 0.599471 |
- | Recall (Weighted) | 0.697989 |
- | F1-Score (Weighted) | 0.640426 |
-
- **Inference Performance**: 5.53 seconds for evaluation dataset
-
- ### Entity-Level Performance (IO Evaluation)
-
- | Entity Type | Precision | Recall | F1-Score | Support |
- |-------------|-----------|--------|----------|---------|
- | DISORDER_FINDING | 0.753533 | 0.900434 | 0.820460 | N/A |
-
- ### Evaluation Details
-
- - **Dataset**: goldset from ner_disorderfinding_de_goldset
- - **Dataset Source**: goldset
- - **Evaluation Date**: 2025-11-03 12:25:56
- - **Language**: de
- - **Entities**: DISORDER_FINDING
-
- *This evaluation section is automatically generated and updated.*
-
- ## Citation
-
- If you use this model, please cite:
-
- ```bibtex
- @model{demo_de_ner_model,
-   title = {TinyBERT for Demo NER (German)},
-   author = {DH Healthcare GmbH},
-   year = {2025},
-   publisher = {Hugging Face},
-   url = {https://huggingface.co/DedalusHealthCare/tinybert-ner-demo-de}
- }
- ```
-
- ## License
-
- This model is proprietary and owned by DH Healthcare GmbH. All rights reserved.
-
- ## Contact
-
- For questions or support, please contact DH Healthcare GmbH.

  ---
+ library_name: transformers
  language:
  - de
+ license: other
+ base_model: DedalusHealthCare/tinybert-mlm-de
  tags:
+ - generated_from_trainer
+ datasets:
+ - demo-de
+ model-index:
+ - name: tinybert-clinalytix_1774355441
+   results: []
  ---

+ <!-- This model card has been generated automatically according to the information the Trainer had access to. You
+ should probably proofread and complete it, then remove this comment. -->

+ # tinybert-clinalytix_1774355441

+ This model is a fine-tuned version of [DedalusHealthCare/tinybert-mlm-de](https://huggingface.co/DedalusHealthCare/tinybert-mlm-de) on the demo-de dataset.
+ It achieves the following results on the evaluation set:
+ - Loss: 1.7065
+ - Disorder Precision: 0.0
+ - Disorder Recall: 0.0
+ - Disorder F1: 0.0
+ - Disorder Number: 2
+ - Finding Precision: 0.0
+ - Finding Recall: 0.0
+ - Finding F1: 0.0
+ - Finding Number: 0
+ - Overall Precision: 0.0
+ - Overall Recall: 0.0
+ - Overall F1: 0.0
+ - Overall Accuracy: 0.1176

+ ## Model description

+ More information needed

+ ## Intended uses & limitations

+ More information needed

+ ## Training and evaluation data

+ More information needed

+ ## Training procedure

+ ### Training hyperparameters

+ The following hyperparameters were used during training:
+ - learning_rate: 5e-05
+ - train_batch_size: 32
+ - eval_batch_size: 32
+ - seed: 33
+ - gradient_accumulation_steps: 2
+ - total_train_batch_size: 64
+ - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+ - lr_scheduler_type: linear
+ - lr_scheduler_warmup_ratio: 0.1
+ - num_epochs: 1
+ - label_smoothing_factor: 0.1

+ ### Training results

+ ### Framework versions

+ - Transformers 4.45.1
+ - Pytorch 2.10.0+cu128
+ - Datasets 4.5.0
+ - Tokenizers 0.20.3
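A hedged back-of-the-envelope, not from the repo: the hyperparameters listed above combine with the tiny dataset (train_samples = 17 in train_results.json) to yield the single optimization step recorded in trainer_state.json. Variable names here are illustrative.

```python
import math

# Deriving the trainer's step counts from the card's hyperparameters
# and the dataset size reported in this commit (assumption: the usual
# batches-per-epoch arithmetic; the real Trainer rounds per dataloader).
train_samples = 17             # from train_results.json
per_device_batch = 32          # train_batch_size
grad_accum = 2                 # gradient_accumulation_steps
num_epochs = 1
warmup_ratio = 0.1

total_batch = per_device_batch * grad_accum        # 64 (total_train_batch_size)
steps_per_epoch = math.ceil(train_samples / total_batch)
max_steps = steps_per_epoch * num_epochs           # 1, matching global_step
warmup_steps = int(max_steps * warmup_ratio)       # 0
print(total_batch, max_steps, warmup_steps)  # 64 1 0
```

With only one step and zero warmup steps, the linear scheduler has already decayed the learning rate to 0.0 by the time it is logged, which matches the `learning_rate: 0.0` entry in trainer_state.json.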
all_results.json ADDED
@@ -0,0 +1,26 @@
+ {
+   "epoch": 1.0,
+   "eval_DISORDER_f1": 0.0,
+   "eval_DISORDER_number": 2,
+   "eval_DISORDER_precision": 0.0,
+   "eval_DISORDER_recall": 0.0,
+   "eval_FINDING_f1": 0.0,
+   "eval_FINDING_number": 0,
+   "eval_FINDING_precision": 0.0,
+   "eval_FINDING_recall": 0.0,
+   "eval_loss": 1.7064510583877563,
+   "eval_overall_accuracy": 0.11764705882352941,
+   "eval_overall_f1": 0.0,
+   "eval_overall_precision": 0.0,
+   "eval_overall_recall": 0.0,
+   "eval_runtime": 0.1739,
+   "eval_samples": 3,
+   "eval_samples_per_second": 17.256,
+   "eval_steps_per_second": 5.752,
+   "total_flos": 21821285850.0,
+   "train_loss": 0.8611775636672974,
+   "train_runtime": 1.1366,
+   "train_samples": 17,
+   "train_samples_per_second": 14.957,
+   "train_steps_per_second": 0.88
+ }
checkpoint-1/config.json ADDED
@@ -0,0 +1,42 @@
+ {
+   "_name_or_path": "DedalusHealthCare/tinybert-mlm-de",
+   "architectures": [
+     "BertForTokenClassification"
+   ],
+   "attention_probs_dropout_prob": 0.1,
+   "classifier_dropout": null,
+   "finetuning_task": "ner",
+   "hidden_act": "gelu",
+   "hidden_dropout_prob": 0.1,
+   "hidden_size": 312,
+   "id2label": {
+     "0": "B-DISORDER",
+     "1": "B-FINDING",
+     "2": "I-DISORDER",
+     "3": "I-FINDING",
+     "4": "O"
+   },
+   "initializer_range": 0.02,
+   "intermediate_size": 312,
+   "label2id": {
+     "B-DISORDER": 0,
+     "B-FINDING": 1,
+     "I-DISORDER": 2,
+     "I-FINDING": 3,
+     "O": 4
+   },
+   "layer_norm_eps": 1e-12,
+   "max_position_embeddings": 512,
+   "model_type": "bert",
+   "num_attention_heads": 12,
+   "num_hidden_layers": 4,
+   "pad_token_id": 0,
+   "position_embedding_type": "absolute",
+   "pre_trained": "",
+   "torch_dtype": "float32",
+   "training": "",
+   "transformers_version": "4.45.1",
+   "type_vocab_size": 2,
+   "use_cache": true,
+   "vocab_size": 31102
+ }
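The `id2label` map in the config above is what turns per-token class ids into BIO tags at inference time. A minimal, self-contained sketch with made-up logits (not real model output):

```python
# Decoding token-classification predictions with the id2label map from
# config.json. The logit rows below are invented for illustration; real
# logits come from the model's forward pass, one row per input token.
id2label = {0: "B-DISORDER", 1: "B-FINDING", 2: "I-DISORDER", 3: "I-FINDING", 4: "O"}

logits = [
    [2.1, 0.3, 0.1, 0.0, 1.5],  # e.g. token "Diabetes"
    [0.1, 0.2, 0.0, 0.1, 3.0],  # e.g. token "und"
]
pred_ids = [row.index(max(row)) for row in logits]  # argmax per token
labels = [id2label[i] for i in pred_ids]
print(labels)  # ['B-DISORDER', 'O']
```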
checkpoint-1/model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:70bb4470bd4149e9aa0d4ab934e0e4ffc533b8af1702686320f4b3d57f8a062e
+ size 48868580

checkpoint-1/optimizer.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:016a14a774433cfd877144dd293cac3252b81b21b3ddd226eabd54af0507dc28
+ size 97776331

checkpoint-1/rng_state.pth ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:cf691b43f93b3451c34523702ad48fab1acaf3207fba0e9c194fa3aedeb5ec8c
+ size 14455

checkpoint-1/scheduler.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:6e7a808092c9737ce7476966a087defa251dbafd354d39128871d37cbc6fa6c4
+ size 1465
checkpoint-1/special_tokens_map.json ADDED
@@ -0,0 +1,37 @@
+ {
+   "cls_token": {
+     "content": "[CLS]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "mask_token": {
+     "content": "[MASK]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "pad_token": {
+     "content": "[PAD]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "sep_token": {
+     "content": "[SEP]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "unk_token": {
+     "content": "[UNK]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   }
+ }
checkpoint-1/tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
checkpoint-1/tokenizer_config.json ADDED
@@ -0,0 +1,62 @@
+ {
+   "added_tokens_decoder": {
+     "0": {
+       "content": "[PAD]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "101": {
+       "content": "[UNK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "102": {
+       "content": "[CLS]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "103": {
+       "content": "[SEP]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "104": {
+       "content": "[MASK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     }
+   },
+   "clean_up_tokenization_spaces": true,
+   "cls_token": "[CLS]",
+   "do_lower_case": true,
+   "mask_token": "[MASK]",
+   "max_length": 256,
+   "model_max_length": 1000000000000000019884624838656,
+   "pad_to_multiple_of": null,
+   "pad_token": "[PAD]",
+   "pad_token_type_id": 0,
+   "padding_side": "right",
+   "sep_token": "[SEP]",
+   "stride": 0,
+   "strip_accents": true,
+   "tokenize_chinese_chars": true,
+   "tokenizer_class": "BertTokenizer",
+   "truncation_side": "right",
+   "truncation_strategy": "longest_first",
+   "unk_token": "[UNK]"
+ }
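The tokenizer config above sets `max_length: 256` with `stride: 0`, while the previous model card described sliding-window chunking with a stride of 16. A hedged pure-Python sketch of that chunking scheme, with values scaled down for illustration (`sliding_windows` is a hypothetical helper, not part of the repo):

```python
# Sliding-window chunking: split token ids into windows of at most
# max_len tokens, where consecutive windows overlap by `stride` tokens
# (the Hugging Face convention for tokenizer stride).
def sliding_windows(ids, max_len=8, stride=2):
    step = max_len - stride
    return [ids[i:i + max_len] for i in range(0, max(1, len(ids) - stride), step)]

chunks = sliding_windows(list(range(12)), max_len=8, stride=2)
print(chunks)  # [[0, 1, 2, 3, 4, 5, 6, 7], [6, 7, 8, 9, 10, 11]]
```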
checkpoint-1/trainer_state.json ADDED
@@ -0,0 +1,40 @@
+ {
+   "best_metric": null,
+   "best_model_checkpoint": null,
+   "epoch": 1.0,
+   "eval_steps": 10000,
+   "global_step": 1,
+   "is_hyper_param_search": false,
+   "is_local_process_zero": true,
+   "is_world_process_zero": true,
+   "log_history": [
+     {
+       "epoch": 1.0,
+       "grad_norm": 4.18489933013916,
+       "learning_rate": 0.0,
+       "loss": 0.8612,
+       "step": 1
+     }
+   ],
+   "logging_steps": 10,
+   "max_steps": 1,
+   "num_input_tokens_seen": 0,
+   "num_train_epochs": 1,
+   "save_steps": 15000,
+   "stateful_callbacks": {
+     "TrainerControl": {
+       "args": {
+         "should_epoch_stop": false,
+         "should_evaluate": false,
+         "should_log": false,
+         "should_save": true,
+         "should_training_stop": true
+       },
+       "attributes": {}
+     }
+   },
+   "total_flos": 21821285850.0,
+   "train_batch_size": 32,
+   "trial_name": null,
+   "trial_params": null
+ }
checkpoint-1/training_args.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:253c832ed6cc1e1c6e4407d924248dfb9827691f8ff7f7c2cbbb03eb7f1936d7
+ size 5841
checkpoint-1/vocab.txt ADDED
The diff for this file is too large to render. See raw diff
 
config.json CHANGED
@@ -1,5 +1,5 @@
  {
-   "_name_or_path": "/workspaces/prod/nlp/nlp-tools/data/ner_demo_de/models/tinybert-clinalytix",
+   "_name_or_path": "DedalusHealthCare/tinybert-mlm-de",
    "architectures": [
      "BertForTokenClassification"
    ],
@@ -10,14 +10,20 @@
    "hidden_dropout_prob": 0.1,
    "hidden_size": 312,
    "id2label": {
-     "0": "B-DISORDER_FINDING",
-     "1": "O"
+     "0": "B-DISORDER",
+     "1": "B-FINDING",
+     "2": "I-DISORDER",
+     "3": "I-FINDING",
+     "4": "O"
    },
    "initializer_range": 0.02,
    "intermediate_size": 312,
    "label2id": {
-     "B-DISORDER_FINDING": 0,
-     "O": 1
+     "B-DISORDER": 0,
+     "B-FINDING": 1,
+     "I-DISORDER": 2,
+     "I-FINDING": 3,
+     "O": 4
    },
    "layer_norm_eps": 1e-12,
    "max_position_embeddings": 512,
@@ -27,6 +33,7 @@
    "pad_token_id": 0,
    "position_embedding_type": "absolute",
    "pre_trained": "",
+   "torch_dtype": "float32",
    "training": "",
    "transformers_version": "4.45.1",
    "type_vocab_size": 2,
eval_results.json ADDED
@@ -0,0 +1,20 @@
+ {
+   "epoch": 1.0,
+   "eval_DISORDER_f1": 0.0,
+   "eval_DISORDER_number": 2,
+   "eval_DISORDER_precision": 0.0,
+   "eval_DISORDER_recall": 0.0,
+   "eval_FINDING_f1": 0.0,
+   "eval_FINDING_number": 0,
+   "eval_FINDING_precision": 0.0,
+   "eval_FINDING_recall": 0.0,
+   "eval_loss": 1.7064510583877563,
+   "eval_overall_accuracy": 0.11764705882352941,
+   "eval_overall_f1": 0.0,
+   "eval_overall_precision": 0.0,
+   "eval_overall_recall": 0.0,
+   "eval_runtime": 0.1739,
+   "eval_samples": 3,
+   "eval_samples_per_second": 17.256,
+   "eval_steps_per_second": 5.752
+ }
model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:8a637edcc6d6ebfd24fdd2df56ffec67e4170ffc3a5a07a186b712b100a03f9f
- size 48864824
+ oid sha256:70bb4470bd4149e9aa0d4ab934e0e4ffc533b8af1702686320f4b3d57f8a062e
+ size 48868580
model_info.json CHANGED
@@ -1,16 +1,16 @@
  {
-   "model_version": 1768930002,
-   "model_name": "bert-demo-de",
-   "model_type": "bert",
+   "model_version": 1774364768,
+   "model_name": "tinybert-demo-de",
+   "model_type": "tinybert",
    "model_platform": "pytorch",
-   "model_architecture": "BERT",
+   "model_architecture": "TinyBERT",
    "model_description": "Retrieve named entities from text.",
-   "model_date": "2026-01-20T18:26:42.445432+01:00",
-   "clinalytix_version": "unknown",
+   "model_date": "2026-03-24T15:06:08.594093+00:00",
+   "clinalytix_version": "26.03.0",
    "model_objective": "RECOGNITION",
    "use_case": "demo",
-   "build_number": null,
-   "revision_number": null,
+   "build_number": "10",
+   "revision_number": "7a69bd200ca16eb3f14e380484a5fb61afc70893",
    "language_code": "de",
    "language_codes_multilingual": null,
    "target": null,
runs/Mar24_12-30-41_ip-10-246-1-57/events.out.tfevents.1774355449.ip-10-246-1-57.121762.0 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:9da6d43d0ed82cd545e671a6d07a7b8dded3014c3cf749d1944564cfab90944f
+ size 6072

runs/Mar24_12-30-41_ip-10-246-1-57/events.out.tfevents.1774355450.ip-10-246-1-57.121762.1 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:13b01f02d5d3b877434f77fe121a102a0f2e3a814abbed9e98729abb747f0b8b
+ size 1041
train_results.json ADDED
@@ -0,0 +1,9 @@
+ {
+   "epoch": 1.0,
+   "total_flos": 21821285850.0,
+   "train_loss": 0.8611775636672974,
+   "train_runtime": 1.1366,
+   "train_samples": 17,
+   "train_samples_per_second": 14.957,
+   "train_steps_per_second": 0.88
+ }
trainer_state.json ADDED
@@ -0,0 +1,49 @@
+ {
+   "best_metric": null,
+   "best_model_checkpoint": null,
+   "epoch": 1.0,
+   "eval_steps": 10000,
+   "global_step": 1,
+   "is_hyper_param_search": false,
+   "is_local_process_zero": true,
+   "is_world_process_zero": true,
+   "log_history": [
+     {
+       "epoch": 1.0,
+       "grad_norm": 4.18489933013916,
+       "learning_rate": 0.0,
+       "loss": 0.8612,
+       "step": 1
+     },
+     {
+       "epoch": 1.0,
+       "step": 1,
+       "total_flos": 21821285850.0,
+       "train_loss": 0.8611775636672974,
+       "train_runtime": 1.1366,
+       "train_samples_per_second": 14.957,
+       "train_steps_per_second": 0.88
+     }
+   ],
+   "logging_steps": 10,
+   "max_steps": 1,
+   "num_input_tokens_seen": 0,
+   "num_train_epochs": 1,
+   "save_steps": 15000,
+   "stateful_callbacks": {
+     "TrainerControl": {
+       "args": {
+         "should_epoch_stop": false,
+         "should_evaluate": false,
+         "should_log": false,
+         "should_save": true,
+         "should_training_stop": true
+       },
+       "attributes": {}
+     }
+   },
+   "total_flos": 21821285850.0,
+   "train_batch_size": 32,
+   "trial_name": null,
+   "trial_params": null
+ }
training_args.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:253c832ed6cc1e1c6e4407d924248dfb9827691f8ff7f7c2cbbb03eb7f1936d7
+ size 5841