edloginovad committed on
Commit ec6b26f · verified · 1 Parent(s): bab15be

Training in progress, step 6

README.md CHANGED
@@ -1,172 +1,70 @@
  ---
  license: other
  base_model: DedalusHealthCare/tinybert-mlm-de
  tags:
- - token-classification
- - ner
- - medical
- - demo
- - de
- - pytorch
- - transformers
- language:
- - de
- pipeline_tag: token-classification
- library_name: transformers
  model-index:
- - name: TinyBERT for Demo NER
-   results:
-   - task:
-       type: token-classification
-       name: Named Entity Recognition
-     dataset:
-       type: demo
-       name: Demo Dataset
-       config: de
-     metrics:
-     - type: f1
-       value: # Will be updated after evaluation
-       name: F1 Score
-     - type: precision
-       value: # Will be updated after evaluation
-       name: Precision
-     - type: recall
-       value: # Will be updated after evaluation
-       name: Recall
  ---

- # TinyBERT for Demo NER (DE)
-
- ## Model Description
-
- This model is a fine-tuned TinyBERT model for Named Entity Recognition (NER) of DISORDER_FINDING entities in German medical texts.
-
- **Base Model**: DedalusHealthCare/tinybert-mlm-de
-
- **Language**: German (de)
-
- **Task**: Token Classification (NER)
-
- **Entities**: DISORDER_FINDING
-
- ## Training Details
-
- ### Training Dataset
-
- **Dataset**: `DedalusHealthCare/ner_demo_de@2025.10.21.12.36.59`
-
- The model was trained on a versioned dataset with timestamp-based versioning for reproducibility.
-
- ### Training Configuration
- - **Training epochs**: 1
- - **Learning rate**: 5e-05
- - **Training batch size**: 32
- - **Evaluation batch size**: 32
- - **Max sequence length**: N/A
- - **Warmup steps**: 0
- - **Weight decay**: 0.01
- - **Gradient accumulation steps**: 2
- - **Mixed precision (FP16)**: False
-
- ### Training Framework
- - **Framework**: PyTorch with HuggingFace Transformers
- - **Optimizer**: AdamW
- - **Scheduler**: Linear with warmup
-
- ## Usage
-
- ### Quick Start with Pipeline
-
- ```python
- from transformers import pipeline
-
- # Initialize the NER pipeline
- ner_pipeline = pipeline(
-     "ner",
-     model="DedalusHealthCare/tinybert-demo-de",
-     tokenizer="DedalusHealthCare/tinybert-demo-de",
-     aggregation_strategy="simple"
- )
-
- # Example usage
- text = "Your medical text here"
- entities = ner_pipeline(text)
- print(entities)
- ```
-
- ### Advanced Usage
-
- ```python
- from transformers import AutoTokenizer, AutoModelForTokenClassification
- import torch
-
- # Load model and tokenizer
- model_name = "DedalusHealthCare/tinybert-demo-de"
- tokenizer = AutoTokenizer.from_pretrained(model_name)
- model = AutoModelForTokenClassification.from_pretrained(model_name)
-
- # Set model to evaluation mode
- model.eval()
-
- # Tokenize text
- text = "Your medical text here"
- inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True)
-
- # Get predictions
- with torch.no_grad():
-     outputs = model(**inputs)
-     predictions = torch.nn.functional.softmax(outputs.logits, dim=-1)
-
- # Get predicted labels
- predicted_token_class_ids = predictions.argmax(-1)
- labels = [model.config.id2label[id.item()] for id in predicted_token_class_ids[0]]
- ```
-
- ## Model Performance
-
- Performance metrics will be updated after evaluation on the validation set.
-
- ## Intended Use
-
- This model is specifically designed for:
- - Named Entity Recognition in German medical texts
- - Identification of DISORDER_FINDING entities
- - Medical document processing and analysis
- - Clinical NLP research and applications
-
- ## Limitations
-
- - Trained specifically for German medical texts
- - Performance may vary on different medical domains or institutions
- - May require domain adaptation for optimal performance on new datasets
- - Subject to biases present in the training data
-
- ## Ethical Considerations
-
- - This model processes medical data and should be used responsibly
- - All predictions should be validated by qualified medical professionals
- - Patient privacy and data protection regulations must be followed
- - The model may exhibit biases from the training data
-
- ## Citation
-
- If you use this model, please cite:
-
- ```bibtex
- @model{demo_de_ner_model,
-   title = {TinyBERT for Demo NER (DE)},
-   author = {DH Healthcare GmbH},
-   year = {2025},
-   publisher = {Hugging Face},
-   base_model = {DedalusHealthCare/tinybert-mlm-de},
-   url = {https://huggingface.co/DedalusHealthCare/tinybert-demo-de}
- }
- ```
-
- ## License
-
- This model is proprietary and owned by DH Healthcare GmbH. All rights reserved.
-
- ## Contact
-
- For questions or support regarding this model, please contact DH Healthcare GmbH.
  ---
+ library_name: transformers
+ language:
+ - multilingual
  license: other
  base_model: DedalusHealthCare/tinybert-mlm-de
  tags:
+ - generated_from_trainer
+ datasets:
+ - ner_demo_de
  model-index:
+ - name: tinybert-demo-de
+   results: []
  ---

+ <!-- This model card has been generated automatically according to the information the Trainer had access to. You
+ should probably proofread and complete it, then remove this comment. -->

+ # tinybert-demo-de

+ This model is a fine-tuned version of [DedalusHealthCare/tinybert-mlm-de](https://huggingface.co/DedalusHealthCare/tinybert-mlm-de) on the ner_demo_de dataset.
+ It achieves the following results on the evaluation set:
+ - Loss: 0.4318
+ - Disorder Finding Precision: 0.0
+ - Disorder Finding Recall: 0.0
+ - Disorder Finding F1: 0.0
+ - Disorder Finding Number: 15
+ - Overall Precision: 0.0
+ - Overall Recall: 0.0
+ - Overall F1: 0.0
+ - Overall Accuracy: 0.8945

+ ## Model description

+ More information needed

+ ## Intended uses & limitations

+ More information needed

+ ## Training and evaluation data

+ More information needed

+ ## Training procedure

+ ### Training hyperparameters

+ The following hyperparameters were used during training:
+ - learning_rate: 5e-05
+ - train_batch_size: 32
+ - eval_batch_size: 32
+ - seed: 33
+ - gradient_accumulation_steps: 2
+ - total_train_batch_size: 64
+ - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+ - lr_scheduler_type: linear
+ - lr_scheduler_warmup_ratio: 0.1
+ - num_epochs: 1

+ ### Training results

+ ### Framework versions

+ - Transformers 4.45.1
+ - Pytorch 2.6.0+cu124
+ - Datasets 2.16.0
+ - Tokenizers 0.20.3
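The new card's metric pattern (entity-level precision/recall/F1 of 0.0 alongside token accuracy near 0.89) is what an early checkpoint that predicts only the `O` tag produces: almost all tokens are `O`, so token accuracy is high even though no entity is found. A toy illustration of this effect, using hypothetical BIO tag sequences rather than the actual evaluation data:

```python
# Toy illustration: a model that predicts only "O" gets zero entity-level
# precision/recall/F1, yet high token accuracy, because most tokens are "O".
gold = ["O", "O", "B-DISORDER_FINDING", "I-DISORDER_FINDING",
        "O", "O", "O", "O", "O", "O"]
pred = ["O"] * len(gold)

token_accuracy = sum(g == p for g, p in zip(gold, pred)) / len(gold)

def entities(tags):
    """Collect (start, end) entity spans from a BIO tag sequence."""
    spans, start = [], None
    for i, t in enumerate(tags + ["O"]):  # sentinel flushes a trailing entity
        if t.startswith("B-"):
            if start is not None:
                spans.append((start, i))
            start = i
        elif not t.startswith("I-"):
            if start is not None:
                spans.append((start, i))
            start = None
    return set(spans)

gold_ents, pred_ents = entities(gold), entities(pred)
tp = len(gold_ents & pred_ents)
precision = tp / len(pred_ents) if pred_ents else 0.0
recall = tp / len(gold_ents) if gold_ents else 0.0
f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0

print(token_accuracy, precision, recall, f1)  # 0.8 0.0 0.0 0.0
```

Libraries such as seqeval compute the entity-level scores the Trainer reports in essentially this span-matching way.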
model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:070a810d9887cb2c43273330102bc49aa3501ad935fc008cc04b8df5b632f5f4
+ oid sha256:47add865fbb3e881df088c028919d01817b0a77ba664136d55ae456125fdd1ab
  size 48864824
runs/Oct24_07-52-45_ip-172-31-12-22/events.out.tfevents.1761292368.ip-172-31-12-22.19193.0 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:70c9bdeab5fbdb0db0c673c6f38c53408dd860fc94ae23a4e7202e1cde089b2d
+ size 5897
training_args.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:f2d518fb1ddd71c821d2aad57772aaa988abed1ff92c1d5b1f278f41dcd27ef1
+ oid sha256:df6582a3f1f5fd6bfdd020303be653eab241de2d34b71eff9960d193b2a0c72f
  size 5368
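The binary files in this commit are stored as Git LFS pointer files, whose three-field format (`version`, `oid`, `size`) is exactly what the diffs above show. A small sketch of parsing such a pointer into its fields (the parser is illustrative, not part of this repo; the sample text is the new `training_args.bin` pointer from the diff):

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS v1 pointer file into a {key: value} dict.

    Each line is "<key> <value>", e.g. "size 5368".
    """
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields

pointer = """\
version https://git-lfs.github.com/spec/v1
oid sha256:df6582a3f1f5fd6bfdd020303be653eab241de2d34b71eff9960d193b2a0c72f
size 5368
"""

info = parse_lfs_pointer(pointer)
print(info["size"])  # 5368
```

Because only the pointer is versioned, a commit like this changes the `oid` (the SHA-256 of the new blob) while the repo history stays small; the actual 48 MB safetensors file lives in LFS storage.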