DedalusHealthCare/tinybert-demo-de

@@ -37,7 +37,7 @@ It was fine-tuned from the [DedalusHealthCare/tinybert-mlm-de](https://huggingfa
 **Entities**: DISORDER_FINDING
-**Model Format**: PYTORCH
+**Model Format**: PYTORCH+ONNX
 **Please use `max` as aggregation strategy in the NER pipeline (see example below)**.
@@ -121,6 +121,47 @@ predicted_token_class_ids = predictions.argmax(-1)
 labels = [model.config.id2label[id.item()] for id in predicted_token_class_ids[0]]
 ```
+### Using ONNX Runtime (Optimized Inference)
+```python
+from optimum.onnxruntime import ORTModelForTokenClassification
+from transformers import AutoTokenizer, pipeline
+import torch
+# Load ONNX model for faster inference
+model_name = "DedalusHealthCare/tinybert-demo-de"
+onnx_model = ORTModelForTokenClassification.from_pretrained(model_name)
+tokenizer = AutoTokenizer.from_pretrained(model_name)
+# Create pipeline with ONNX model (recommended)
+ner_pipeline = pipeline(
+    "ner",
+    model=onnx_model,
+    tokenizer=tokenizer,
+    aggregation_strategy="max"
+)
+# Example text
+text = "Der Patient hat Diabetes und Bluthochdruck."
+entities = ner_pipeline(text)
+print(entities)
+# Direct model usage (without the pipeline): one predicted label per token
+inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True)
+with torch.no_grad():
+    outputs = onnx_model(**inputs)
+    predictions = torch.nn.functional.softmax(outputs.logits, dim=-1)
+predicted_token_class_ids = predictions.argmax(-1)
+# `class_id` avoids shadowing the built-in `id`
+token_labels = [onnx_model.config.id2label[class_id.item()] for class_id in predicted_token_class_ids[0]]
+```
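The `max` aggregation strategy passed to the pipeline above decides which label a word receives when the tokenizer splits it into several sub-word tokens: the word takes the label of the token with the single highest score. The following is a simplified, pure-Python sketch of that idea, not the library's implementation. The function name `aggregate_word_max`, the `O` label, and all scores are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple


def aggregate_word_max(token_scores: List[Dict[str, float]]) -> Tuple[str, float]:
    """Return the label and score of the most confident sub-word token.

    Simplified illustration of "max" aggregation; `token_scores` holds one
    {label: score} mapping per sub-word token of a single word.
    """
    best_label, best_score = "", float("-inf")
    for scores in token_scores:
        # Top label for this token
        label = max(scores, key=scores.get)
        if scores[label] > best_score:
            best_label, best_score = label, scores[label]
    return best_label, best_score


# Hypothetical scores for one word split into three sub-word tokens.
# "O" (no entity) is an assumed label name; DISORDER_FINDING is from the model card.
word_tokens = [
    {"O": 0.55, "DISORDER_FINDING": 0.45},
    {"O": 0.10, "DISORDER_FINDING": 0.90},
    {"O": 0.60, "DISORDER_FINDING": 0.40},
]

label, score = aggregate_word_max(word_tokens)
print(label, score)
```

In this made-up example two of the three tokens lean towards `O`, yet the word is labelled `DISORDER_FINDING` because the second token is the most confident one. A first-token or averaging strategy could decide differently on the same scores.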
+### Performance Comparison
+- **PyTorch**: Standard format, suitable for training and research
+- **ONNX**: Optimized for inference, typically 2-4x faster than PyTorch (the actual speedup depends on hardware, batch size, and sequence length)
+- **Recommendation**: Use ONNX for production inference, PyTorch for research
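Speedup figures like the one in the list above vary between machines, so it may be worth measuring latency on the target hardware. Below is a minimal timing sketch under stated assumptions: the helper `mean_latency_ms` and the two stand-in workloads are hypothetical and exist only so the snippet runs without downloading the model.

```python
import time
from statistics import mean
from typing import Callable, List


def mean_latency_ms(fn: Callable[[], object], n_runs: int = 20, warmup: int = 3) -> float:
    """Call `fn` repeatedly and return its mean wall-clock latency in milliseconds."""
    # Warm-up calls are excluded so one-time setup costs do not skew the mean
    for _ in range(warmup):
        fn()
    timings: List[float] = []
    for _ in range(n_runs):
        start = time.perf_counter()
        fn()
        timings.append((time.perf_counter() - start) * 1000.0)
    return mean(timings)


# Stand-in workloads (assumptions, not real models)
def fast_stand_in() -> int:
    return sum(range(1_000))


def slow_stand_in() -> None:
    time.sleep(0.005)


fast_ms = mean_latency_ms(fast_stand_in, n_runs=5, warmup=1)
slow_ms = mean_latency_ms(slow_stand_in, n_runs=5, warmup=1)
print(f"fast: {fast_ms:.3f} ms, slow: {slow_ms:.3f} ms")
```

To compare the real backends, one option is to pass `lambda: ner_pipeline(text)` once with the ONNX-backed pipeline and once with a PyTorch-backed pipeline, using the same text and run counts for both.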
 ## Model Architecture
 This model is based on the TinyBERT architecture with a token classification head for Named Entity Recognition.