metadata
license: other
base_model: DedalusHealthCare/tinybert-mlm-en
datasets:
  - DedalusHealthCare/ner_demo_en
task_categories:
  - token-classification
task_ids:
  - named-entity-recognition
language:
  - en
tags:
  - token-classification
  - ner
  - named-entity-recognition
  - en
  - disorder_finding
library_name: transformers
pipeline_tag: token-classification

TinyBERT for Demo NER (English)

Model Description

This model is a fine-tuned TinyBERT model for Named Entity Recognition (NER) of DISORDER_FINDING entities in English medical texts.

It was fine-tuned from the DedalusHealthCare/tinybert-mlm-en masked language model using the DedalusHealthCare/ner_demo_en dataset.

Base Model: DedalusHealthCare/tinybert-mlm-en

Training Dataset: DedalusHealthCare/ner_demo_en

Task: Token Classification (Named Entity Recognition)

Language: English (en)

Entities: DISORDER_FINDING

Model Format: PyTorch

Use aggregation_strategy="max" when building the NER pipeline, as in the pipeline example under Usage.
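
To illustrate what the max strategy does, the following is a minimal sketch that mimics it in plain Python; it is not the transformers implementation. With this strategy, a word that the tokenizer splits into several subword tokens takes the label of its single most confident token prediction. The label mapping, the scores, and the helper aggregate_max are invented for illustration.

```python
# Illustrative sketch of "max" aggregation; not the transformers implementation.
# The label mapping and scores below are invented for the example.

def aggregate_max(token_scores, id2label):
    """Pick one label for a word from the score rows of its subword tokens.

    token_scores: one list of scores per subword token (one score per label id).
    Returns (label, score) of the single most confident token prediction.
    """
    # Select the token whose best score is the highest among all tokens of the word.
    best_row = max(token_scores, key=max)
    best_score = max(best_row)
    best_id = best_row.index(best_score)
    return id2label[best_id], best_score


id2label = {0: "O", 1: "DISORDER_FINDING"}  # hypothetical label mapping

# One word split into three subword tokens; columns follow the id2label order.
word_scores = [
    [0.60, 0.40],
    [0.05, 0.95],
    [0.30, 0.70],
]

label, score = aggregate_max(word_scores, id2label)
print(label, score)
```

Here the second subword token is the most confident one, so the whole word is labelled DISORDER_FINDING with a score of 0.95, even though the first token on its own leans towards O.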

Training Details

  • Training epochs: 1
  • Learning rate: 5e-05
  • Training batch size: 32
  • Evaluation batch size: 32
  • Max sequence length: 256
  • Warmup ratio: 0.1
  • Weight decay: 0.01
  • FP16: True
  • Gradient accumulation steps: 2
  • Save steps: 50000
  • Evaluation steps: 50000
  • Evaluation strategy: steps
  • Random seed: 1
  • Label all tokens: True
  • Balanced training: False
  • Chunk mode: sliding_window
  • Stride: 16
  • Max training samples: None
  • Max evaluation samples: None
  • Early stopping patience: 0
  • Early stopping threshold: 0.0
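
Two of the settings above are easier to read with the arithmetic spelled out. The sketch below works through them in plain Python under stated assumptions: that training ran on a single device (the card does not say), and that Stride is the number of tokens shared between neighbouring windows, as with the stride argument of Hugging Face tokenizers. The helper sliding_windows is hypothetical and ignores special tokens.

```python
# Values copied from the training details; everything else is an assumption.
TRAIN_BATCH_SIZE = 32
GRAD_ACCUM_STEPS = 2
MAX_SEQ_LENGTH = 256
STRIDE = 16

# Examples contributing to each optimizer update (device count assumed to be 1).
effective_batch_size = TRAIN_BATCH_SIZE * GRAD_ACCUM_STEPS


def sliding_windows(token_ids, max_len=MAX_SEQ_LENGTH, overlap=STRIDE):
    """Split a token sequence into windows of at most max_len tokens.

    Assumes `overlap` tokens are shared between neighbouring windows;
    special tokens such as [CLS] and [SEP] are ignored for simplicity.
    """
    step = max_len - overlap
    windows = []
    start = 0
    while True:
        windows.append(token_ids[start:start + max_len])
        if start + max_len >= len(token_ids):
            break
        start += step
    return windows


# A made-up document of 600 tokens, represented by its token positions.
chunks = sliding_windows(list(range(600)))
print(effective_batch_size)
print([len(chunk) for chunk in chunks])
```

Under these assumptions each optimizer update sees 64 examples, and a 600-token document is split into windows of 256, 256, and 120 tokens, with the last 16 tokens of one window repeated at the start of the next.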

Build Information

Use Case Configuration

  • Use case name: demo
  • Language: English (en)
  • Target entities: DISORDER_FINDING
  • Text processing max length: N/A
  • Entity labeling scheme: N/A

Usage

Using Transformers Pipeline

from transformers import pipeline

# Load the model
ner_pipeline = pipeline(
    "ner",
    model="DedalusHealthCare/tinybert-ner-demo-en",
    tokenizer="DedalusHealthCare/tinybert-ner-demo-en",
    aggregation_strategy="max"
)

# Example text
text = "The patient has diabetes and high blood pressure."

# Get predictions
entities = ner_pipeline(text)
print(entities)
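
With an aggregation strategy set, the pipeline returns a list of dictionaries with the keys entity_group, score, word, start, and end. The sketch below shows one way to filter such output by confidence and recover the surface text from the character offsets. The sample predictions and the 0.5 threshold are invented so that the snippet runs without downloading the model; confident_spans is a hypothetical helper.

```python
# Hypothetical pipeline output; the scores and spans are invented for illustration.
text = "The patient has diabetes and high blood pressure."
entities = [
    {"entity_group": "DISORDER_FINDING", "score": 0.98, "word": "diabetes", "start": 16, "end": 24},
    {"entity_group": "DISORDER_FINDING", "score": 0.91, "word": "high blood pressure", "start": 29, "end": 48},
    {"entity_group": "DISORDER_FINDING", "score": 0.32, "word": "patient", "start": 4, "end": 11},
]


def confident_spans(text, entities, threshold=0.5):
    """Return (label, surface text, score) for predictions at or above the threshold."""
    return [
        (entity["entity_group"], text[entity["start"]:entity["end"]], entity["score"])
        for entity in entities
        if entity["score"] >= threshold
    ]


for label, span, score in confident_spans(text, entities):
    print(f"{label}\t{span}\t{score:.2f}")
```

Slicing the original text with start and end is usually preferable to reading the word field, because word is rebuilt from tokens and may differ from the input in casing or spacing.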

Using AutoModel and AutoTokenizer

from transformers import AutoTokenizer, AutoModelForTokenClassification
import torch

# Load model and tokenizer
model_name = "DedalusHealthCare/tinybert-ner-demo-en"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForTokenClassification.from_pretrained(model_name)

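# Tokenize text (truncated to the 256-token length used in training)
text = "The patient has diabetes and high blood pressure."
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=256)

# Get per-token class probabilities
with torch.no_grad():
    outputs = model(**inputs)
    probabilities = torch.nn.functional.softmax(outputs.logits, dim=-1)

# Map each token (including special tokens) to its most likely label
predicted_class_ids = probabilities.argmax(dim=-1)[0]
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0].tolist())
labels = [model.config.id2label[class_id.item()] for class_id in predicted_class_ids]
for token, label in zip(tokens, labels):
    print(token, label)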
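
The labels list above holds one label per token, including special tokens and subword pieces. A minimal sketch of turning such labels into character spans follows. It assumes that non-entity tokens are labelled O and that entity labels may carry B- or I- prefixes; the card does not state the labelling scheme, so check model.config.id2label first. In practice the offsets would come from calling a fast tokenizer with return_offsets_mapping=True; here the labels and offsets are hard-coded, and merge_token_labels is a hypothetical helper.

```python
def merge_token_labels(labels, offsets, outside="O"):
    """Merge adjacent tokens of the same entity type into (type, start, end) spans.

    labels:  one label per token (assumed scheme: "O", optional "B-"/"I-" prefixes).
    offsets: one (start, end) character pair per token; special tokens have start == end.
    """
    spans = []
    current = None  # [entity_type, start, end] of the span being built
    for label, (start, end) in zip(labels, offsets):
        if start == end or label == outside:
            # Special or non-entity token: close any open span.
            if current:
                spans.append(tuple(current))
                current = None
            continue
        entity_type = label[2:] if label[:2] in ("B-", "I-") else label
        if label.startswith("B-") or current is None or current[0] != entity_type:
            if current:
                spans.append(tuple(current))
            current = [entity_type, start, end]
        else:
            current[2] = end  # extend the open span
    if current:
        spans.append(tuple(current))
    return spans


# Invented labels and offsets for:
# "The patient has diabetes and high blood pressure."
labels = ["O", "O", "O", "O", "B-DISORDER_FINDING", "O",
          "B-DISORDER_FINDING", "I-DISORDER_FINDING", "I-DISORDER_FINDING", "O", "O"]
offsets = [(0, 0), (0, 3), (4, 11), (12, 15), (16, 24), (25, 28),
           (29, 33), (34, 39), (40, 48), (48, 49), (0, 0)]

print(merge_token_labels(labels, offsets))
```

For most uses the pipeline with an aggregation strategy already performs this grouping; a hand-written merge like this is mainly useful when custom post-processing is needed.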

Model Architecture

This model is based on the TinyBERT architecture with a token classification head for Named Entity Recognition.

Intended Use

This model is intended for:

  • Named Entity Recognition in English medical texts
  • Identification of DISORDER_FINDING entities
  • Medical text processing and analysis
  • Research and development in medical NLP

Limitations

  • Trained specifically for English medical texts
  • Performance may vary on texts from different medical domains
  • May not generalize well to non-medical texts
  • Requires careful evaluation on new datasets
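
One way to carry out such an evaluation is to compare predicted spans against gold annotations using exact-match precision, recall, and F1. The sketch below does this in plain Python; the gold and predicted spans are invented, and span_prf is a hypothetical helper rather than part of any library.

```python
def span_prf(gold, predicted):
    """Exact-match precision, recall and F1 over (label, start, end) spans."""
    gold_set, pred_set = set(gold), set(predicted)
    true_positives = len(gold_set & pred_set)
    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return precision, recall, f1


# Invented example: the second prediction has the wrong start offset,
# so it counts as both a false positive and a missed gold span.
gold = [("DISORDER_FINDING", 16, 24), ("DISORDER_FINDING", 29, 48)]
predicted = [("DISORDER_FINDING", 16, 24), ("DISORDER_FINDING", 34, 48)]

precision, recall, f1 = span_prf(gold, predicted)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Exact matching is strict: a prediction that covers only part of an entity earns no credit. A relaxed, overlap-based match may be more informative for some applications, but the two should be reported separately.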

Ethical Considerations

  • This model is trained on medical data and should be used responsibly
  • Outputs should be validated by medical professionals
  • Patient privacy and data protection regulations must be followed
  • The model may reflect biases present in the training data

Citation

If you use this model, please cite:

@misc{demo_en_ner_model,
  title = {TinyBERT for Demo NER (English)},
  author = {DH Healthcare GmbH},
  year = {2025},
  publisher = {Hugging Face},
  url = {https://huggingface.co/DedalusHealthCare/tinybert-ner-demo-en}
}

License

This model is proprietary and owned by DH Healthcare GmbH. All rights reserved.

Contact

For questions or support, please contact DH Healthcare GmbH.