Bert_NER_Ausa

A lightweight token-classification (NER) model that extracts structured entities from short health-assistant utterances: appointments, symptoms, allergies, routines/medication, and profile details. It is the entity-extraction stage of the AUSA Hub voice/text assistant, paired with the SetFit intent router aadiausa/Set_Fit_Ausa.

Model details

  • Architecture: BertForTokenClassification (TinyBERT: 4 hidden layers, hidden size 312, 12 attention heads, ~14.5M parameters)
  • Tokenizer: WordPiece, uncased (bert-base-uncased vocab, 30522 tokens)
  • Max sequence length: 512 tokens
  • Tagging scheme: BIO, with 55 labels = O + B-/I- for each of 27 entity types
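
To make the label arithmetic concrete, here is a minimal sketch that builds a BIO label set from the 27 entity types named in this card and merges per-token tags into spans. The helper names (build_bio_labels, bio_to_spans) and the label order are assumptions for illustration; the authoritative id-to-label mapping is the id2label field in the model's config.

```python
# Entity type names as listed in this card; the order used here is an assumption.
ENTITY_TYPES = [
    "ALLERGY", "SYMPTOM", "SEVERITY", "ONSET", "DOSAGE",
    "PROVIDER", "DATE", "START_TIME", "END_TIME", "ROUTINE", "FREQUENCY",
    "SCHEDULED_TIME", "START_DATE", "END_DATE", "INTERVAL", "DURATION",
    "DAY_OF_WEEK",
    "FULL_NAME", "EMAIL", "PHONE", "ADDRESS", "GENDER", "HEIGHT", "WEIGHT",
    "RELATION", "INVITE_METHOD", "PERMISSION",
]


def build_bio_labels(entity_types):
    """Return ["O", "B-X", "I-X", ...] for the given entity types."""
    labels = ["O"]
    for name in entity_types:
        labels.append(f"B-{name}")
        labels.append(f"I-{name}")
    return labels


def bio_to_spans(tokens, tags):
    """Merge per-token BIO tags into (entity_type, text) spans."""
    spans = []
    current_type, current_tokens = None, []
    for token, tag in zip(tokens, tags):
        # Split only on the first hyphen, so "B-START_TIME" -> ("B", "START_TIME").
        prefix, _, name = tag.partition("-")
        if prefix == "I" and name == current_type:
            # Continuation of the span that is currently open.
            current_tokens.append(token)
            continue
        if current_type is not None:
            # Any other tag closes the open span.
            spans.append((current_type, " ".join(current_tokens)))
        if prefix in ("B", "I"):
            # A stray I- tag with no matching open span starts a new one.
            current_type, current_tokens = name, [token]
        else:
            current_type, current_tokens = None, []
    if current_type is not None:
        spans.append((current_type, " ".join(current_tokens)))
    return spans


LABELS = build_bio_labels(ENTITY_TYPES)
print(len(LABELS))  # 1 + 2 * 27

tokens = ["see", "dr", "patel", "next", "monday"]
tags = ["O", "B-PROVIDER", "I-PROVIDER", "B-DATE", "I-DATE"]
print(bio_to_spans(tokens, tags))
```

The span merging is a simplified stand-in for what an aggregation step does with B-/I- tags; it works on whole tokens and does not handle WordPiece sub-token pieces.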

Entity types (27)

Domain                  Entities
Symptoms & allergies    ALLERGY, SYMPTOM, SEVERITY, ONSET, DOSAGE
Scheduling              PROVIDER, DATE, START_TIME, END_TIME, ROUTINE, FREQUENCY, SCHEDULED_TIME, START_DATE, END_DATE, INTERVAL, DURATION, DAY_OF_WEEK
Profile / contacts      FULL_NAME, EMAIL, PHONE, ADDRESS, GENDER, HEIGHT, WEIGHT, RELATION, INVITE_METHOD, PERMISSION

Usage

from transformers import AutoTokenizer, AutoModelForTokenClassification, pipeline

model_id = "aadiausa/Bert_NER_Ausa"
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForTokenClassification.from_pretrained(model_id)

ner = pipeline("token-classification", model=model, tokenizer=tok,
               aggregation_strategy="simple")

ner("book an appointment with Dr. Patel next Monday at 3pm")
# -> [{'entity_group': 'PROVIDER', 'word': 'dr. patel', ...},
#     {'entity_group': 'DATE', 'word': 'next monday', ...},
#     {'entity_group': 'START_TIME', 'word': '3pm', ...}]
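
A downstream component usually wants the pipeline result as slots keyed by entity type. The following is a minimal sketch of that step; group_entities and the min_score threshold are hypothetical, and the sample list is hand-written in the pipeline's output shape with made-up scores, not real model output.

```python
from collections import defaultdict


def group_entities(entities, min_score=0.5):
    """Collect pipeline results into {entity_group: [word, ...]}.

    `min_score` is an illustrative threshold, not a value from this model card.
    """
    slots = defaultdict(list)
    for ent in entities:
        if ent["score"] >= min_score:
            slots[ent["entity_group"]].append(ent["word"])
    return dict(slots)


# Hand-written sample mimicking aggregation_strategy="simple" output.
# Scores are invented for illustration; offsets refer to the utterance
# "book an appointment with Dr. Patel next Monday at 3pm".
sample = [
    {"entity_group": "PROVIDER", "word": "dr. patel", "score": 0.98, "start": 25, "end": 34},
    {"entity_group": "DATE", "word": "next monday", "score": 0.95, "start": 35, "end": 46},
    {"entity_group": "DURATION", "word": "at", "score": 0.21, "start": 47, "end": 49},
    {"entity_group": "START_TIME", "word": "3pm", "score": 0.97, "start": 50, "end": 53},
]

print(group_entities(sample))
```

Values are kept as lists because one utterance can mention the same entity type more than once, for example two symptoms.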

Intended use & limitations

  • Designed for short, first-person health-assistant commands in English. Performance on long-form clinical notes or other domains is not guaranteed.
  • The model detects spans that may correspond to personal data (FULL_NAME, EMAIL, PHONE, ADDRESS); it does not validate, store, or de-identify them. Downstream handling is the integrator's responsibility.
  • Not a medical device and not intended for diagnosis or treatment decisions.
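
As one possible form of the downstream handling mentioned above, an integrator could mask detected personal-data spans before logging an utterance. This is a sketch under assumptions: redact_pii is a hypothetical helper, the entities are assumed to carry start/end character offsets (as the transformers token-classification pipeline returns with a fast tokenizer), and the sample entities are hand-written rather than model output.

```python
PII_TYPES = {"FULL_NAME", "EMAIL", "PHONE", "ADDRESS"}


def redact_pii(text, entities, pii_types=PII_TYPES):
    """Replace PII spans in `text` with [TYPE] placeholders using character offsets."""
    redacted = text
    # Work right to left so earlier offsets stay valid after each replacement.
    for ent in sorted(entities, key=lambda e: e["start"], reverse=True):
        if ent["entity_group"] in pii_types:
            placeholder = f"[{ent['entity_group']}]"
            redacted = redacted[:ent["start"]] + placeholder + redacted[ent["end"]:]
    return redacted


def make_entity(text, group, surface):
    """Build a pipeline-shaped entity dict for `surface` (illustration only)."""
    start = text.index(surface)
    return {"entity_group": group, "word": surface, "start": start, "end": start + len(surface)}


utterance = "my email is sam@example.com and my phone is 555-0100"
entities = [
    make_entity(utterance, "EMAIL", "sam@example.com"),
    make_entity(utterance, "PHONE", "555-0100"),
]

print(redact_pii(utterance, entities))
```

Masking only covers spans the model actually detects, so missed entities pass through unchanged; it is not a substitute for a full de-identification process.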