RoBERTa-large fine-tuned on OntoNotes 5.0

This model is a fine-tuned version of FacebookAI/roberta-large on the English portion of the OntoNotes 5.0 (CoNLL-2012) dataset. RoBERTa-large has 24 layers and ~355M parameters, giving it stronger contextual representations than the base model for complex Named Entity Recognition (NER) tasks.

📊 Performance

The following results were achieved on the OntoNotes 5.0 (v12) test set:

Entity Precision Recall F1-Score Support
CARDINAL 0.7769 0.7900 0.7834 1005
DATE 0.8211 0.8533 0.8369 1786
EVENT 0.5702 0.7647 0.6533 85
FAC 0.7123 0.6980 0.7051 149
GPE 0.9262 0.9470 0.9365 2546
LANGUAGE 0.7500 0.6818 0.7143 22
LAW 0.5000 0.6364 0.5600 44
LOC 0.6597 0.7302 0.6932 215
MONEY 0.8730 0.9099 0.8910 355
NORP 0.9029 0.9485 0.9251 990
ORDINAL 0.6936 0.7874 0.7376 207
ORG 0.8870 0.9101 0.8984 2002
PERCENT 0.8703 0.9066 0.8881 407
PERSON 0.9250 0.9246 0.9248 2134
PRODUCT 0.7356 0.7111 0.7232 90
QUANTITY 0.6933 0.6797 0.6865 153
TIME 0.6211 0.6267 0.6239 225
WORK_OF_ART 0.6686 0.6923 0.6802 169
micro avg 0.8581 0.8831 0.8704 12584
macro avg 0.7548 0.7888 0.7701 12584
weighted avg 0.8596 0.8831 0.8710 12584
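
Per-entity scores of this form are typically produced with seqeval's classification_report over BIO-tagged predictions. The snippet below is only a minimal illustration of how such a report is generated, not the exact evaluation script used for the numbers above:

from seqeval.metrics import classification_report

# Gold and predicted tag sequences, one list of BIO tags per sentence
# (toy example; the real evaluation runs over the OntoNotes test set).
y_true = [["B-ORG", "I-ORG", "O", "B-GPE", "I-GPE"]]
y_pred = [["B-ORG", "I-ORG", "O", "B-GPE", "O"]]

# digits=4 matches the precision of the table above; the report also
# includes micro, macro, and weighted averages.
print(classification_report(y_true, y_pred, digits=4))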

🛠 Training Details

The 24-layer model was fine-tuned on 2× NVIDIA V100 GPUs with the following configuration (a sketch of the corresponding Trainer setup follows the list):

  • Architecture: RobertaForTokenClassification
  • Tokenizer: RobertaTokenizerFast (with add_prefix_space=True)
  • Learning Rate: 1e-5
  • Effective Batch Size: 32 (4 per device × 4 gradient accumulation steps × 2 GPUs)
  • Epochs: 5
  • Warmup Ratio: 0.1
  • Mixed Precision: FP16 enabled
  • Optimizer: AdamW with weight_decay=0.01
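
For reference, a minimal sketch of how these hyperparameters map onto the Hugging Face Trainer API. The dataset loading, label alignment, and num_labels value are assumptions (18 OntoNotes entity types under BIO tagging plus "O"), not the exact training script:

from transformers import (
    RobertaTokenizerFast,
    RobertaForTokenClassification,
    TrainingArguments,
    Trainer,
)

# add_prefix_space=True is required so pre-tokenized OntoNotes words can be
# encoded with the byte-level BPE tokenizer for token classification.
tokenizer = RobertaTokenizerFast.from_pretrained(
    "FacebookAI/roberta-large", add_prefix_space=True
)

# Assumption: 18 entity types in BIO format plus "O" gives 37 labels;
# id2label/label2id would normally be built from the dataset first.
model = RobertaForTokenClassification.from_pretrained(
    "FacebookAI/roberta-large", num_labels=37
)

training_args = TrainingArguments(
    output_dir="roberta-large-ontonotes5-ner",
    learning_rate=1e-5,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,  # 4 x 4 x 2 GPUs = effective batch of 32
    num_train_epochs=5,
    warmup_ratio=0.1,
    weight_decay=0.01,              # AdamW is the Trainer default optimizer
    fp16=True,
)

# train_dataset, eval_dataset, and the data collator are placeholders here;
# they would come from tokenizing the CoNLL-2012 formatted OntoNotes data.
# trainer = Trainer(model=model, args=training_args, train_dataset=...,
#                   eval_dataset=..., data_collator=..., tokenizer=tokenizer)
# trainer.train()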

📂 Project Assets

Asset File Description
Model Weights model.safetensors Fine-tuned RoBERTa-large weights (~1.42 GB).
Configuration config.json 24-layer configuration and id2label map.
Vocabulary vocab.json / merges.txt BPE vocabulary and byte-level merge rules.
Tokenizer tokenizer.json / tokenizer_config.json Complete fast tokenizer setup.
Special Tokens special_tokens_map.json Definitions for BOS, EOS, and Padding tokens.
Training Args training_args.bin Hyperparameters used during the training run.
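
The label mapping stored in config.json can be inspected directly; a small sketch assuming the checkpoint is pulled from the Hub under the repository name used in the Usage section:

from transformers import AutoConfig

config = AutoConfig.from_pretrained("learnrr/roberta-large-ontonotes5-ner")

# 24 hidden layers, as described above.
print(config.num_hidden_layers)

# id2label maps class indices to the BIO entity tags used at inference time.
for idx, label in sorted(config.id2label.items()):
    print(idx, label)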

🚀 Usage

from transformers import pipeline

model_checkpoint = "learnrr/roberta-large-ontonotes5-ner"
token_classifier = pipeline(
    "token-classification", 
    model=model_checkpoint, 
    aggregation_strategy="simple"
)

text = "The United Nations is headquartered in New York City."
results = token_classifier(text)

for entity in results:
    print(f"Entity: {entity['word']} | Label: {entity['entity_group']} | Score: {entity['score']:.4f}")