RoBERTa-large fine-tuned on OntoNotes 5.0

This model is a fine-tuned version of FacebookAI/roberta-large on the English portion of the OntoNotes 5.0 (CoNLL-2012) dataset. RoBERTa-large has 24 layers and ~355M parameters, giving it stronger contextual representations than the base model for complex Named Entity Recognition (NER) tasks.

📊 Performance

The following results were achieved on the OntoNotes 5.0 (v12) test set:

Entity Precision Recall F1-Score Support
CARDINAL 0.7769 0.7900 0.7834 1005
DATE 0.8211 0.8533 0.8369 1786
EVENT 0.5702 0.7647 0.6533 85
FAC 0.7123 0.6980 0.7051 149
GPE 0.9262 0.9470 0.9365 2546
LANGUAGE 0.7500 0.6818 0.7143 22
LAW 0.5000 0.6364 0.5600 44
LOC 0.6597 0.7302 0.6932 215
MONEY 0.8730 0.9099 0.8910 355
NORP 0.9029 0.9485 0.9251 990
ORDINAL 0.6936 0.7874 0.7376 207
ORG 0.8870 0.9101 0.8984 2002
PERCENT 0.8703 0.9066 0.8881 407
PERSON 0.9250 0.9246 0.9248 2134
PRODUCT 0.7356 0.7111 0.7232 90
QUANTITY 0.6933 0.6797 0.6865 153
TIME 0.6211 0.6267 0.6239 225
WORK_OF_ART 0.6686 0.6923 0.6802 169
micro avg 0.8581 0.8831 0.8704 12584
macro avg 0.7548 0.7888 0.7701 12584
weighted avg 0.8596 0.8831 0.8710 12584
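
Per-entity scores of this form are typically produced with seqeval's classification_report over BIO-tagged predictions. The snippet below is only a minimal illustration of how such a report is generated, not the exact evaluation script used for the numbers above:

from seqeval.metrics import classification_report

# Gold and predicted tag sequences, one list of BIO tags per sentence
# (toy example; the real evaluation runs over the OntoNotes test set).
y_true = [["B-ORG", "I-ORG", "O", "B-GPE", "I-GPE"]]
y_pred = [["B-ORG", "I-ORG", "O", "B-GPE", "O"]]

# digits=4 matches the precision of the table above; the report also
# includes micro, macro, and weighted averages.
print(classification_report(y_true, y_pred, digits=4))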

🛠 Training Details

The 24-layer model was fine-tuned on 2× NVIDIA V100 GPUs with the following configuration (a sketch of the corresponding Trainer setup follows the list):

  • Architecture: RobertaForTokenClassification
  • Tokenizer: RobertaTokenizerFast (with add_prefix_space=True)
  • Learning Rate: 1e-5
  • Effective Batch Size: 32 (4 per device × 4 gradient accumulation steps × 2 GPUs)
  • Epochs: 5
  • Warmup Ratio: 0.1
  • Mixed Precision: FP16 enabled
  • Optimizer: AdamW with weight_decay=0.01
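
For reference, a minimal sketch of how these hyperparameters map onto the Hugging Face Trainer API. The dataset loading, label alignment, and num_labels value are assumptions (18 OntoNotes entity types under BIO tagging plus "O"), not the exact training script:

from transformers import (
    RobertaTokenizerFast,
    RobertaForTokenClassification,
    TrainingArguments,
    Trainer,
)

# add_prefix_space=True is required so pre-tokenized OntoNotes words can be
# encoded with the byte-level BPE tokenizer for token classification.
tokenizer = RobertaTokenizerFast.from_pretrained(
    "FacebookAI/roberta-large", add_prefix_space=True
)

# Assumption: 18 entity types in BIO format plus "O" gives 37 labels;
# id2label/label2id would normally be built from the dataset first.
model = RobertaForTokenClassification.from_pretrained(
    "FacebookAI/roberta-large", num_labels=37
)

training_args = TrainingArguments(
    output_dir="roberta-large-ontonotes5-ner",
    learning_rate=1e-5,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,  # 4 x 4 x 2 GPUs = effective batch of 32
    num_train_epochs=5,
    warmup_ratio=0.1,
    weight_decay=0.01,              # AdamW is the Trainer default optimizer
    fp16=True,
)

# train_dataset, eval_dataset, and the data collator are placeholders here;
# they would come from tokenizing the CoNLL-2012 formatted OntoNotes data.
# trainer = Trainer(model=model, args=training_args, train_dataset=...,
#                   eval_dataset=..., data_collator=..., tokenizer=tokenizer)
# trainer.train()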

📂 Project Assets

Asset File Description
Model Weights model.safetensors Fine-tuned RoBERTa-large weights (~1.42 GB).
Configuration config.json 24-layer configuration and id2label map.
Vocabulary vocab.json / merges.txt BPE vocabulary and byte-level merge rules.
Tokenizer tokenizer.json / tokenizer_config.json Complete fast tokenizer setup.
Special Tokens special_tokens_map.json Definitions for BOS, EOS, and Padding tokens.
Training Args training_args.bin Hyperparameters used during the training run.
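
The label mapping stored in config.json can be inspected directly; a small sketch assuming the checkpoint is pulled from the Hub under the repository name used in the Usage section:

from transformers import AutoConfig

config = AutoConfig.from_pretrained("learnrr/roberta-large-ontonotes5-ner")

# 24 hidden layers, as described above.
print(config.num_hidden_layers)

# id2label maps class indices to the BIO entity tags used at inference time.
for idx, label in sorted(config.id2label.items()):
    print(idx, label)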

🚀 Usage

from transformers import pipeline

model_checkpoint = "learnrr/roberta-large-ontonotes5-ner"
token_classifier = pipeline(
    "token-classification", 
    model=model_checkpoint, 
    aggregation_strategy="simple"
)

text = "The United Nations is headquartered in New York City."
results = token_classifier(text)

for entity in results:
    print(f"Entity: {entity['word']} | Label: {entity['entity_group']} | Score: {entity['score']:.4f}")