# BiLSTM-CRF for NER (OntoNotes 5.0)
This is a custom BiLSTM-CRF model trained on the English portion of the OntoNotes 5.0 (CoNLL-2012) dataset. Unlike Transformer-based models, this architecture combines the sequential feature extraction of a bidirectional LSTM with the structured inference of a Conditional Random Field (CRF) layer, and is initialized with pre-trained GloVe 300d word embeddings.
## Performance
The model was evaluated on the official OntoNotes 5.0 (v12) test set using `seqeval`. The table lists per-entity scores for the most frequent types; the averages are computed over all entity types, which is why the per-type supports do not sum to the totals:
| Entity | Precision | Recall | F1-Score | Support |
|---|---|---|---|---|
| CARDINAL | 0.7310 | 0.7572 | 0.7439 | 1005 |
| DATE | 0.7970 | 0.8309 | 0.8136 | 1786 |
| EVENT | 0.6180 | 0.6471 | 0.6322 | 85 |
| FAC | 0.5678 | 0.4497 | 0.5019 | 149 |
| GPE | 0.8621 | 0.8818 | 0.8718 | 2546 |
| LOC | 0.6491 | 0.6884 | 0.6682 | 215 |
| MONEY | 0.8575 | 0.8648 | 0.8612 | 355 |
| NORP | 0.8734 | 0.8778 | 0.8756 | 990 |
| ORG | 0.8195 | 0.8232 | 0.8213 | 2002 |
| PERSON | 0.8707 | 0.8454 | 0.8578 | 2134 |
| micro avg | 0.8099 | 0.8201 | 0.8150 | 12585 |
| macro avg | 0.7040 | 0.7073 | 0.7046 | 12585 |
| weighted avg | 0.8103 | 0.8201 | 0.8148 | 12585 |
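For reference, this is a minimal sketch of how such a report is produced with `seqeval`; the sentences and tags below are illustrative placeholders, not the actual evaluation data:

```python
from seqeval.metrics import classification_report

# Illustrative placeholder data; the real evaluation uses the OntoNotes v12 test set.
y_true = [["B-PERSON", "I-PERSON", "O", "B-GPE", "O"]]
y_pred = [["B-PERSON", "I-PERSON", "O", "B-GPE", "O"]]

# seqeval groups BIO tags into entity spans before scoring, so
# precision/recall/F1 are computed at the entity level, not per token.
print(classification_report(y_true, y_pred, digits=4))
```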
## Model Architecture
- Embedding Layer: GloVe 300d (wiki-gigaword), fine-tuned during training.
- Encoder: 2-layer Bi-directional LSTM with 512 hidden units.
- Decoder: Linear-chain CRF for optimal tag sequence decoding.
- Dropout: 0.5, applied to the embeddings and the LSTM outputs (a sketch of the full architecture follows).
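The exact model definition lives in the GitHub repository; the sketch below is a hypothetical reconstruction from the description above and from the constructor arguments used in the Usage section (`v_size`, `t_size`, `e_dim`, `h_dim`, `w_matrix`). It assumes the `pytorch-crf` package (`torchcrf`) for the CRF layer; the repository's actual class may differ in detail.

```python
import torch
import torch.nn as nn
from torchcrf import CRF  # pip install pytorch-crf


class BiLSTM_CRF(nn.Module):
    """Hypothetical reconstruction of the model described in this card."""

    def __init__(self, v_size, t_size, e_dim=300, h_dim=512, w_matrix=None):
        super().__init__()
        # GloVe-initialized embeddings, fine-tuned during training.
        self.embedding = nn.Embedding(v_size, e_dim)
        if w_matrix is not None:
            self.embedding.weight.data.copy_(w_matrix)
        self.dropout = nn.Dropout(0.5)
        # 2-layer bidirectional LSTM; reading "512 hidden units" as per direction.
        self.lstm = nn.LSTM(e_dim, h_dim, num_layers=2, bidirectional=True,
                            batch_first=True, dropout=0.5)
        # Project concatenated forward/backward states to per-tag emission scores.
        self.hidden2tag = nn.Linear(2 * h_dim, t_size)
        self.crf = CRF(t_size, batch_first=True)

    def _emissions(self, x):
        out, _ = self.lstm(self.dropout(self.embedding(x)))
        return self.hidden2tag(self.dropout(out))

    def forward(self, x, tags, mask=None):
        # Negative log-likelihood of the gold tag sequence under the CRF.
        return -self.crf(self._emissions(x), tags, mask=mask)

    def decode(self, x, mask=None):
        # Viterbi decoding of the most likely tag sequence.
        return self.crf.decode(self._emissions(x), mask=mask)
```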
## Project Assets
- GitHub Repository: [Learnrr/ontonotes5_ner_evaluation](https://github.com/Learnrr/ontonotes5_ner_evaluation)
| Asset | File | Description |
|---|---|---|
| Model Weights | `bilstm_crf_model.bin` | PyTorch state dictionary (~85.8 MB). |
| Vocabulary | `vocab.pth` | Pickled word-to-index mapping. |
| Label List | `label_list.pth` | Pickled NER tag list (BIO format). |
| Documentation | `README.md` | Model card and usage instructions. |
## Training Infrastructure
- Framework: PyTorch with DistributedDataParallel (DDP).
- Hardware: Multi-GPU (NVIDIA V100) setup with NCCL backend.
- Hyperparameters:
  - Optimizer: AdamW (`lr=1e-3`, `weight_decay=0.01`)
  - Scheduler: Linear warmup with decay (`warmup_ratio=0.1`), sketched below
  - Epochs: 20
  - Batch Size: 32 per GPU (effective batch size 64)
  - Max Length: 128 tokens
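The schedule is described only as "linear warmup with decay"; the helper below is a hedged reconstruction of it using PyTorch's `LambdaLR`. The helper name and the 500 steps-per-epoch figure are illustrative, not taken from the repository:

```python
import torch
from torch.optim.lr_scheduler import LambdaLR

def linear_warmup_with_decay(optimizer, warmup_steps, total_steps):
    # LR ramps linearly from 0 to the base LR over warmup_steps,
    # then decays linearly back to 0 over the remaining steps.
    def lr_lambda(step):
        if step < warmup_steps:
            return step / max(1, warmup_steps)
        return max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))
    return LambdaLR(optimizer, lr_lambda)

params = [torch.nn.Parameter(torch.zeros(1))]  # stand-in for model.parameters()
optimizer = torch.optim.AdamW(params, lr=1e-3, weight_decay=0.01)
total_steps = 20 * 500                         # 20 epochs x (hypothetical) 500 steps/epoch
scheduler = linear_warmup_with_decay(optimizer, int(0.1 * total_steps), total_steps)

for step in range(total_steps):
    optimizer.step()   # the actual DDP training step would go here
    scheduler.step()
```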
## Usage
```python
import torch

from model import BiLSTM_CRF  # ensure the class definition is importable

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# 1. Load the word-to-index vocabulary and the BIO tag list.
vocab = torch.load("vocab.pth")
label_list = torch.load("label_list.pth")

# 2. Initialize the model. The embedding matrix passed here is a placeholder;
#    the real embedding weights are restored from the checkpoint below.
model = BiLSTM_CRF(
    v_size=len(vocab),
    t_size=len(label_list),
    e_dim=300,
    h_dim=512,
    w_matrix=torch.zeros(len(vocab), 300),
)

# 3. Load the checkpoint (point this at the weights file you downloaded,
#    e.g. bilstm_crf_model.bin from the asset table above).
state_dict = torch.load("best_bilstm_crf_ddp.pth", map_location=device)

# Strip the 'module.' prefix that DistributedDataParallel adds to parameter keys.
new_state_dict = {k.replace("module.", ""): v for k, v in state_dict.items()}
model.load_state_dict(new_state_dict)
model.to(device).eval()
```
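From there, tagging a sentence might look like the following. This assumes whitespace tokenization, an `<unk>` fallback token in the vocabulary, and a `decode` method returning Viterbi tag indices as in the architecture sketch above; adjust to the repository's actual preprocessing:

```python
sentence = "Barack Obama visited Paris in 2015 ."
tokens = sentence.split()

# Map tokens to vocabulary ids; "<unk>" is an assumed fallback token.
unk_id = vocab.get("<unk>", 0)
ids = torch.tensor([[vocab.get(t, unk_id) for t in tokens]], device=device)

with torch.no_grad():
    tag_ids = model.decode(ids)[0]  # Viterbi-decoded tag indices

print(list(zip(tokens, [label_list[i] for i in tag_ids])))
```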