Multi-label Token Classification for NER
A collection of BERT models adapted for multi-label token classification, fine-tuned on multiple NER datasets.
Collection: jvaquet/multilabel-classification-bert

Using the NER pipeline is rather simple:
```python
from transformers import pipeline

pipe = pipeline(model='jvaquet/multilabel-classification-bert-ontonotes5',
                stride=128,
                threshold=0.5,
                use_hierarchy_heuristic=False,
                trust_remote_code=True)
entities = pipe(my_text)
```
The parameters are:
- `stride` (int): Stride for the tokenizer. When the text length exceeds `tokenizer.model_max_length`, the input is split into overlapping windows with the specified stride.
- `threshold` (float): Threshold for entity detection, applied to the sigmoid of the logits; predictions scoring above it are emitted as entities.
- `use_hierarchy_heuristic` (bool): Apply a heuristic that suppresses additional entities when entities of the same class overlap hierarchically.

Base model: google-bert/bert-large-cased
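Once the pipeline returns its entities, a common next step is to filter and group them. The sketch below assumes the output follows the schema of the standard `transformers` NER pipeline (a list of dicts with `entity_group`, `word`, `score`, `start`, and `end` keys); the sample entities and scores are purely illustrative, so verify the actual schema against the model card before relying on it.

```python
from collections import defaultdict

# Hypothetical pipeline output for a short sentence; in a multi-label setup
# the same span may appear under several labels (here "Hawaii" twice).
entities = [
    {"entity_group": "PERSON", "word": "Barack Obama", "score": 0.98, "start": 0, "end": 12},
    {"entity_group": "GPE", "word": "Hawaii", "score": 0.91, "start": 25, "end": 31},
    {"entity_group": "DATE", "word": "Hawaii", "score": 0.40, "start": 25, "end": 31},
]

def group_by_label(ents, min_score=0.5):
    """Bucket detected entity strings by label, dropping low-confidence hits."""
    grouped = defaultdict(list)
    for ent in ents:
        if ent["score"] >= min_score:
            grouped[ent["entity_group"]].append(ent["word"])
    return dict(grouped)

print(group_by_label(entities))
```

Note that this post-hoc `min_score` filter is separate from the pipeline's own `threshold` parameter, which is applied inside the model before entities are returned.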