---
library_name: transformers
tags:
- multilabel
- multilabel-token-classification
base_model:
- jvaquet/multilabel-classification-bert
pipeline_tag: token-classification
---

# Overview

- This is a BERT-based **multi-label token classification** model fine-tuned on the ACE2004 dataset.
- The entities are one-hot encoded using the BIES (Begin/Inside/End/Single) scheme. As this is a **multi-label** model, there is no "Outside" label; for tokens that would classically be tagged as outside, no class is predicted.
- The model comes with a pipeline that extracts named entities from the model predictions.
- For a short overview of the adaptations for **multi-label token classification**, see the non-finetuned parent model [`jvaquet/multilabel-classification-bert`](https://huggingface.co/jvaquet/multilabel-classification-bert).

# Pipeline Usage

Using the NER pipeline is rather simple:

```python
from transformers import pipeline

# trust_remote_code=True is needed because the pipeline code ships with the model repository
pipe = pipeline(model='jvaquet/multilabel-classification-bert-ace2004',
                stride=128,
                threshold=0.5,
                use_hierarchy_heuristic=False,
                trust_remote_code=True)

my_text = 'The president of France visited Berlin.'  # placeholder input, replace with your own text
entities = pipe(my_text)
```

The parameters are:

- `stride` - `int`: Stride for the tokenizer. When the text length exceeds `tokenizer.model_max_length`, the input is split into chunks using the specified stride.
- `threshold` - `float`: Threshold for entity detection. It is compared against the sigmoid of the logits.
- `use_hierarchy_heuristic` - `bool`: Apply a heuristic that suppresses additional entities when entities of the same class overlap hierarchically.
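
To illustrate how `threshold` interacts with the multi-label output, here is a minimal sketch of per-token thresholding on sigmoid scores. It is not the pipeline's actual code: the label names, the `active_labels` helper, the toy logits, and the strict `>` comparison are all assumptions made for illustration.

```python
import numpy as np

# Hypothetical label set: BIES tags for two illustrative entity classes.
# The real model's label set may differ.
LABELS = ["B-PER", "I-PER", "E-PER", "S-PER",
          "B-ORG", "I-ORG", "E-ORG", "S-ORG"]


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def active_labels(logits, threshold=0.5):
    """Return, per token, every label whose sigmoid score is above the threshold."""
    probs = sigmoid(np.asarray(logits, dtype=float))
    # Each label is decided independently, so a token can get zero, one,
    # or several labels (assumption: strict '>' comparison).
    return [[LABELS[j] for j in np.flatnonzero(row > threshold)] for row in probs]


# Toy logits for three tokens (made-up numbers, not model output).
toy_logits = [
    [-4.0, -4.0, -4.0, 3.0, 2.0, -4.0, -4.0, -4.0],    # two labels at once
    [-4.0, -4.0, -4.0, -4.0, -4.0, -4.0, 2.5, -4.0],   # one label
    [-4.0, -4.0, -4.0, -4.0, -4.0, -4.0, -4.0, -4.0],  # no label ("outside")
]

result = active_labels(toy_logits)
print(result)  # [['S-PER', 'B-ORG'], ['E-ORG'], []]
```

The third token receives an empty list, which is how a token without an "Outside" label ends up with no predicted class. Raising `threshold` makes each label harder to trigger, so fewer entities are reported.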