---
library_name: transformers
tags:
- multilabel
- multilabel-token-classification
base_model:
- jvaquet/multilabel-classification-bert
pipeline_tag: token-classification
---

# Overview
- This is a BERT-based **multi-label token classification** model fine-tuned on the ACE2004 dataset.
- The entities are one-hot encoded using the BIES (Begin/Inside/End/Single) scheme. As this is a **multi-label** model, there is no "Outside" label: for tokens that would classically be labeled "Outside", no class is predicted.
- The model comes with a pipeline to extract named entities from the model predictions.
- For a short overview of the adaptations for **multi-label token classification**, see the non-fine-tuned parent model [`jvaquet/multilabel-classification-bert`](https://huggingface.co/jvaquet/multilabel-classification-bert).
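
To illustrate the BIES encoding without an "Outside" label, here is a minimal sketch of how nested entity spans could be turned into per-token target vectors. The entity types, the label order, and the helper `encode_bies` are assumptions made for this example and are not taken from the model's configuration.

```python
# Hypothetical label inventory: two entity types times the four BIES positions.
# The real model's labels and their order may differ.
ENTITY_TYPES = ["PER", "GPE"]
LABELS = [f"{pos}-{etype}" for etype in ENTITY_TYPES for pos in "BIES"]
LABEL_TO_ID = {label: i for i, label in enumerate(LABELS)}


def encode_bies(num_tokens, spans):
    """Build one 0/1 target vector per token.

    spans: list of (start, end, entity_type) with inclusive token indices.
    Tokens outside every span keep an all-zero vector (no "Outside" label).
    """
    targets = [[0] * len(LABELS) for _ in range(num_tokens)]
    for start, end, etype in spans:
        if start == end:
            # Single-token entity
            targets[start][LABEL_TO_ID[f"S-{etype}"]] = 1
            continue
        targets[start][LABEL_TO_ID[f"B-{etype}"]] = 1
        for i in range(start + 1, end):
            targets[i][LABEL_TO_ID[f"I-{etype}"]] = 1
        targets[end][LABEL_TO_ID[f"E-{etype}"]] = 1
    return targets


tokens = ["the", "president", "of", "France", "spoke"]
# Nested entities: a PER span covering tokens 0..3 and a GPE span on token 3.
spans = [(0, 3, "PER"), (3, 3, "GPE")]
targets = encode_bies(len(tokens), spans)

for token, row in zip(tokens, targets):
    active = [label for label, flag in zip(LABELS, row) if flag]
    print(token, active)
```

In this sketch the token `France` carries two labels at once (`E-PER` and `S-GPE`), which is what a single-label scheme cannot express, while `spoke` has no active label at all.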

# Pipeline Usage
Using the NER pipeline is rather simple:
```python
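from transformers import pipeline

# trust_remote_code=True is needed because the pipeline code ships with the model repository
pipe = pipeline(
    model='jvaquet/multilabel-classification-bert-ace2004',
    stride=128,
    threshold=0.5,
    use_hierarchy_heuristic=False,
    trust_remote_code=True,
)

my_text = 'The president of France spoke in Paris.'  # any input string
entities = pipe(my_text)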
```
The parameters are:
- `stride` - `int`: Stride for the tokenizer. When the text length exceeds `tokenizer.model_max_length`, the input is split into multiple chunks using the specified stride.
- `threshold` - `float`: Threshold for entity detection, applied to the sigmoid of the logits.
- `use_hierarchy_heuristic` - `bool`: Apply a heuristic to suppress additional entities when entities of the same class overlap hierarchically.
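
As a rough illustration of the `threshold` parameter, the sketch below applies a sigmoid to each logit independently and keeps every label whose probability reaches the threshold. The helper `predict_labels`, the label names, the logit values, and the use of `>=` rather than `>` are assumptions for this example; the pipeline's actual implementation may differ.

```python
import math


def sigmoid(x):
    """Logistic function mapping a logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def predict_labels(logits, labels, threshold=0.5):
    """Return, per token, all labels whose sigmoid probability reaches the threshold.

    Each label is decided independently, so a token may get zero, one,
    or several labels.
    """
    return [
        [label for logit, label in zip(row, labels) if sigmoid(logit) >= threshold]
        for row in logits
    ]


# Hypothetical labels and per-token logits, for illustration only.
labels = ["B-PER", "E-PER", "S-GPE"]
logits = [
    [2.0, -1.0, -3.0],
    [-2.0, -0.5, -4.0],
    [0.3, 1.5, -0.1],
]

print(predict_labels(logits, labels, threshold=0.5))
print(predict_labels(logits, labels, threshold=0.6))
```

Raising the threshold makes the prediction more conservative: a label with a probability just above 0.5 is kept at `threshold=0.5` but dropped at `threshold=0.6`.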