---
library_name: transformers
tags:
- multilabel
- multilabel-token-classification
base_model:
- jvaquet/multilabel-classification-bert
pipeline_tag: token-classification
---

# Overview
- This is a BERT-based **multi-label token classification** model fine-tuned on the ACE2004 dataset.
- The entities are one-hot encoded using the BIES (Begin/Inside/End/Single) scheme. As this is a **multi-label** model, there is no "Outside" label; for tokens that would classically be tagged as outside, no class is predicted.
- The model comes with a pipeline to extract named entities from the model predictions.
- For a short overview of the adaptations for **multi-label token classification**, see the non-finetuned parent model [`jvaquet/multilabel-classification-bert`](https://huggingface.co/jvaquet/multilabel-classification-bert).
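
To make the BIES scheme without an "Outside" label more concrete, here is a minimal sketch of how nested entity spans could be turned into per-token label sets. The helper `encode_bies`, the label strings such as `B-PER`, and the example spans are illustrative assumptions; the model's actual label names and training-time encoding are not shown here.

```python
from typing import List, Set, Tuple


def encode_bies(num_tokens: int, spans: List[Tuple[int, int, str]]) -> List[Set[str]]:
    """Hypothetical helper: map (start, end, type) spans to per-token label sets.

    `end` is inclusive. A token may carry several labels at once (multi-label),
    and a token covered by no span simply keeps an empty set (no "Outside" label).
    """
    labels: List[Set[str]] = [set() for _ in range(num_tokens)]
    for start, end, etype in spans:
        if start == end:
            # Single-token entity
            labels[start].add(f"S-{etype}")
        else:
            labels[start].add(f"B-{etype}")
            for i in range(start + 1, end):
                labels[i].add(f"I-{etype}")
            labels[end].add(f"E-{etype}")
    return labels


# Illustrative example with a nested entity: "France" lies inside the PER span.
tokens = ["The", "president", "of", "France", "spoke"]
spans = [(0, 3, "PER"), (3, 3, "GPE")]
token_labels = encode_bies(len(tokens), spans)

for token, label_set in zip(tokens, token_labels):
    print(token, sorted(label_set))
```

In this sketch the token "France" receives two labels at once, which a single-label tagger could not express, and "spoke" receives none.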

# Pipeline Usage
Using the NER pipeline is rather simple:
```python
from transformers import pipeline

# trust_remote_code=True is needed because the model ships custom code
pipe = pipeline(model='jvaquet/multilabel-classification-bert-ace2004',
                stride=128,
                threshold=0.5,
                use_hierarchy_heuristic=False,
                trust_remote_code=True)

my_text = 'The president of France visited Berlin.'  # any input string
entities = pipe(my_text)
```
The parameters are:
- `stride` - `int`: Stride for the tokenizer. When the text length exceeds `tokenizer.model_max_length`, the input is split into chunks using the specified stride.
- `threshold` - `float`: Threshold for entity detection, applied to the sigmoid of the logits.
- `use_hierarchy_heuristic` - `bool`: Apply a heuristic that suppresses additional entities when entities of the same class overlap hierarchically.
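
The following is a rough sketch of what `stride` and `threshold` mean conceptually; it is not the pipeline's actual implementation. It assumes that `stride` is the number of tokens shared by consecutive chunks (as with the `stride` argument of Hugging Face tokenizers), ignores special tokens, and assumes a label counts as predicted when its sigmoid score is at least the threshold. The helper names, label strings, and logit values are made up for illustration.

```python
import math
from typing import Dict, List, Tuple


def window_bounds(num_tokens: int, max_len: int, stride: int) -> List[Tuple[int, int]]:
    """Hypothetical helper: (start, end) token ranges of overlapping chunks.

    Assumption: consecutive chunks share `stride` tokens, so each new chunk
    starts `max_len - stride` tokens after the previous one.
    """
    if not 0 <= stride < max_len:
        raise ValueError("stride must satisfy 0 <= stride < max_len")
    if num_tokens == 0:
        return []
    bounds = []
    start = 0
    while True:
        end = min(start + max_len, num_tokens)
        bounds.append((start, end))
        if end == num_tokens:
            break
        start += max_len - stride
    return bounds


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def labels_above_threshold(logits: Dict[str, float], threshold: float) -> List[str]:
    """Each label is decided independently; the result may be empty."""
    return [label for label, logit in logits.items() if sigmoid(logit) >= threshold]


# Toy numbers: 10 tokens, chunks of 4 tokens, 2 tokens of overlap.
chunks = window_bounds(num_tokens=10, max_len=4, stride=2)

# Made-up logits for a single token.
token_logits = {"B-PER": 2.0, "I-PER": -1.0, "E-GPE": 0.3, "S-ORG": -3.0}
predicted = labels_above_threshold(token_logits, threshold=0.5)

print(chunks)
print(predicted)
```

Because each label is thresholded on its own sigmoid score rather than via a softmax over all labels, a token can end up with several labels or with none, which is how the missing "Outside" class is handled.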