
	---
	library_name: transformers
	tags:
	- multilabel
	- multilabel-token-classification
	base_model:
	- jvaquet/multilabel-classification-bert
	pipeline_tag: token-classification
	---

	# Overview
	- This is a BERT-based multi-label token classification model fine-tuned on the ACE2004 dataset.
	- The entities are one-hot encoded using the BIES (Begin/Inside/End/Single) scheme. As this is a multi-label model, there is no "Outside" label; for tokens that would classically be tagged as outside, no class is predicted.
	- The model comes with a pipeline that extracts named entities from the model predictions.
	- For a short overview of the adaptations for multi-label token classification, see the non-fine-tuned parent model [`jvaquet/multilabel-classification-bert`](https://huggingface.co/jvaquet/multilabel-classification-bert).
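
To make the label scheme above concrete, here is a minimal sketch of how BIES targets without an "Outside" label could be built for nested entities. The label names, the `encode_bies` helper and the example spans are hypothetical illustrations, not taken from the model's configuration; the actual label inventory and ordering may differ.

```python
from typing import List, Tuple

# Hypothetical label inventory; the model's real label names may differ.
ENTITY_TYPES = ["PER", "GPE"]
LABELS = [f"{tag}-{etype}" for etype in ENTITY_TYPES for tag in "BIES"]
LABEL_INDEX = {label: i for i, label in enumerate(LABELS)}


def encode_bies(num_tokens: int, spans: List[Tuple[int, int, str]]) -> List[List[int]]:
    """Build one binary target vector per token.

    Each span is (start, end, entity_type) with an inclusive end index.
    Tokens that belong to no entity keep an all-zero vector, which plays
    the role of the missing "Outside" label.
    """
    targets = [[0] * len(LABELS) for _ in range(num_tokens)]
    for start, end, etype in spans:
        if start == end:
            # Single-token entity
            targets[start][LABEL_INDEX[f"S-{etype}"]] = 1
            continue
        targets[start][LABEL_INDEX[f"B-{etype}"]] = 1
        targets[end][LABEL_INDEX[f"E-{etype}"]] = 1
        for i in range(start + 1, end):
            targets[i][LABEL_INDEX[f"I-{etype}"]] = 1
    return targets


tokens = ["The", "president", "of", "France", "spoke"]
# Nested entities: "The president of France" (PER) contains "France" (GPE).
spans = [(0, 3, "PER"), (3, 3, "GPE")]
targets = encode_bies(len(tokens), spans)

# Names of the active labels per token, for readability.
active = [[LABELS[i] for i, v in enumerate(row) if v] for row in targets]
```

In this sketch the token "France" carries two active labels at once, and "spoke" carries none, which is the situation a single-label scheme with an "Outside" tag cannot represent.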

	# Pipeline Usage
	Using the NER pipeline is rather simple:
	```python
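	from transformers import pipeline

	# trust_remote_code=True is required because the pipeline code ships with the model repository.
	pipe = pipeline(
	    model='jvaquet/multilabel-classification-bert-ace2004',
	    stride=128,
	    threshold=0.5,
	    use_hierarchy_heuristic=False,
	    trust_remote_code=True,
	)

	# Placeholder input; replace with your own text.
	my_text = 'The president of France spoke in Paris.'
	entities = pipe(my_text)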
	```
	The parameters are:
	- `stride` - `int`: Stride for the tokenizer. When the text length exceeds `tokenizer.model_max_length`, the input is split into chunks using the specified stride.
	- `threshold` - `float`: Threshold for entity detection, applied to the sigmoid of the logits.
	- `use_hierarchy_heuristic` - `bool`: Apply a heuristic that suppresses additional entities when entities of the same class overlap hierarchically.
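
The following sketch illustrates what the `stride` and `threshold` parameters plausibly control. It is not the pipeline's actual implementation. It assumes that `stride` is the number of tokens shared by consecutive chunks (the convention used by Hugging Face tokenizers) and that a label counts as detected when its sigmoid score is greater than or equal to the threshold. The helpers `split_with_stride` and `predict_labels` are hypothetical names.

```python
import math
from typing import List, Sequence


def split_with_stride(token_ids: Sequence[int], max_length: int, stride: int) -> List[List[int]]:
    """Split a long sequence into chunks of at most max_length tokens.

    Assumption: consecutive chunks overlap by `stride` tokens.
    """
    if not 0 <= stride < max_length:
        raise ValueError("stride must be non-negative and smaller than max_length")
    step = max_length - stride
    chunks = []
    start = 0
    while True:
        chunks.append(list(token_ids[start:start + max_length]))
        if start + max_length >= len(token_ids):
            break
        start += step
    return chunks


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def predict_labels(logits: List[List[float]], labels: List[str], threshold: float = 0.5) -> List[List[str]]:
    """Return, per token, every label whose sigmoid score reaches the threshold.

    Each label is decided independently, so a token may receive several
    labels or none at all.
    """
    return [
        [label for label, logit in zip(labels, row) if sigmoid(logit) >= threshold]
        for row in logits
    ]


chunks = split_with_stride(list(range(10)), max_length=4, stride=2)
predicted = predict_labels([[2.0, -1.0], [-3.0, -2.0]], ["B-PER", "S-GPE"])
```

Under these assumptions, raising `threshold` trades recall for precision, and a larger `stride` gives tokens near chunk boundaries more surrounding context at the cost of more chunks to process.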