How to use marcosgg/bert-large-pt-ner-enamex with Transformers:
    # Use a pipeline as a high-level helper
    from transformers import pipeline

    pipe = pipeline("token-classification", model="marcosgg/bert-large-pt-ner-enamex")

    # Load model directly
    from transformers import AutoTokenizer, AutoModelForTokenClassification

    tokenizer = AutoTokenizer.from_pretrained("marcosgg/bert-large-pt-ner-enamex")
    model = AutoModelForTokenClassification.from_pretrained("marcosgg/bert-large-pt-ner-enamex")
Named Entity Recognition (NER) model for Portuguese
This is an NER model for Portuguese that uses the standard 'enamex' classes: LOC (geographical locations), PER (people), ORG (organizations), and MISC (other entities).
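As a rough illustration of how these classes appear in practice, the sketch below groups recognized entities by class. It assumes the token-classification pipeline is created with `aggregation_strategy="simple"`, which merges word pieces into whole entities and reports the class under the `entity_group` key. The helper names `group_by_class` and `tag` are hypothetical and not part of the model or the library, and the exact label strings returned depend on the model's configuration.

```python
from collections import defaultdict

MODEL_ID = "marcosgg/bert-large-pt-ner-enamex"


def group_by_class(entities):
    """Group aggregated pipeline output (a list of dicts) by entity class."""
    grouped = defaultdict(list)
    for ent in entities:
        # With an aggregation strategy set, each dict carries the class
        # under "entity_group" and the surface text under "word".
        grouped[ent["entity_group"]].append(ent["word"])
    return dict(grouped)


def tag(text):
    """Run the model on `text` and return entities grouped by class."""
    # Imported lazily so group_by_class works without transformers installed.
    from transformers import pipeline

    ner = pipeline(
        "token-classification",
        model=MODEL_ID,
        aggregation_strategy="simple",  # merge sub-word tokens into entities
    )
    return group_by_class(ner(text))
```

Calling `tag("Fernando Pessoa nasceu em Lisboa.")` downloads the model on first use and returns a dictionary keyed by class; the actual entities found depend on the model's predictions.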
The model is based on BERTimbau Large, which has been fine-tuned using a combination of available corpora (see [1] for details).
There is an alternative model trained using BERTimbau Base: bert-base-pt-ner-enamex.
This model was trained for 3 epochs with a batch size of 32 and a learning rate of 3e-5. On the test set it achieved a precision of 0.919, a recall of 0.925, and an F1 of 0.922.
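The reported F1 can be checked against the precision and recall, assuming it is the usual harmonic mean of the two computed over the same set of entities. The function name `f1_score` below is only illustrative.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values reported for the test set above.
print(round(f1_score(0.919, 0.925), 3))  # prints 0.922
```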
[1] Pablo Gamallo, Marcos Garcia & Patricia Martín-Rodilla, 2019. NER and open information extraction for Portuguese: notebook for IberLEF 2019 Portuguese named entity recognition and relation extraction tasks. In Proceedings of the Iberian Languages Evaluation Forum (IberLEF 2019), co-located with the 35th Conference of the Spanish Society for Natural Language Processing (SEPLN 2019), pp. 457-467.