---
language:
- et
base_model:
- EMBEDDIA/est-roberta
pipeline_tag: token-classification
library_name: transformers
tags:
- NER
license: cc-by-4.0
---
# est-roberta-ud-ner
<!-- Provide a quick summary of what the model is/does. -->
### Model Description
<!-- Provide a longer summary of what this model is. -->
est-roberta-ud-ner is a model based on [Est-RoBERTa](https://huggingface.co/EMBEDDIA/est-roberta) and fine-tuned for named entity recognition (NER) in Estonian on the Universal Dependencies [EDT](https://github.com/UniversalDependencies/UD_Estonian-EDT) and [EWT](https://github.com/UniversalDependencies/UD_Estonian-EWT) treebanks.
### How to use
The model can be used with the Transformers `pipeline` for NER. Try it in Google Colab, where the Transformers library is pre-installed, or on your local machine (preferably in a virtual environment; see the setup section below), where you can install the library with `pip install transformers`.
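```
from transformers import pipeline

# Load the fine-tuned model into a token-classification (NER) pipeline.
ner = pipeline("ner", model="vbius01/est-roberta-ud-ner")

text = "Eesti kuulub erinevalt Lätist ja Leedust kahtlemata Põhjamaade kultuuriruumi."

# Each result is a dict with the predicted label, score, and character offsets.
results = ner(text)
print(results)
```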
```
[{'entity': 'B-GEP', 'score': np.float32(0.99339926), 'index': 1, 'word': '▁Eesti', 'start': 0, 'end': 5}, {'entity': 'B-GEP', 'score': np.float32(0.9923631), 'index': 4, 'word': '▁Lätist', 'start': 22, 'end': 29}, {'entity': 'B-GEP', 'score': np.float32(0.990756), 'index': 6, 'word': '▁Leedust', 'start': 32, 'end': 40}, {'entity': 'B-LOC', 'score': np.float32(0.61792), 'index': 8, 'word': '▁Põhjamaade', 'start': 51, 'end': 62}]
```
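The token-level output above can be post-processed into a more readable list. The following is a minimal sketch, not part of the model's own tooling: `extract_entities` and its `min_score` parameter are hypothetical names introduced here for illustration, and the `results` list is copied from the output above with scores rounded to plain floats so the snippet runs without downloading the model. In that output the `start` offsets include the space preceding a word, so the sketch strips whitespace from each slice.

```python
# Input sentence and pipeline output copied from the example above
# (scores rounded and converted to plain floats for brevity).
text = "Eesti kuulub erinevalt Lätist ja Leedust kahtlemata Põhjamaade kultuuriruumi."
results = [
    {"entity": "B-GEP", "score": 0.9934, "index": 1, "word": "▁Eesti", "start": 0, "end": 5},
    {"entity": "B-GEP", "score": 0.9924, "index": 4, "word": "▁Lätist", "start": 22, "end": 29},
    {"entity": "B-GEP", "score": 0.9908, "index": 6, "word": "▁Leedust", "start": 32, "end": 40},
    {"entity": "B-LOC", "score": 0.6179, "index": 8, "word": "▁Põhjamaade", "start": 51, "end": 62},
]


def extract_entities(text, results, min_score=0.0):
    """Return (surface form, label, score) tuples for each tagged token.

    Hypothetical helper: it does not merge multi-token entities, it only
    maps each tagged token back to the original text.
    """
    entities = []
    for r in results:
        if r["score"] < min_score:
            continue  # drop low-confidence predictions
        # Slice the original text; strip the leading space the offsets may include.
        surface = text[r["start"]:r["end"]].strip()
        # Drop the "B-" / "I-" prefix to keep only the entity type.
        label = r["entity"].split("-", 1)[-1]
        entities.append((surface, label, r["score"]))
    return entities


# Keep only predictions with a score of at least 0.9 (an arbitrary threshold).
for surface, label, score in extract_entities(text, results, min_score=0.9):
    print(f"{surface}\t{label}\t{score:.3f}")
```

If merged entity spans are preferred over per-token results, the Transformers pipeline also accepts an `aggregation_strategy` argument (for example `"simple"`); in that case the label is returned under the `entity_group` key instead of `entity`.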
<!-- Provide the basic links for the model. -->
- **Repository:** [github.com/martinkivisikk/ner_thesis](https://github.com/martinkivisikk/ner_thesis)
- **Paper:** [Developing a NER Model Based on Treebank Corpora]()
### Virtual environment setup
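Create and activate a virtual environment in your project directory with `venv`. The commands below are for Linux and macOS; on Windows, activate the environment with `.env\Scripts\activate` instead.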
```
python -m venv .env
source .env/bin/activate
```
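To confirm that the environment is active before running the model, a small check like the one below may help. It is a sketch using only the Python standard library; the function names `in_virtualenv` and `transformers_installed` are hypothetical helpers introduced here, not part of this repository.

```python
import importlib.util
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a venv environment."""
    # Inside a venv, sys.prefix points at the environment directory, while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


def transformers_installed() -> bool:
    """Return True when this interpreter can locate the transformers package."""
    return importlib.util.find_spec("transformers") is not None


print("virtual environment active:", in_virtualenv())
print("transformers installed:", transformers_installed())
```

If the second line reports `False`, run `pip install transformers` inside the activated environment.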
## Uses
<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
This model can be used to find named entities in Estonian texts.