sdocio/es_trf_ner_cds_bne-base

Token Classification

roberta-base-bne

sdocio committed on Dec 31, 2022

Commit 6fa208c · 1 Parent(s): 0913595

Add README.md

Files changed (1)

README.md +72 -0

@@ -1,3 +1,75 @@
 ---
 license: gpl-3.0
 ---

 ---
+language: es
 license: gpl-3.0
+tags:
+- PyTorch
+- Transformers
+- Token Classification
+- roberta
+- roberta-base-bne
+widget:
+- text: "Fue antes de llegar a Sigüeiro, en el Camino de Santiago."
+- text: "El proyecto lo financia el Ministerio de Industria y Competitividad."
+model-index:
+- name: roberta-bne-ner-cds
+  results: []
 ---
+# Introduction
+This model is a fine-tuned version of [roberta-base-bne](https://huggingface.co/PlanTL-GOB-ES/roberta-base-bne) for named-entity recognition (NER) in the tourism domain, specifically texts about the Way of St. James (Camino de Santiago). It recognizes four entity types: location (LOC), organization (ORG), person (PER), and miscellaneous (MISC).
+## Usage
+You can use this model with the Transformers *pipeline* for NER.
+```python
+from transformers import AutoTokenizer, AutoModelForTokenClassification, pipeline
+tokenizer = AutoTokenizer.from_pretrained("sdocio/es_trf_ner_cds_bne-base")
+model = AutoModelForTokenClassification.from_pretrained("sdocio/es_trf_ner_cds_bne-base")
+example = "Fue antes de llegar a Sigüeiro, en el Camino de Santiago. El proyecto lo financia el Ministerio de Industria y Competitividad."
+ner_pipe = pipeline('ner', model=model, tokenizer=tokenizer, aggregation_strategy="simple")
+for ent in ner_pipe(example):
+    print(ent)
+```
+```
+{'entity_group': 'LOC', 'score': 0.99795026, 'word': ' Sigüeiro', 'start': 22, 'end': 30}
+{'entity_group': 'LOC', 'score': 0.997823, 'word': ' Camino de Santiago', 'start': 38, 'end': 56}
+{'entity_group': 'ORG', 'score': 0.98481846, 'word': ' Ministerio de Industria y Competitividad', 'start': 85, 'end': 125}
+```
+## Model performance
+entity|precision|recall|f1
+-|-|-|-
+PER|0.965|0.924|0.944
+ORG|0.900|0.701|0.788
+LOC|0.982|0.985|0.983
+MISC|0.798|0.874|0.834
+micro avg|0.964|0.968|0.966
+macro avg|0.911|0.871|0.887
+weighted avg|0.965|0.968|0.966
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 5e-05
+- train_batch_size: 32
+- eval_batch_size: 8
+- seed: 42
+- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+- lr_scheduler_type: linear
+- num_epochs: 3.0
+### Framework versions
+- Transformers 4.25.1
+- PyTorch 1.13.0+cu117
+- Datasets 2.7.1
+- Tokenizers 0.13.2
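
As a quick sanity check on the model performance table, the macro avg row can be reproduced as the unweighted mean of the four per-entity rows. The sketch below is not part of the original README: the names `per_entity`, `reported_macro`, and `macro_average` are illustrative, and small differences are expected because the per-entity values in the table are already rounded to three decimals.

```python
# Per-entity (precision, recall, f1) copied from the "Model performance" table.
per_entity = {
    "PER": (0.965, 0.924, 0.944),
    "ORG": (0.900, 0.701, 0.788),
    "LOC": (0.982, 0.985, 0.983),
    "MISC": (0.798, 0.874, 0.834),
}

# The "macro avg" row as reported in the table.
reported_macro = (0.911, 0.871, 0.887)


def macro_average(scores):
    """Unweighted mean of each metric column across entity types."""
    columns = zip(*scores.values())
    return tuple(sum(column) / len(scores) for column in columns)


macro = macro_average(per_entity)

for name, computed, reported in zip(("precision", "recall", "f1"), macro, reported_macro):
    print(f"{name}: computed={computed:.4f} reported={reported:.3f}")
```

The macro average gives every entity type equal weight, which is why it sits well below the micro and weighted averages here: the weaker ORG and MISC rows count as much as LOC and PER.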