NER for Spanish Emergency Reports (ECU-911)

Author/Maintainer: Danny Paltin (@dannyLeo16)
Task: Token Classification (NER)
Language: Spanish (es)
Finetuned from: dccuchile/bert-base-spanish-wwm-cased
Entities (BIO): PER and LOC (label set: O, B-PER, I-PER, B-LOC, I-LOC)

This model is a Spanish BERT fine-tuned to identify persons and locations in short emergency incident descriptions (ECU-911-style). It was developed for the research project:

“Representación del conocimiento para emergencias del ECU-911 mediante PLN, ontologías OWL y reglas SWRL” (“Knowledge representation for ECU-911 emergencies using NLP, OWL ontologies, and SWRL rules”).


Model Details

  • Architecture: BERT (Whole Word Masking, cased)
  • Tokenizer: dccuchile/bert-base-spanish-wwm-cased
  • Max length: uses base tokenizer model_max_length (padding to max length)
  • Libraries: 🤗 Transformers, 🤗 Datasets, PyTorch
  • Labels: O, B-PER, I-PER, B-LOC, I-LOC
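The label inventory above maps to integer ids in the model config. A minimal sketch of the id2label/label2id mappings, assuming the ids follow the order of the list above (the exact ordering in the released config.json is an assumption):

```python
# BIO label set used by this model (id ordering assumed, not verified
# against the released config.json)
labels = ["O", "B-PER", "I-PER", "B-LOC", "I-LOC"]

# Mappings as they would appear in the model's config
id2label = {i: label for i, label in enumerate(labels)}
label2id = {label: i for i, label in enumerate(labels)}
```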

Training Data

  • Source: Custom Spanish emergency reports (Ecuador, ECU-911-style) with token-level BIO annotations.
  • Size: 510 texts; 34,232 tokens (avg 67.12 tokens/text).
  • Entity counts (BIO spans): PER = 421, LOC = 1,643.
  • Token-level label distribution: O=30,132, B-LOC=1,643, I-LOC=1,617, B-PER=421, I-PER=419.
  • Splits: 80% train / 10% validation / 10% test (random split during training).
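The span counts above follow from the BIO scheme: each B- tag opens exactly one entity span, so spans per type can be counted directly from the tags. A minimal illustration with an invented toy sentence (not from the dataset):

```python
from collections import Counter

# Toy BIO tag sequence (invented example, not from the training data)
tags = ["O", "O", "O", "B-LOC", "I-LOC", "I-LOC", "O", "B-PER", "I-PER"]

# Each B- tag opens one entity span; strip the "B-" prefix to get the type
span_counts = Counter(tag[2:] for tag in tags if tag.startswith("B-"))
print(span_counts)  # one LOC span, one PER span
```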

Privacy/Ethics. Data should be anonymized and free of PII. Do not deploy on personal/live data without consent and compliance with local regulations.


Training Procedure

  • Objective: Token classification (cross-entropy); continuation subwords ignored with -100.
  • Hyperparameters:
    • learning_rate = 2e-5
    • num_train_epochs = 3
    • per_device_train_batch_size = 8
    • per_device_eval_batch_size = 8
    • weight_decay = 0.01
    • evaluation_strategy = "epoch", save_strategy = "epoch"
    • load_best_model_at_end = true (selected by eval_loss)
  • Data collator: DataCollatorForTokenClassification (padding to max_length)
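The -100 masking of continuation subwords can be sketched with a small helper over the word_ids() output of a fast tokenizer (the function name and the example word-id sequence are illustrative, not the project's actual code):

```python
def align_labels(word_ids, word_labels):
    """Map word-level labels to subword positions.

    Special tokens (word_id None) and continuation subwords get -100,
    so the cross-entropy loss ignores them.
    """
    aligned, previous = [], None
    for word_id in word_ids:
        if word_id is None:            # [CLS], [SEP], padding
            aligned.append(-100)
        elif word_id != previous:      # first subword of a word
            aligned.append(word_labels[word_id])
        else:                          # continuation subword
            aligned.append(-100)
        previous = word_id
    return aligned

# Word 1 split into two subwords -> the second piece is masked
word_ids = [None, 0, 1, 1, 2, None]
print(align_labels(word_ids, [3, 4, 0]))  # [-100, 3, 4, -100, 0, -100]
```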

Evaluation

Validation (epoch 3):

  • Accuracy: 0.9480
  • Macro F1: 0.7998
  • Macro Precision: 0.7914
  • Macro Recall: 0.8118
  • Eval loss: 0.1458

Test:

  • Accuracy: 0.9740
  • Macro F1: 0.8899
  • Macro Precision: 0.8802
  • Macro Recall: 0.9002
  • Eval loss: 0.0834

(Computed with sklearn.metrics, excluding -100 positions.)
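The exclusion of -100 positions before scoring can be sketched as follows; the helper name is illustrative, and the filtered lists would then be passed to sklearn.metrics for the macro precision/recall/F1 reported above:

```python
def drop_ignored(true_ids, pred_ids, ignore_index=-100):
    """Keep only positions whose gold label is not the ignore index."""
    pairs = [(t, p) for t, p in zip(true_ids, pred_ids) if t != ignore_index]
    true_kept = [t for t, _ in pairs]
    pred_kept = [p for _, p in pairs]
    return true_kept, pred_kept

# Invented toy predictions: two masked positions, one wrong label
true_ids = [-100, 0, 3, 4, -100, 0]
pred_ids = [0, 0, 3, 0, 2, 0]
y_true, y_pred = drop_ignored(true_ids, pred_ids)

# Token accuracy over retained positions; macro metrics come from
# sklearn.metrics on (y_true, y_pred) in the same way
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(y_true, y_pred, accuracy)
```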


Intended Use

  • NER over Spanish emergency/incident text (ECU-911-like).
  • Downstream knowledge representation (OWL/SWRL).
  • Academic research and prototyping.

Limitations

  • Domain-specific; performance may drop on other domains.
  • Only PER and LOC entities.
  • May struggle with colloquialisms, misspellings, or code-switching.

How to use

from transformers import pipeline

# Load the fine-tuned model and its tokenizer from the Hub;
# aggregation_strategy="simple" merges subword pieces into entity spans
ner = pipeline(
    "token-classification",
    model="dannyLeo16/ner_model_bert_base",
    tokenizer="dannyLeo16/ner_model_bert_base",
    aggregation_strategy="simple",
)

text = "Se reporta accidente en la Av. de las Américas con dos personas heridas."
print(ner(text))
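With aggregation_strategy="simple", the pipeline returns one dict per entity span with entity_group, score, word, start, and end keys. A small sketch of collapsing that output to (type, surface form) pairs, e.g. for ontology population; the sample result below is invented for illustration, not actual model output:

```python
# Illustrative pipeline output shape; score and offsets are invented
sample = [
    {"entity_group": "LOC", "score": 0.98, "word": "Av. de las Américas",
     "start": 27, "end": 46},
]

# Collapse to (type, surface form) pairs for downstream OWL/SWRL steps
entities = [(e["entity_group"], e["word"]) for e in sample]
print(entities)
```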