NER for Spanish Emergency Reports (ECU-911)

Author/Maintainer: Danny Paltin (@dannyLeo16)
Task: Token Classification (NER)
Language: Spanish (es)
Finetuned from: dccuchile/bert-base-spanish-wwm-cased
Entities (BIO): PER and LOC (label set: O, B-PER, I-PER, B-LOC, I-LOC)

This model is a Spanish BERT fine-tuned to identify persons and locations in short emergency incident descriptions (ECU-911-style). It was developed for the research project:

“Representación del conocimiento para emergencias del ECU-911 mediante PLN, ontologías OWL y reglas SWRL” (“Knowledge representation for ECU-911 emergencies using NLP, OWL ontologies, and SWRL rules”).


Model Details

  • Architecture: BERT (Whole Word Masking, cased)
  • Tokenizer: dccuchile/bert-base-spanish-wwm-cased
  • Max length: uses base tokenizer model_max_length (padding to max length)
  • Libraries: 🤗 Transformers, 🤗 Datasets, PyTorch
  • Labels: O, B-PER, I-PER, B-LOC, I-LOC
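The label inventory above maps to integer ids in the model config. A minimal sketch of the id2label/label2id mappings, assuming the ids follow the order of the list above (the exact ordering in the released config.json is an assumption):

```python
# BIO label set used by this model (id ordering assumed, not verified
# against the released config.json)
labels = ["O", "B-PER", "I-PER", "B-LOC", "I-LOC"]

# Mappings as they would appear in the model's config
id2label = {i: label for i, label in enumerate(labels)}
label2id = {label: i for i, label in enumerate(labels)}
```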

Training Data

  • Source: Custom Spanish emergency reports (Ecuador, ECU-911-style) with token-level BIO annotations.
  • Size: 510 texts; 34,232 tokens (avg 67.12 tokens/text).
  • Entity counts (BIO spans): PER = 421, LOC = 1,643.
  • Token-level label distribution: O=30,132, B-LOC=1,643, I-LOC=1,617, B-PER=421, I-PER=419.
  • Splits: 80% train / 10% validation / 10% test (random split during training).
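The span counts above follow from the BIO scheme: each B- tag opens exactly one entity span, so spans per type can be counted directly from the tags. A minimal illustration with an invented toy sentence (not from the dataset):

```python
from collections import Counter

# Toy BIO tag sequence (invented example, not from the training data)
tags = ["O", "O", "O", "B-LOC", "I-LOC", "I-LOC", "O", "B-PER", "I-PER"]

# Each B- tag opens one entity span; strip the "B-" prefix to get the type
span_counts = Counter(tag[2:] for tag in tags if tag.startswith("B-"))
print(span_counts)  # one LOC span, one PER span
```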

Privacy/Ethics. Data should be anonymized and free of PII. Do not deploy on personal/live data without consent and compliance with local regulations.


Training Procedure

  • Objective: Token classification (cross-entropy); continuation subwords ignored with -100.
  • Hyperparameters:
    • learning_rate = 2e-5
    • num_train_epochs = 3
    • per_device_train_batch_size = 8
    • per_device_eval_batch_size = 8
    • weight_decay = 0.01
    • evaluation_strategy = "epoch", save_strategy = "epoch"
    • load_best_model_at_end = true (selected by eval_loss)
  • Data collator: DataCollatorForTokenClassification (padding to max_length)
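The -100 masking of continuation subwords can be sketched with a small helper over the word_ids() output of a fast tokenizer (the function name and the example word-id sequence are illustrative, not the project's actual code):

```python
def align_labels(word_ids, word_labels):
    """Map word-level labels to subword positions.

    Special tokens (word_id None) and continuation subwords get -100,
    so the cross-entropy loss ignores them.
    """
    aligned, previous = [], None
    for word_id in word_ids:
        if word_id is None:            # [CLS], [SEP], padding
            aligned.append(-100)
        elif word_id != previous:      # first subword of a word
            aligned.append(word_labels[word_id])
        else:                          # continuation subword
            aligned.append(-100)
        previous = word_id
    return aligned

# Word 1 split into two subwords -> the second piece is masked
word_ids = [None, 0, 1, 1, 2, None]
print(align_labels(word_ids, [3, 4, 0]))  # [-100, 3, 4, -100, 0, -100]
```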

Evaluation

Validation (epoch 3):

  • Accuracy: 0.9480
  • Macro F1: 0.7998
  • Macro Precision: 0.7914
  • Macro Recall: 0.8118
  • Eval loss: 0.1458

Test:

  • Accuracy: 0.9740
  • Macro F1: 0.8899
  • Macro Precision: 0.8802
  • Macro Recall: 0.9002
  • Eval loss: 0.0834

(Computed with sklearn.metrics, excluding -100 positions.)
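The exclusion of -100 positions before scoring can be sketched as follows; the helper name is illustrative, and the filtered lists would then be passed to sklearn.metrics for the macro precision/recall/F1 reported above:

```python
def drop_ignored(true_ids, pred_ids, ignore_index=-100):
    """Keep only positions whose gold label is not the ignore index."""
    pairs = [(t, p) for t, p in zip(true_ids, pred_ids) if t != ignore_index]
    true_kept = [t for t, _ in pairs]
    pred_kept = [p for _, p in pairs]
    return true_kept, pred_kept

# Invented toy predictions: two masked positions, one wrong label
true_ids = [-100, 0, 3, 4, -100, 0]
pred_ids = [0, 0, 3, 0, 2, 0]
y_true, y_pred = drop_ignored(true_ids, pred_ids)

# Token accuracy over retained positions; macro metrics come from
# sklearn.metrics on (y_true, y_pred) in the same way
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(y_true, y_pred, accuracy)
```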


Intended Use

  • NER over Spanish emergency/incident text (ECU-911-like).
  • Downstream knowledge representation (OWL/SWRL).
  • Academic research and prototyping.

Limitations

  • Domain-specific; performance may drop on other domains.
  • Only PER and LOC entities.
  • May struggle with colloquialisms, misspellings, or code-switching.

How to use

from transformers import pipeline

# Load the fine-tuned model and its tokenizer from the Hub;
# aggregation_strategy="simple" merges subword pieces into entity spans
ner = pipeline(
    "token-classification",
    model="dannyLeo16/ner_model_bert_base",
    tokenizer="dannyLeo16/ner_model_bert_base",
    aggregation_strategy="simple",
)

text = "Se reporta accidente en la Av. de las Américas con dos personas heridas."
print(ner(text))
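With aggregation_strategy="simple", the pipeline returns one dict per entity span with entity_group, score, word, start, and end keys. A small sketch of collapsing that output to (type, surface form) pairs, e.g. for ontology population; the sample result below is invented for illustration, not actual model output:

```python
# Illustrative pipeline output shape; score and offsets are invented
sample = [
    {"entity_group": "LOC", "score": 0.98, "word": "Av. de las Américas",
     "start": 27, "end": 46},
]

# Collapse to (type, surface form) pairs for downstream OWL/SWRL steps
entities = [(e["entity_group"], e["word"]) for e in sample]
print(entities)
```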