c5k-deberta_base-token_level-1-2

Token-level GLiNER checkpoint fine-tuned for Brazilian Portuguese PII detection. Base model: microsoft/deberta-v3-base.

Results

Evaluated via the scripts in gliner-onnx-benchmark on the following holdouts (Strict = exact span, Partial = overlap).

Holdout	N	Strict F1	Strict P	Strict R	Partial F1
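
To make the two scoring modes concrete, here is a minimal sketch of how Strict and Partial precision, recall, and F1 could be computed over character spans. It assumes that Partial counts any same-label overlap as a hit; the scripts in gliner-onnx-benchmark may define it differently (for example, with fractional credit), and the `Span` type and `prf` function are hypothetical names, not part of that repo.

```python
from typing import List, Tuple

# (start, end, label) with an exclusive end offset. Hypothetical type for this sketch.
Span = Tuple[int, int, str]


def _overlaps(gold: Span, pred: Span) -> bool:
    """True when both spans share a label and at least one character."""
    return gold[2] == pred[2] and gold[0] < pred[1] and pred[0] < gold[1]


def prf(gold: List[Span], pred: List[Span], mode: str = "strict") -> Tuple[float, float, float]:
    """Return (precision, recall, F1) under strict or partial matching.

    strict:  a prediction counts only if start, end, and label all match.
    partial: a prediction counts if it overlaps a gold span with the same label
             (an assumption of this sketch).
    """
    if mode == "strict":
        def match(g: Span, p: Span) -> bool:
            return g == p
    else:
        match = _overlaps

    hit_pred = sum(any(match(g, p) for g in gold) for p in pred)
    hit_gold = sum(any(match(g, p) for p in pred) for g in gold)
    precision = hit_pred / len(pred) if pred else 0.0
    recall = hit_gold / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1
```

A prediction that clips the first characters of a name is a miss under strict matching but a hit under partial matching, so Partial F1 is never lower than Strict F1 under this definition.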

Intended use

Brazilian Portuguese PII detection in legal, medical, and administrative text. Supports any GLiNER-compatible label set (CPF, RG, email, phone, person name, address, etc.).
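
A minimal usage sketch follows, assuming the checkpoint loads through the `gliner` Python package's `GLiNER.from_pretrained` and `predict_entities` calls. The label strings below are illustrative guesses based on the categories listed above, not the confirmed strings used in training, and `load_model` and `detect_pii` are hypothetical helper names.

```python
from typing import Any, Dict, List

# Repo id taken from the citation URL on this card.
MODEL_ID = "arthrod/c5k-deberta_base-token_level-1-2"

# Assumed label strings; the card names the categories but not their exact spelling.
LABELS = ["cpf", "rg", "email", "phone", "person name", "address"]


def load_model(model_id: str = MODEL_ID) -> Any:
    """Load the checkpoint. The import is lazy so this file runs without gliner installed."""
    from gliner import GLiNER  # pip install gliner

    return GLiNER.from_pretrained(model_id)


def detect_pii(
    model: Any,
    text: str,
    labels: List[str] = LABELS,
    threshold: float = 0.5,
) -> List[Dict[str, Any]]:
    """Run the model on the raw text and return entities sorted by start offset.

    Each entity is expected to be a dict with at least "start", "end",
    "label", and "score" keys, as returned by predict_entities.
    """
    entities = model.predict_entities(text, labels, threshold=threshold)
    return sorted(entities, key=lambda e: (e["start"], e["end"]))


# Example (downloads the checkpoint on first use):
#   model = load_model()
#   for ent in detect_pii(model, "Contato: ana@example.com"):
#       print(ent["label"], ent["start"], ent["end"], round(ent["score"], 3))
```

The threshold of 0.5 is only a starting point; a lower value trades precision for recall, which may suit redaction workflows where a missed identifier is costlier than a false alarm.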

Limitations

Trained primarily on Portuguese text; English/Spanish performance is not guaranteed.
Span boundaries depend on tokenization at inference time. Follow the inference rule in the repo: never reconstruct raw text from tokens.
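
One way to respect that rule is to treat predicted spans as character offsets into the original string and slice that string directly, rather than joining tokens back together. The sketch below is one reading of the rule, not a copy of the repo's implementation; `redact` is a hypothetical helper, and it assumes each entity is a dict with `start`, `end` (exclusive), and `label` keys.

```python
from typing import Any, Dict, List


def redact(text: str, entities: List[Dict[str, Any]]) -> str:
    """Replace each predicted span with a [LABEL] placeholder.

    All slicing happens on the original string, so whitespace and punctuation
    outside the spans are preserved exactly as written.
    """
    pieces: List[str] = []
    cursor = 0
    for ent in sorted(entities, key=lambda e: (e["start"], e["end"])):
        if ent["start"] < cursor:
            continue  # skip a span that overlaps one already redacted
        pieces.append(text[cursor:ent["start"]])
        pieces.append(f"[{ent['label'].upper()}]")
        cursor = ent["end"]
    pieces.append(text[cursor:])
    return "".join(pieces)
```

Joining tokens with single spaces would silently change strings such as `123.456.789-09` or collapse double spaces, after which the offsets no longer point at the right characters.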

Citation

@misc{arthrod_gliner_ptbr_pii,
  author = {arthrod},
  title  = {GLiNER Portuguese PII checkpoints},
  year   = {2026},
  url    = {https://huggingface.co/arthrod/c5k-deberta_base-token_level-1-2}
}
