c5k-deberta_base-token_level-1-2

Token-level GLiNER checkpoint fine-tuned for Brazilian Portuguese PII detection. Base model: microsoft/deberta-v3-base.

Results

Evaluated via the scripts in gliner-onnx-benchmark on the following holdouts (Strict = exact span, Partial = overlap).

Holdout N Strict F1 Strict P Strict R Partial F1
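The two matching modes can be illustrated with a small sketch. This is not the code from gliner-onnx-benchmark; the `Span` tuple, the greedy one-to-one pairing, and the requirement that a partial match also share the label are all assumptions made for illustration.

```python
from typing import Callable, List, Tuple

# Hypothetical span representation: (start_char, end_char_exclusive, label).
Span = Tuple[int, int, str]


def strict_match(pred: Span, gold: Span) -> bool:
    """Strict: boundaries and label must be identical."""
    return pred == gold


def partial_match(pred: Span, gold: Span) -> bool:
    """Partial: same label (assumed) and at least one shared character."""
    same_label = pred[2] == gold[2]
    overlaps = pred[0] < gold[1] and gold[0] < pred[1]
    return same_label and overlaps


def precision_recall_f1(
    preds: List[Span],
    golds: List[Span],
    match: Callable[[Span, Span], bool],
) -> Tuple[float, float, float]:
    """Greedy one-to-one pairing of predictions with gold spans."""
    unmatched = list(golds)
    true_positives = 0
    for pred in preds:
        for gold in unmatched:
            if match(pred, gold):
                unmatched.remove(gold)  # each gold span is used at most once
                true_positives += 1
                break
    precision = true_positives / len(preds) if preds else 0.0
    recall = true_positives / len(golds) if golds else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy data: the second prediction clips the first two characters of the name.
gold_spans = [(0, 14, "cpf"), (20, 30, "person name")]
pred_spans = [(0, 14, "cpf"), (22, 30, "person name")]

strict_scores = precision_recall_f1(pred_spans, gold_spans, strict_match)
partial_scores = precision_recall_f1(pred_spans, gold_spans, partial_match)
```

In this toy case the clipped name counts as a miss under strict matching and as a hit under partial matching, which is why Partial F1 is never lower than Strict F1 for the same predictions.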

Intended use

Brazilian Portuguese PII detection in legal, medical, and administrative text. Supports any GLiNER-compatible label set (CPF, RG, email, phone, person name, address, etc.).
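A minimal usage sketch follows, assuming the `gliner` Python package is installed and that this checkpoint loads through `GLiNER.from_pretrained`. The label strings, the threshold, and the helper names `load_model` and `detect_pii` are illustrative choices, not values documented for this checkpoint.

```python
from typing import Any, Dict, List, Sequence

REPO_ID = "arthrod/c5k-deberta_base-token_level-1-2"

# Illustrative label set; adjust the wording to match your own data.
DEFAULT_LABELS = ["cpf", "rg", "email", "phone", "person name", "address"]


def load_model(repo_id: str = REPO_ID) -> Any:
    """Load the checkpoint. Imported lazily so this module has no hard dependency."""
    from gliner import GLiNER  # requires `pip install gliner`

    return GLiNER.from_pretrained(repo_id)


def detect_pii(
    model: Any,
    text: str,
    labels: Sequence[str] = DEFAULT_LABELS,
    threshold: float = 0.5,
) -> List[Dict[str, Any]]:
    """Return the entities predicted for `text`.

    GLiNER returns one dict per entity with "start", "end", "text",
    "label" and "score" keys; start/end are character offsets into `text`.
    """
    return model.predict_entities(text, list(labels), threshold=threshold)
```

Typical use would be `model = load_model()` followed by `detect_pii(model, "O CPF de Maria Silva é 123.456.789-09.")`. The first call downloads the weights from the Hub.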

Limitations

  • Trained primarily on Portuguese text; English/Spanish performance is not guaranteed.
  • Span boundaries depend on tokenization at inference time. See the inference rule in the repo: never reconstruct raw text from tokens.
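One way to respect the second limitation is to work only with character offsets into the original string. The sketch below is an interpretation of that rule, not the repo's own implementation. It assumes each entity is a dict with `start`, `end`, and `label` keys (the shape GLiNER's `predict_entities` returns), and `redact` is a hypothetical helper.

```python
from typing import Any, Dict, List


def redact(text: str, entities: List[Dict[str, Any]]) -> str:
    """Replace each entity span with a [LABEL] placeholder.

    Slices the original string by character offsets instead of joining
    tokens, so whitespace and punctuation outside the spans are preserved.
    """
    out = text
    cursor = len(text)  # left edge of the most recently replaced span
    # Work right to left so earlier offsets stay valid after each replacement.
    for ent in sorted(entities, key=lambda e: e["start"], reverse=True):
        if ent["end"] > cursor:
            continue  # overlaps a span that was already replaced; skip it
        placeholder = "[" + ent["label"].upper() + "]"
        out = out[: ent["start"]] + placeholder + out[ent["end"] :]
        cursor = ent["start"]
    return out


sample = "CPF 123.456.789-09 de Maria Silva."
sample_entities = [
    {"start": 4, "end": 18, "label": "cpf"},
    {"start": 22, "end": 33, "label": "person name"},
]
redacted = redact(sample, sample_entities)
```

Processing spans from right to left avoids recomputing offsets after every substitution, at the cost of skipping any span that overlaps one already replaced.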

Citation

@misc{arthrod_gliner_ptbr_pii,
  author = {arthrod},
  title  = {GLiNER Portuguese PII checkpoints},
  year   = {2026},
  url    = {https://huggingface.co/arthrod/c5k-deberta_base-token_level-1-2}
}