LeakPro/pii-classifier-tab-dataset


gpadres committed on Jan 29, 2025

Commit ebf34ea · 1 Parent(s): d9003d8

GPJ: updating model card

Files changed (1)

README.md +7 -8


@@ -2,16 +2,15 @@
 tags:
 - model_hub_mixin
 - pytorch_model_hub_mixin
-base_model: allenai/longformer-base-4096
-datasets:
-- mattmdjaga/text-anonymization-benchmark-train
 license: apache-2.0
-model_id: pii-classifier-tab-datset-1
 ---
-# Model Card for {{ base_model | default("Model ID", true) }}
-This model has been pushed to the Hub using the [PytorchModelHubMixin](https://huggingface.co/docs/huggingface_hub/package_reference/mixins#huggingface_hub.PyTorchModelHubMixin) integration:
-- Library: [More Information Needed]
-- Docs: [More Information Needed]

 tags:
 - model_hub_mixin
 - pytorch_model_hub_mixin
+datasets: mattmdjaga/text-anonymization-benchmark-train
 license: apache-2.0
+base_model: allenai/longformer-base-4096
+base_model_relation: finetune
+model_id: pii-classifier-tab-dataset
 ---
+# Model Card for pii-classifier-tab-dataset
+The model is a Longformer with a classification head, fine-tuned on the **Text Anonymization Benchmark (TAB)** dataset to indicate whether a token is part of **Personally Identifiable Information (PII)** and should be masked out. The model output is the logits for the input sequence, with classes 1 (MASK) and 0 (NO-MASK); i.e., no IOB format is used.
+The model is used as an example in the [LeakPro repo](https://github.com/aidotse/LeakPro). For further details, see the example [notebook](https://github.com/aidotse/LeakPro/blob/gpj_syn_text_pii_scanner/examples/synthetic_data/syn_text_pii_scanner_example.ipynb).
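
The updated card describes the output as logits over two classes, 1 (MASK) and 0 (NO-MASK). The sketch below shows one way such logits could be turned into a masked token sequence. It is a minimal illustration and not code from the LeakPro repo. It assumes one `[NO-MASK, MASK]` logit pair per token, and the helper names, the sample tokens, the sample logits, and the `[MASK]` placeholder are all hypothetical.

```python
from typing import List, Sequence

# Class ids as described in the model card: 1 = MASK, 0 = NO-MASK.
NO_MASK, MASK = 0, 1


def predict_labels(logits: Sequence[Sequence[float]]) -> List[int]:
    """Return the index of the highest logit for each token (argmax per row).

    Assumes one row of logits per token, ordered [NO-MASK, MASK].
    """
    return [max(range(len(row)), key=row.__getitem__) for row in logits]


def mask_tokens(
    tokens: Sequence[str],
    labels: Sequence[int],
    placeholder: str = "[MASK]",  # hypothetical placeholder string
) -> List[str]:
    """Replace every token labelled MASK with the placeholder."""
    if len(tokens) != len(labels):
        raise ValueError("tokens and labels must have the same length")
    return [placeholder if label == MASK else token
            for token, label in zip(tokens, labels)]


# Made-up tokens and logits, for illustration only.
tokens = ["Mr", "Smith", "lives", "in", "Oslo"]
logits = [[2.0, -1.0], [-0.5, 3.1], [1.7, -2.2], [2.4, -0.3], [-1.0, 0.8]]

labels = predict_labels(logits)
masked = mask_tokens(tokens, labels)
print(labels)
print(" ".join(masked))
```

Because there are only two classes and no IOB tags, each token is decided independently by the argmax, and any merging of adjacent masked tokens into spans would be a separate post-processing step.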