NASK-PIB/BANonymizer-PL

klorenc committed on Feb 3, 2025

Commit bfbf526 · verified · 1 Parent(s): d42419b

Update README.md

Files changed (1)

README.md +1 -1

@@ -10,7 +10,7 @@ license: apache-2.0
 This model is a fine-tuned version of [HerBERT-large-cased](https://huggingface.co/allegro/herbert-large-cased), a Polish language model developed by Allegro, specialized in anonymizing sensitive and personal information in Polish texts.
 ## Training and Purpose
-The model has been fine-tuned on the [BAN-PL dataset](https://github.com/ZILiAT-NASK/BAN-PL/tree/main), which contains over 20,000 manually labeled examples and a test set of more than 2,000 examples. It is designed to detect and anonymize entities such as surnames and pseudonyms.
+The model has been fine-tuned on the [BAN-PL dataset](https://github.com/ZILiAT-NASK/BAN-PL/tree/main), which contains over 20,000 manually labeled examples and a test set of more than 2,000 examples. It is designed to detect and anonymize entities such as pseudonyms and surnames, except for deceased individuals, historical figures, and fictional characters.
 ## Applications
 This model is particularly useful for privacy-preserving tasks, such as anonymizing datasets for research purposes. Unlike other publicly available tools that primarily focus on surnames, this model uniquely handles both surnames and pseudonyms, enhancing its utility in various anonymization workflows.
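
The anonymization workflow described under Applications could be sketched roughly as follows. This is not code from the model card: it assumes the model is used as a token classifier whose detected entities arrive as dictionaries with `entity_group`, `start`, and `end` keys (the shape returned by the `transformers` token-classification pipeline when an `aggregation_strategy` is set). The label names `SURNAME` and `PSEUDONYM`, the helper `mask_spans`, and the sample sentence are all hypothetical.

```python
from typing import Iterable, Mapping


def mask_spans(text: str, spans: Iterable[Mapping], placeholder: str = "[{label}]") -> str:
    """Replace each detected character span in `text` with a placeholder.

    `spans` is assumed to hold dicts with `entity_group`, `start`, and `end`
    keys; the actual label names emitted by the model are not confirmed here.
    """
    pieces = []
    cursor = 0
    for span in sorted(spans, key=lambda s: s["start"]):
        if span["start"] < cursor:
            continue  # skip a span that overlaps one already masked
        pieces.append(text[cursor:span["start"]])
        pieces.append(placeholder.format(label=span["entity_group"]))
        cursor = span["end"]
    pieces.append(text[cursor:])
    return "".join(pieces)


# Hypothetical detector output, built by hand so the sketch runs offline.
text = "Wczoraj Kowalski napisał do użytkownika zly_kot99."
surname_start = text.index("Kowalski")
nick_start = text.index("zly_kot99")
spans = [
    {"entity_group": "SURNAME", "start": surname_start, "end": surname_start + len("Kowalski")},
    {"entity_group": "PSEUDONYM", "start": nick_start, "end": nick_start + len("zly_kot99")},
]

anonymized = mask_spans(text, spans)
print(anonymized)
```

In a real run the `spans` list would come from the model's predictions rather than being written by hand. Under the updated README wording, names of deceased individuals, historical figures, and fictional characters are not expected to be among them.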