Commit 4567190 by MutazYoune · verified · 1 Parent(s): f1a1c8e

Update README.md

  1. README.md +16 -26
README.md CHANGED
@@ -7,6 +7,7 @@ tags:
  - named-entity-recognition
  - bert
  - token-classification
+ - pii
  datasets:
  - custom
  metrics:
@@ -22,15 +23,27 @@ widget:

  ## Model Description

- This is an Arabic Named Entity Recognition (NER) model fine-tuned on BERT architecture specifically for Arabic text processing. The model is based on `MutazYoune/ARAB_BERT` and has been trained to identify and classify named entities in Arabic text.
+ This is an Arabic Named Entity Recognition (NER) model fine-tuned on BERT architecture specifically for Arabic text processing and PII detection. The model is based on `MutazYoune/ARAB_BERT` and trained to identify and mask Personally Identifiable Information (PII) in Arabic sentences.
+
+ PII categories covered include:
+
+ - Personal names (first, middle, family)
+ - Phone numbers
+ - Email addresses
+ - Physical addresses
+ - National ID numbers
+ - Bank account information
+ - Dates of birth
+
+ The model was developed and submitted as part of the **Arabic PII Redaction Challenge** hosted on Hugging Face.

  ## Model Details

  - **Model Type:** Token Classification (NER)
  - **Language:** Arabic (ar)
  - **Base Model:** MutazYoune/ARAB_BERT
- - **Dataset:** augmented_pattern2
- - **Task:** Named Entity Recognition
+ - **Dataset:** augmented_pattern2 (custom Arabic PII NER dataset)
+ - **Task:** Named Entity Recognition and PII redaction

  ## Training Configuration

@@ -65,26 +78,3 @@ ner_pipeline = pipeline("ner",
  text = "أحمد محمد يعمل في شركة جوجل في الرياض"
  entities = ner_pipeline(text)
  print(entities)
- ```
-
- ## Model Performance
-
- This model was trained on the complete dataset without validation split for final production use.
-
- ## Training Data
-
- The model was trained on custom Arabic NER dataset:
- - Dataset type: augmented_pattern2
- - Combined training and test data for final model
-
- ## Citation
-
- ```bibtex
- @misc{arabic-ner-bert,
- title={Arabic BERT NER Model},
- author={Trained on Kaggle},
- year={2025},
- publisher={Hugging Face},
- url={https://huggingface.co/MutazYoune/Arabic-NER-PII}
- }
- ```
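
The updated description says the model is meant to identify and mask PII, but the usage snippet stops at printing the detected entities. The following is a minimal sketch of how the masking step might look, not the author's implementation. It assumes each entity is a dict carrying `start` and `end` character offsets plus an `entity_group` (or `entity`) label, as the Transformers token-classification pipeline typically returns, and that spans do not overlap. The helper name `mask_entities`, the placeholder format, and the `PERSON` label are hypothetical; the model's actual label set is not shown in this commit.

```python
def mask_entities(text, entities, placeholder="[{label}]"):
    """Replace each detected span in `text` with a placeholder built from its label.

    Assumes every entity dict has `start` and `end` character offsets and that
    spans do not overlap.
    """
    # Work from the last span to the first so earlier offsets remain valid
    # after each replacement changes the string length.
    for ent in sorted(entities, key=lambda e: e["start"], reverse=True):
        # Aggregated pipeline output uses `entity_group`; raw output uses `entity`.
        label = ent.get("entity_group") or ent.get("entity", "PII")
        text = text[:ent["start"]] + placeholder.format(label=label) + text[ent["end"]:]
    return text


# Hand-written stand-in for pipeline output; the label and offsets are
# illustrative only and were not produced by the model.
sample_text = "أحمد محمد يعمل في شركة جوجل في الرياض"
sample_entities = [{"entity_group": "PERSON", "start": 0, "end": 9}]
print(mask_entities(sample_text, sample_entities))
```

In practice the `entities` list returned by `ner_pipeline(text)` could be passed in place of `sample_entities`, provided the pipeline output includes character offsets.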