---
language: ar
license: apache-2.0
tags:
- arabic
- ner
- named-entity-recognition
- bert
- token-classification
datasets:
- custom
metrics:
- f1
- precision
- recall
widget:
- text: "أحمد محمد يعمل في شركة جوجل في الرياض"
  example_title: "Arabic NER Example"
---

# MutazYoune/Arabic-NER-PII2

## Model Description

This is an Arabic Named Entity Recognition (NER) model, built by fine-tuning the BERT-based `MutazYoune/ARAB_BERT` checkpoint. It is trained to identify and classify named entities in Arabic text.

## Model Details

- **Model Type:** Token Classification (NER)
- **Language:** Arabic (ar)
- **Base Model:** MutazYoune/ARAB_BERT
- **Dataset:** augmented_pattern2
- **Task:** Named Entity Recognition

## Training Configuration

- **Epochs:** 30
- **Batch Size:** 16
- **Learning Rate:** 3e-05
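
To give a feel for what these settings imply, the sketch below estimates the number of optimizer updates for a run. It assumes one update per batch with no gradient accumulation, which this card does not confirm. The `TrainingConfig` class, the `total_optimizer_steps` helper, and the dataset size of 10,000 are hypothetical and used only for illustration.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingConfig:
    """Hyperparameters as listed in this card."""
    epochs: int = 30
    batch_size: int = 16
    learning_rate: float = 3e-5


def total_optimizer_steps(num_examples: int, config: TrainingConfig) -> int:
    """Optimizer steps, assuming one update per batch and no gradient accumulation."""
    steps_per_epoch = math.ceil(num_examples / config.batch_size)
    return steps_per_epoch * config.epochs


config = TrainingConfig()
# 10_000 is a hypothetical dataset size; the card does not state the real one.
steps = total_optimizer_steps(10_000, config)
print(steps)
```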

## Supported Entity Types

- CONTACT
- IDENTIFIER
- NETWORK
- NUMERIC_ID
- PII
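
The label names a loaded checkpoint actually emits can be read from `model.config.id2label`. The sketch below shows one way to reduce such a mapping to a list of entity types. It assumes the labels may carry BIO prefixes (`B-`, `I-`) and an `O` tag, which this card does not state. The `entity_types` helper and the example mapping are hypothetical.

```python
def entity_types(id2label: dict[int, str]) -> list[str]:
    """Return sorted entity types, dropping the 'O' tag and any B-/I- prefixes."""
    types = set()
    for label in id2label.values():
        if label == "O":
            continue
        # Strip a BIO prefix if present; leave un-prefixed labels untouched.
        if label[:2] in ("B-", "I-"):
            label = label[2:]
        types.add(label)
    return sorted(types)


# Hypothetical label map for illustration; the real one is model.config.id2label.
example_id2label = {0: "O", 1: "B-CONTACT", 2: "I-CONTACT", 3: "B-PII", 4: "I-PII"}
print(entity_types(example_id2label))
```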

## Usage

```python
from transformers import AutoTokenizer, AutoModelForTokenClassification, pipeline

# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("MutazYoune/Arabic-NER-PII2")
model = AutoModelForTokenClassification.from_pretrained("MutazYoune/Arabic-NER-PII2")

# Create NER pipeline
ner_pipeline = pipeline(
    "ner",
    model=model,
    tokenizer=tokenizer,
    aggregation_strategy="simple",  # group sub-word tokens into entity spans
)

# Example usage
text = "أحمد محمد يعمل في شركة جوجل في الرياض"
entities = ner_pipeline(text)
print(entities)
```
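
With `aggregation_strategy="simple"`, the pipeline returns a list of dicts with `entity_group`, `score`, `word`, `start`, and `end` keys, where `start` and `end` are character offsets into the input. The sketch below shows one possible way to use that output to mask detected spans. The `redact` helper, the `min_score` threshold of 0.5, and the sample entities are assumptions made for illustration, not model predictions or part of this model's API.

```python
def redact(text: str, entities: list[dict], min_score: float = 0.5) -> str:
    """Replace each detected span with its bracketed entity label."""
    # Work right to left so earlier character offsets stay valid.
    for ent in sorted(entities, key=lambda e: e["start"], reverse=True):
        if ent["score"] < min_score:
            continue
        text = text[: ent["start"]] + f"[{ent['entity_group']}]" + text[ent["end"]:]
    return text


# Hand-written stand-in for pipeline output; not a real model prediction.
sample_text = "call 0501234567 now"
sample_entities = [
    {"entity_group": "CONTACT", "score": 0.99, "word": "0501234567", "start": 5, "end": 15},
]
print(redact(sample_text, sample_entities))
```

This sketch does not handle overlapping spans; in practice `redact(text, ner_pipeline(text))` would be the typical call.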

## Model Performance

The final production model was trained on the complete dataset with no held-out validation split, so this card reports no evaluation scores for it.
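
The card metadata lists precision, recall, and F1 as metrics. For readers who have their own labelled held-out data, the sketch below computes exact-match, entity-level scores over `(start, end, label)` spans. The `span_prf` helper and the example spans are hypothetical, and this is not necessarily the evaluation procedure used during development.

```python
def span_prf(
    gold: set[tuple[int, int, str]],
    pred: set[tuple[int, int, str]],
) -> tuple[float, float, float]:
    """Exact-match entity-level precision, recall, and F1 over labelled spans."""
    true_positives = len(gold & pred)
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical spans for illustration only; the second prediction has a wrong end offset.
gold_spans = {(0, 4, "PII"), (10, 20, "CONTACT")}
pred_spans = {(0, 4, "PII"), (10, 18, "CONTACT")}
print(span_prf(gold_spans, pred_spans))
```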

## Training Data

The model was trained on a custom Arabic NER dataset:
- Dataset type: augmented_pattern2
- The training and test data were combined to train the final model

## Citation

```bibtex
@misc{arabic-ner-bert,
  title={Arabic BERT NER Model},
  author={Trained on Kaggle},
  year={2025},
  publisher={Hugging Face},
  url={https://huggingface.co/MutazYoune/Arabic-NER-PII2}
}
```