---
language: ar
license: apache-2.0
tags:
- arabic
- ner
- named-entity-recognition
- bert
- token-classification
datasets:
- custom
metrics:
- f1
- precision
- recall
widget:
- text: "أحمد محمد يعمل في شركة جوجل في الرياض"
  example_title: "Arabic NER Example"
---

# MutazYoune/Arabic-NER-PII2

## Model Description

This is an Arabic Named Entity Recognition (NER) model built on the BERT architecture. It is fine-tuned from `MutazYoune/ARAB_BERT` to identify and classify named entities in Arabic text.

## Model Details

- **Model Type:** Token Classification (NER)
- **Language:** Arabic (ar)
- **Base Model:** MutazYoune/ARAB_BERT
- **Dataset:** augmented_pattern2
- **Task:** Named Entity Recognition

## Training Configuration

- **Epochs:** 30
- **Batch Size:** 16
- **Learning Rate:** 3e-05

## Supported Entity Types

- CONTACT
- IDENTIFIER
- NETWORK
- NUMERIC_ID
- PII

## Usage

```python
from transformers import AutoTokenizer, AutoModelForTokenClassification, pipeline

# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("MutazYoune/Arabic-NER-PII2")
model = AutoModelForTokenClassification.from_pretrained("MutazYoune/Arabic-NER-PII2")

# Create NER pipeline; "simple" aggregation merges sub-word tokens into entity spans
ner_pipeline = pipeline("ner", model=model, tokenizer=tokenizer, aggregation_strategy="simple")

# Example usage (English: "Ahmed Mohammed works at Google in Riyadh")
text = "أحمد محمد يعمل في شركة جوجل في الرياض"
entities = ner_pipeline(text)
print(entities)
```

## Model Performance

The final model was trained on the complete dataset, with no validation split, for production use. No held-out evaluation scores are reported here.

## Training Data

The model was trained on a custom Arabic NER dataset:

- Dataset type: augmented_pattern2
- Training and test data were combined to train the final model

## Citation

```bibtex
@misc{arabic-ner-bert,
  title={Arabic BERT NER Model},
  author={Trained on Kaggle},
  year={2025},
  publisher={Hugging Face},
  url={https://huggingface.co/MutazYoune/Arabic-NER-PII2}
}
```
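
## Post-processing Example

The Usage snippet returns a list of aggregated entity dictionaries. A common next step for a PII-oriented model is to mask the detected spans. The following is a minimal sketch, not part of the model itself. It assumes the output carries the usual `transformers` aggregated keys (`entity_group`, `score`, `start`, `end`). The helper name `redact`, the `min_score` threshold, and the sample entities are hypothetical: the entities are written by hand for illustration and are not real model output.

```python
def redact(text, entities, min_score=0.5):
    """Replace each detected span with a [LABEL] placeholder.

    `entities` is assumed to look like the output of a token-classification
    pipeline with an aggregation strategy: dicts holding `entity_group`,
    `score`, `start`, and `end` (character offsets into `text`).
    """
    # Drop low-confidence predictions (threshold is an illustrative default).
    kept = [e for e in entities if e["score"] >= min_score]
    # Work from right to left so earlier offsets stay valid after each edit.
    for ent in sorted(kept, key=lambda e: e["start"], reverse=True):
        placeholder = f"[{ent['entity_group']}]"
        text = text[:ent["start"]] + placeholder + text[ent["end"]:]
    return text


def fake_entity(text, span, label, score):
    """Build a hand-written entity dict for demonstration purposes only."""
    start = text.index(span)
    return {"entity_group": label, "score": score, "word": span,
            "start": start, "end": start + len(span)}


sample = "mail user@example.com from host 10.0.0.1"
# Labels below are assigned by hand; the real model may label spans differently.
sample_entities = [
    fake_entity(sample, "user@example.com", "CONTACT", 0.98),
    fake_entity(sample, "10.0.0.1", "NETWORK", 0.91),
]
print(redact(sample, sample_entities))
```

With a loaded pipeline, the same helper would be called as `redact(text, ner_pipeline(text))`. Replacing spans from right to left avoids recomputing offsets after each substitution.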