---
license: mit
datasets:
- MultiCoNER/multiconer_v2
language:
- en
metrics:
- f1
- precision
- recall
base_model:
- FacebookAI/xlm-roberta-large
pipeline_tag: token-classification
tags:
- NER
- Named_Entity_Recognition
pretty_name: MultiCoNER2 English XLM-RoBERTa
---

**XLM-RoBERTa fine-tuned on the English [MultiCoNER2](https://huggingface.co/datasets/MultiCoNER/multiconer_v2) dataset for fine-grained Named Entity Recognition.**

[MultiCoNER2](https://huggingface.co/datasets/MultiCoNER/multiconer_v2) uses a fine-grained tagset. The fine-to-coarse mapping of the tags is as follows:

* Location (LOC): Facility, OtherLOC, HumanSettlement, Station
* Creative Work (CW): VisualWork, MusicalWork, WrittenWork, ArtWork, Software
* Group (GRP): MusicalGRP, PublicCORP, PrivateCORP, AerospaceManufacturer, SportsGRP, CarManufacturer, ORG
* Person (PER): Scientist, Artist, Athlete, Politician, Cleric, SportsManager, OtherPER
* Product (PROD): Clothing, Vehicle, Food, Drink, OtherPROD
* Medical (MED): Medication/Vaccine, MedicalProcedure, AnatomicalStructure, Symptom, Disease

## Model performance

Precision: 78.29
Recall: 80.94
**F1: 79.59**
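
The fine-to-coarse tag mapping listed above can be expressed as a small lookup table. A minimal sketch, assuming fine-grained BIO tags such as `B-Scientist` (the `FINE_TO_COARSE` dict and `coarsen` helper are illustrative, not part of the released model code):

```python
# Illustrative: collapse MultiCoNER2 fine-grained labels to their coarse categories.
# The dictionary mirrors the fine-to-coarse table in this card.
FINE_TO_COARSE = {
    **dict.fromkeys(["Facility", "OtherLOC", "HumanSettlement", "Station"], "LOC"),
    **dict.fromkeys(["VisualWork", "MusicalWork", "WrittenWork", "ArtWork",
                     "Software"], "CW"),
    **dict.fromkeys(["MusicalGRP", "PublicCORP", "PrivateCORP",
                     "AerospaceManufacturer", "SportsGRP", "CarManufacturer",
                     "ORG"], "GRP"),
    **dict.fromkeys(["Scientist", "Artist", "Athlete", "Politician", "Cleric",
                     "SportsManager", "OtherPER"], "PER"),
    **dict.fromkeys(["Clothing", "Vehicle", "Food", "Drink", "OtherPROD"], "PROD"),
    **dict.fromkeys(["Medication/Vaccine", "MedicalProcedure",
                     "AnatomicalStructure", "Symptom", "Disease"], "MED"),
}

def coarsen(bio_tag: str) -> str:
    """Map a fine-grained BIO tag (e.g. 'B-Scientist') to its coarse form ('B-PER')."""
    if bio_tag == "O":
        return bio_tag
    prefix, _, fine = bio_tag.partition("-")
    return f"{prefix}-{FINE_TO_COARSE[fine]}"
```

This kind of post-processing is useful when comparing the model's fine-grained predictions against coarse-level baselines.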
## Training Parameters

Epochs: 6
Optimizer: AdamW
Learning Rate: 5e-5
Weight Decay: 0.01
Batch Size: 64
## Citation

If you use this model, please cite the following papers:

```bibtex
@inproceedings{fetahu2023multiconer,
  title={MultiCoNER v2: a Large Multilingual dataset for Fine-grained and Noisy Named Entity Recognition},
  author={Fetahu, Besnik and Chen, Zhiyu and Kar, Sudipta and Rokhlenko, Oleg and Malmasi, Shervin},
  booktitle={Findings of the Association for Computational Linguistics: EMNLP 2023},
  pages={2027--2051},
  year={2023}
}

@inproceedings{kaushik2026sampurner,
  title={SampurNER: Fine-grained Named Entity Recognition Dataset for 22 Indian Languages},
  author={Kaushik, Prachuryya and Anand, Ashish},
  booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
  volume={40},
  year={2026}
}
```