---
base_model:
- FacebookAI/xlm-roberta-large
datasets:
- MultiCoNER/multiconer_v2
language:
- zh
license: mit
metrics:
- f1
- precision
- recall
pipeline_tag: token-classification
library_name: transformers
tags:
- NER
- Named_Entity_Recognition
pretty_name: MultiCoNER2 Chinese XLM-RoBERTa
---

**XLM-RoBERTa (large) fine-tuned on the Chinese portion of the [MultiCoNER2](https://huggingface.co/datasets/MultiCoNER/multiconer_v2) dataset for fine-grained named entity recognition.**

This model is one of the expert detectors in the **AWED-FiNER** project, described in the paper [AWED-FiNER: Agents, Web applications, and Expert Detectors for Fine-grained Named Entity Recognition across 36 Languages for 6.6 Billion Speakers](https://huggingface.co/papers/2601.10161).

[MultiCoNER2](https://huggingface.co/datasets/MultiCoNER/multiconer_v2) uses a fine-grained tagset. Each coarse-grained category below is followed by the fine-grained tags it covers:

  * Location (LOC) : Facility, OtherLOC, HumanSettlement, Station
  * Creative Work (CW) : VisualWork, MusicalWork, WrittenWork, ArtWork, Software
  * Group (GRP) : MusicalGRP, PublicCORP, PrivateCORP, AerospaceManufacturer, SportsGRP, CarManufacturer, ORG
  * Person (PER) : Scientist, Artist, Athlete, Politician, Cleric, SportsManager, OtherPER
  * Product (PROD) : Clothing, Vehicle, Food, Drink, OtherPROD
  * Medical (MED) : Medication/Vaccine, MedicalProcedure, AnatomicalStructure, Symptom, Disease
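
The mapping above can be written as a lookup table for collapsing fine-grained predictions into coarse categories. This is a minimal sketch, not part of the released model or tool: the names `COARSE_TO_FINE`, `FINE_TO_COARSE`, and `to_coarse` are hypothetical, and the `B-`/`I-` prefix handling assumes BIO-style labels, which this card does not confirm.

```python
# Coarse-to-fine mapping copied from the list above.
COARSE_TO_FINE = {
    "LOC": ["Facility", "OtherLOC", "HumanSettlement", "Station"],
    "CW": ["VisualWork", "MusicalWork", "WrittenWork", "ArtWork", "Software"],
    "GRP": ["MusicalGRP", "PublicCORP", "PrivateCORP", "AerospaceManufacturer",
            "SportsGRP", "CarManufacturer", "ORG"],
    "PER": ["Scientist", "Artist", "Athlete", "Politician", "Cleric",
            "SportsManager", "OtherPER"],
    "PROD": ["Clothing", "Vehicle", "Food", "Drink", "OtherPROD"],
    "MED": ["Medication/Vaccine", "MedicalProcedure", "AnatomicalStructure",
            "Symptom", "Disease"],
}

# Invert the table so each fine-grained tag points to its coarse category.
FINE_TO_COARSE = {
    fine: coarse
    for coarse, fines in COARSE_TO_FINE.items()
    for fine in fines
}


def to_coarse(label: str) -> str:
    """Map a fine-grained label to its coarse tag.

    Assumption: labels may carry a BIO prefix ("B-" or "I-"); the prefix
    is preserved and "O" is returned unchanged.
    """
    if label == "O":
        return "O"
    prefix, sep, fine = label.partition("-")
    if sep and prefix in ("B", "I"):
        return f"{prefix}-{FINE_TO_COARSE[fine]}"
    return FINE_TO_COARSE[label]


print(to_coarse("B-Scientist"))
print(to_coarse("Station"))
```

An unknown tag raises `KeyError`, which makes a mismatch between this table and the model's actual label set visible rather than silently mislabelled.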


## Model performance:
Precision: 64.66 <br>
Recall: 69.42 <br>
**F1: 66.95** <br>
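
As a quick sanity check, F1 can be recomputed as the harmonic mean of the reported precision and recall. This sketch assumes the three scores are the same kind of average (for example, all micro-averaged), which the card does not state; a difference in the last digit is expected because the inputs are already rounded.

```python
# Reported scores from the section above (percentages).
precision = 64.66
recall = 69.42

# F1 is the harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)

print(f"F1 recomputed from rounded P and R: {f1:.2f}")
```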

## Training Parameters:
Epochs: 6 <br>
Optimizer: AdamW <br>
Learning Rate: 5e-5 <br>
Weight Decay: 0.01 <br>
Batch Size: 64 <br>

[**AWED-FiNER collection**](https://huggingface.co/collections/prachuryyaIITG/awed-finer) | [**Paper**](https://huggingface.co/papers/2601.10161) | [**Agentic Tool**](https://github.com/PrachuryyaKaushik/AWED-FiNER) | [**Interactive Demo**](https://huggingface.co/spaces/prachuryyaIITG/AWED-FiNER)

## Sample Usage of Agentic Tool

The AWED-FiNER agentic tool queries the expert models of this project through the hosted Space. Install the dependencies, then call the tool as shown below:
```bash
pip install smolagents gradio_client
```
```python
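# Assumes tool.py from the AWED-FiNER GitHub repository is on the import path.
from tool import AWEDFiNERTool

# Point the tool at the hosted AWED-FiNER Space.
tool = AWEDFiNERTool(
    space_id="prachuryyaIITG/AWED-FiNER"
)

# Run fine-grained NER on one sentence; `language` selects the expert model.
result = tool.forward(
    text="Jude Bellingham joined Real Madrid in 2023.",
    language="English"
)

print(result)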
```

## Citation

If you use this model, please cite the following papers:

```bibtex
@inproceedings{fetahu2023multiconer,
  title={MultiCoNER v2: a Large Multilingual dataset for Fine-grained and Noisy Named Entity Recognition},
  author={Fetahu, Besnik and Chen, Zhiyu and Kar, Sudipta and Rokhlenko, Oleg and Malmasi, Shervin},
  booktitle={Findings of the Association for Computational Linguistics: EMNLP 2023},
  pages={2027--2051},
  year={2023}
}

@misc{kaushik2026awedfineragentswebapplications,
      title={AWED-FiNER: Agents, Web applications, and Expert Detectors for Fine-grained Named Entity Recognition across 36 Languages for 6.6 Billion Speakers}, 
      author={Prachuryya Kaushik and Ashish Anand},
      year={2026},
      eprint={2601.10161},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2601.10161}, 
}

@inproceedings{kaushik2026sampurner,
      title={SampurNER: Fine-Grained Named Entity Recognition Dataset for 22 Indian Languages},
      volume={40},
      url={https://ojs.aaai.org/index.php/AAAI/article/view/40405},
      DOI={10.1609/aaai.v40i37.40405},
      number={37},
      journal={Proceedings of the AAAI Conference on Artificial Intelligence},
      author={Kaushik, Prachuryya and Anand, Ashish},
      year={2026},
      month={Mar.},
      pages={31410-31418}
}
```