---
base_model:
- google/muril-large-cased
datasets:
- prachuryyaIITG/APTFiNER
language:
- as
license: mit
metrics:
- f1
- precision
- recall
pipeline_tag: token-classification
tags:
- NER
- Named_Entity_Recognition
pretty_name: APTFiNER Assamese XLM-R
library_name: transformers
---
**This model is fine-tuned on the Assamese APTFiNER dataset for Fine-grained Named Entity Recognition.**
It is part of the **AWED-FiNER** collection, as presented in the paper [AWED-FiNER: Agents, Web applications, and Expert Detectors for Fine-grained Named Entity Recognition across 36 Languages for 6.6 Billion Speakers](https://huggingface.co/papers/2601.10161).
- **Code:** [GitHub - AWED-FiNER](https://github.com/PrachuryyaKaushik/AWED-FiNER)
- **Interactive Demo:** [Hugging Face Space](https://huggingface.co/spaces/prachuryyaIITG/AWED-FiNER)
The model uses the fine-grained tagset of [MultiCoNER2](https://huggingface.co/datasets/MultiCoNER/multiconer_v2). The fine-to-coarse mapping of the tags is as follows:
* Location (LOC) : Facility, OtherLOC, HumanSettlement, Station
* Creative Work (CW) : VisualWork, MusicalWork, WrittenWork, ArtWork, Software
* Group (GRP) : MusicalGRP, PublicCORP, PrivateCORP, AerospaceManufacturer, SportsGRP, CarManufacturer, ORG
* Person (PER) : Scientist, Artist, Athlete, Politician, Cleric, SportsManager, OtherPER
* Product (PROD) : Clothing, Vehicle, Food, Drink, OtherPROD
* Medical (MED) : Medication/Vaccine, MedicalProcedure, AnatomicalStructure, Symptom, Disease
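The mapping above can be expressed as a small lookup table. The sketch below is a convenience illustration, not part of the released code; the BIO-prefix handling in `to_coarse` is an assumption about how predicted labels are formatted:

```python
# Fine-to-coarse tag mapping for the MultiCoNER2 tagset,
# transcribed from the list above.
FINE_TO_COARSE = {
    # Location (LOC)
    "Facility": "LOC", "OtherLOC": "LOC", "HumanSettlement": "LOC", "Station": "LOC",
    # Creative Work (CW)
    "VisualWork": "CW", "MusicalWork": "CW", "WrittenWork": "CW",
    "ArtWork": "CW", "Software": "CW",
    # Group (GRP)
    "MusicalGRP": "GRP", "PublicCORP": "GRP", "PrivateCORP": "GRP",
    "AerospaceManufacturer": "GRP", "SportsGRP": "GRP",
    "CarManufacturer": "GRP", "ORG": "GRP",
    # Person (PER)
    "Scientist": "PER", "Artist": "PER", "Athlete": "PER", "Politician": "PER",
    "Cleric": "PER", "SportsManager": "PER", "OtherPER": "PER",
    # Product (PROD)
    "Clothing": "PROD", "Vehicle": "PROD", "Food": "PROD",
    "Drink": "PROD", "OtherPROD": "PROD",
    # Medical (MED)
    "Medication/Vaccine": "MED", "MedicalProcedure": "MED",
    "AnatomicalStructure": "MED", "Symptom": "MED", "Disease": "MED",
}

def to_coarse(bio_tag: str) -> str:
    """Map a BIO-style fine-grained tag (e.g. 'B-Scientist') to its coarse tag."""
    if bio_tag == "O":
        return "O"
    prefix, _, fine = bio_tag.partition("-")
    return f"{prefix}-{FINE_TO_COARSE[fine]}"

print(to_coarse("B-Scientist"))        # B-PER
print(to_coarse("I-HumanSettlement"))  # I-LOC
```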
## Model performance:
Precision: 62.62 <br>
Recall: 67.98 <br>
**F1: 65.19** <br>
## Training Parameters:
Epochs: 6 <br>
Optimizer: AdamW <br>
Learning Rate: 5e-5 <br>
Weight Decay: 0.01 <br>
Batch Size: 64 <br>
## Contributors
[Prachuryya Kaushik](https://www.linkedin.com/in/pkabundant/) <br>
[Adittya Gupta](https://www.linkedin.com/in/adittya-gupta-b64356224/) <br>
[Ajanta Maurya](https://www.linkedin.com/in/ajanta-maurya/) <br>
[Gautam Sharma](https://www.linkedin.com/in/g-s01/) <br>
[Prof. V Vijaya Saradhi](https://www.linkedin.com/in/vijaya-saradhi-a90a604/) <br>
[Prof. Ashish Anand](https://www.linkedin.com/in/anandashish/)
APTFiNER is a part of the [AWED-FiNER collection](https://huggingface.co/collections/prachuryyaIITG/awed-finer). Please check: [**Paper**](https://huggingface.co/papers/2601.10161) | [**Agentic Tool**](https://github.com/PrachuryyaKaushik/AWED-FiNER) | [**Interactive Demo**](https://huggingface.co/spaces/prachuryyaIITG/AWED-FiNER)
## Sample Usage
The AWED-FiNER agentic tool can be used to interact with expert models trained using this framework. Below is an example:
```bash
pip install smolagents gradio_client
```
```python
from tool import AWEDFiNERTool  # tool.py is provided in the AWED-FiNER GitHub repository

tool = AWEDFiNERTool(
    space_id="prachuryyaIITG/AWED-FiNER"
)

result = tool.forward(
    text="Jude Bellingham joined Real Madrid in 2023.",
    language="English",
)
print(result)
```
## Citation
If you use this model, please cite the following papers:
```bibtex
@inproceedings{kaushik2026aptfiner,
title={APTFiNER: Annotation Preserving Translation for Fine-grained Named Entity Recognition},
author={Kaushik, Prachuryya and Gupta, Adittya and Maurya, Ajanta and Sharma, Gautam and Saradhi, Vijaya V and Anand, Ashish},
booktitle={Proceedings of the Fifteenth Language Resources and Evaluation Conference},
volume={15},
year={2026}
}
@misc{kaushik2026awedfineragentswebapplications,
title={AWED-FiNER: Agents, Web applications, and Expert Detectors for Fine-grained Named Entity Recognition across 36 Languages for 6.6 Billion Speakers},
author={Prachuryya Kaushik and Ashish Anand},
year={2026},
eprint={2601.10161},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2601.10161},
}
@inproceedings{kaushik2026sampurner,
title={SampurNER: Fine-grained Named Entity Recognition Dataset for 22 Indian Languages},
author={Kaushik, Prachuryya and Anand, Ashish},
booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
volume={40},
year={2026}
}
@inproceedings{fetahu2023multiconer,
title={MultiCoNER v2: a Large Multilingual dataset for Fine-grained and Noisy Named Entity Recognition},
author={Fetahu, Besnik and Chen, Zhiyu and Kar, Sudipta and Rokhlenko, Oleg and Malmasi, Shervin},
booktitle={Findings of the Association for Computational Linguistics: EMNLP 2023},
pages={2027--2051},
year={2023}
}
``` |