---
base_model:
- google/muril-large-cased
datasets:
- prachuryyaIITG/CLASSER
language:
- as
license: mit
metrics:
- f1
- precision
- recall
pipeline_tag: token-classification
tags:
- NER
- Named_Entity_Recognition
pretty_name: CLASSER Assamese MuRIL
library_name: transformers
---

**This model is MuRIL (`google/muril-large-cased`) fine-tuned on the Assamese [CLASSER](https://huggingface.co/datasets/prachuryyaIITG/CLASSER) dataset for fine-grained named entity recognition (NER).**

It is part of the **AWED-FiNER** project, which provides fine-grained NER solutions across 36 languages.

- **Paper:** [AWED-FiNER: Agents, Web applications, and Expert Detectors for Fine-grained Named Entity Recognition across 36 Languages for 6.6 Billion Speakers](https://huggingface.co/papers/2601.10161)
- **GitHub:** https://github.com/PrachuryyaKaushik/AWED-FiNER
- **Interactive Demo:** [AWED-FiNER Space](https://huggingface.co/spaces/prachuryyaIITG/AWED-FiNER)

The labels follow the fine-grained tagset of [MultiCoNER2](https://huggingface.co/datasets/MultiCoNER/multiconer_v2). Each coarse-grained category covers the following fine-grained tags:

* Location (LOC): Facility, OtherLOC, HumanSettlement, Station
* Creative Work (CW): VisualWork, MusicalWork, WrittenWork, ArtWork, Software
* Group (GRP): MusicalGRP, PublicCORP, PrivateCORP, AerospaceManufacturer, SportsGRP, CarManufacturer, ORG
* Person (PER): Scientist, Artist, Athlete, Politician, Cleric, SportsManager, OtherPER
* Product (PROD): Clothing, Vehicle, Food, Drink, OtherPROD
* Medical (MED): Medication/Vaccine, MedicalProcedure, AnatomicalStructure, Symptom, Disease

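The mapping above can be written as a small lookup table, which is handy when fine-grained predictions need to be collapsed to coarse categories. This is a minimal sketch: the names `COARSE_TO_FINE`, `FINE_TO_COARSE`, and `to_coarse` are illustrative, and the `B-`/`I-` prefix handling assumes BIO-style labels, which this card does not confirm. The exact label strings in the model's configuration may differ.

```python
# Coarse-to-fine tag mapping, copied from the list above.
COARSE_TO_FINE = {
    "LOC": ["Facility", "OtherLOC", "HumanSettlement", "Station"],
    "CW": ["VisualWork", "MusicalWork", "WrittenWork", "ArtWork", "Software"],
    "GRP": ["MusicalGRP", "PublicCORP", "PrivateCORP", "AerospaceManufacturer",
            "SportsGRP", "CarManufacturer", "ORG"],
    "PER": ["Scientist", "Artist", "Athlete", "Politician", "Cleric",
            "SportsManager", "OtherPER"],
    "PROD": ["Clothing", "Vehicle", "Food", "Drink", "OtherPROD"],
    "MED": ["Medication/Vaccine", "MedicalProcedure", "AnatomicalStructure",
            "Symptom", "Disease"],
}

# Invert the table so each fine-grained tag points to its coarse category.
FINE_TO_COARSE = {
    fine: coarse for coarse, fines in COARSE_TO_FINE.items() for fine in fines
}


def to_coarse(label: str) -> str:
    """Map a fine-grained label to its coarse category.

    Assumption: labels may carry a BIO prefix ("B-" or "I-"), which is kept.
    """
    if label == "O":
        return "O"
    prefix, sep, fine = label.partition("-")
    if sep and prefix in ("B", "I"):
        return f"{prefix}-{FINE_TO_COARSE[fine]}"
    return FINE_TO_COARSE[label]
```

For example, `to_coarse("B-Athlete")` yields `"B-PER"` and `to_coarse("Software")` yields `"CW"`.
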
## Model Performance
Precision: 74.88 <br>
Recall: 75.62 <br>
**F1: 75.25** <br>

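As a quick consistency check, the reported F1 can be recomputed from the reported precision and recall. The sketch below assumes F1 is the harmonic mean of those two aggregate scores; the card does not state the averaging scheme (micro or macro), so this only confirms that the three numbers agree with each other.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Scores reported above.
f1 = f1_score(74.88, 75.62)
print(f"{f1:.2f}")  # 75.25
```
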
## Training Parameters
Epochs: 6 <br>
Optimizer: AdamW <br>
Learning Rate: 5e-5 <br>
Weight Decay: 0.01 <br>
Batch Size: 64 <br>

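One way to express these settings with the Transformers `Trainer` API is sketched below. It is not the authors' training script: the output directory name is hypothetical, the batch size is assumed to be per device (the card does not say), and `adamw_torch` is assumed as the AdamW implementation. Any setting not listed above is left at the library default.

```python
# Hyperparameters reported above, keyed by TrainingArguments field names.
HYPERPARAMS = {
    "num_train_epochs": 6,
    "learning_rate": 5e-5,
    "weight_decay": 0.01,
    # Assumption: the card does not say whether 64 is per device or total.
    "per_device_train_batch_size": 64,
    # Assumption: the card only says "AdamW"; this selects the PyTorch one.
    "optim": "adamw_torch",
}


def build_training_arguments(output_dir: str = "muril-classer-as"):
    """Create TrainingArguments from the reported values.

    `output_dir` is a hypothetical name. Requires `transformers` and `torch`,
    imported lazily so the dictionary above can be inspected without them.
    """
    from transformers import TrainingArguments

    return TrainingArguments(output_dir=output_dir, **HYPERPARAMS)
```
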
## Contributors
[Prachuryya Kaushik](https://www.linkedin.com/in/pkabundant/) <br>
[Prof. Ashish Anand](https://www.linkedin.com/in/anandashish/)

## Sample Usage

The AWED-FiNER agentic tool can be used to query the expert models trained with this framework through the hosted Space. Install the dependencies first:
```bash
pip install smolagents gradio_client
```
Then call the tool:
```python
# AWEDFiNERTool is imported from a local tool.py module
# (assumed here to be the one shipped in the AWED-FiNER GitHub repository).
from tool import AWEDFiNERTool

# Point the tool at the hosted AWED-FiNER Space.
tool = AWEDFiNERTool(
    space_id="prachuryyaIITG/AWED-FiNER"
)

# Run fine-grained NER on a sentence in the chosen language.
result = tool.forward(
    text="Jude Bellingham joined Real Madrid in 2023.",
    language="English"
)

print(result)
```

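Because the card metadata lists `library_name: transformers` and `pipeline_tag: token-classification`, the checkpoint can presumably also be loaded directly with the Transformers `pipeline` API. The sketch below is an assumption rather than a documented workflow: `MODEL_ID` is a hypothetical placeholder that must be replaced with this model's repository id, and `load_ner_pipeline` and `format_entities` are illustrative helper names.

```python
from typing import Any, Dict, List, Tuple

# Hypothetical placeholder: replace with this model's repository id on the Hub.
MODEL_ID = "<hub-user>/<this-model>"


def load_ner_pipeline(model_id: str = MODEL_ID):
    """Build a token-classification pipeline (needs `transformers` and `torch`)."""
    # Imported lazily so format_entities below has no heavy dependencies.
    from transformers import pipeline

    # aggregation_strategy="simple" merges word pieces into whole entity spans.
    return pipeline(
        "token-classification",
        model=model_id,
        aggregation_strategy="simple",
    )


def format_entities(
    predictions: List[Dict[str, Any]],
) -> List[Tuple[str, str, float]]:
    """Reduce aggregated pipeline output to (text, label, score) triples."""
    return [
        (p["word"], p["entity_group"], round(float(p["score"]), 3))
        for p in predictions
    ]
```

With a real repository id, `format_entities(load_ner_pipeline()(sentence))` would return one triple per detected entity in an Assamese `sentence`.
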
## Citation

If you use this model, please cite the following papers:

```bibtex
@inproceedings{kaushik-anand-2025-classer,
    title = "{CLASSER}: Cross-lingual Annotation Projection enhancement through Script Similarity for Fine-grained Named Entity Recognition",
    author = "Kaushik, Prachuryya and Anand, Ashish",
    booktitle = "Proceedings of the 14th International Joint Conference on Natural Language Processing and the 4th Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics",
    month = dec,
    year = "2025",
    address = "Mumbai, India",
    publisher = "The Asian Federation of Natural Language Processing and The Association for Computational Linguistics",
    url = "https://aclanthology.org/2025.ijcnlp-long.94/",
    pages = "1745--1760",
    ISBN = "979-8-89176-298-5",
}

@misc{kaushik2026awedfineragentswebapplications,
    title={AWED-FiNER: Agents, Web applications, and Expert Detectors for Fine-grained Named Entity Recognition across 36 Languages for 6.6 Billion Speakers},
    author={Prachuryya Kaushik and Ashish Anand},
    year={2026},
    eprint={2601.10161},
    archivePrefix={arXiv},
    primaryClass={cs.CL},
    url={https://arxiv.org/abs/2601.10161},
}

@inproceedings{kaushik2026sampurner,
    title={SampurNER: Fine-grained Named Entity Recognition Dataset for 22 Indian Languages},
    author={Kaushik, Prachuryya and Anand, Ashish},
    booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
    volume={40},
    year={2026}
}

@inproceedings{fetahu2023multiconer,
    title={MultiCoNER v2: a Large Multilingual dataset for Fine-grained and Noisy Named Entity Recognition},
    author={Fetahu, Besnik and Chen, Zhiyu and Kar, Sudipta and Rokhlenko, Oleg and Malmasi, Shervin},
    booktitle={Findings of the Association for Computational Linguistics: EMNLP 2023},
    pages={2027--2051},
    year={2023}
}
```