prachuryyaIITG/CLASSER_Assamese_MuRIL

---
license: mit
language:
- as
base_model:
- google/muril-large-cased
pipeline_tag: token-classification
tags:
- NER
- Named_Entity_Recognition
pretty_name: CLASSER Assamese MuRIL
datasets:
- prachuryyaIITG/CLASSER
metrics:
- f1
- precision
- recall
---

This model is MuRIL (`google/muril-large-cased`) fine-tuned on the Assamese [CLASSER](https://huggingface.co/datasets/prachuryyaIITG/CLASSER) dataset for fine-grained named entity recognition (NER).
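
A minimal usage sketch, assuming the checkpoint is published on the Hub as `prachuryyaIITG/CLASSER_Assamese_MuRIL` and loads through the standard `transformers` token-classification pipeline. The helper names and the `aggregation_strategy="simple"` choice are illustrative assumptions, not the authors' evaluation setup.

```python
from typing import Dict, List, Tuple

# Assumed repository id, taken from the Hub page of this model card.
MODEL_ID = "prachuryyaIITG/CLASSER_Assamese_MuRIL"


def load_ner_pipeline(model_id: str = MODEL_ID):
    """Build a token-classification pipeline (downloads the weights on first use)."""
    # Imported lazily so the helpers below work without transformers installed.
    from transformers import pipeline

    # "simple" merges adjacent word pieces that carry the same entity into one span.
    return pipeline(
        "token-classification",
        model=model_id,
        aggregation_strategy="simple",
    )


def summarize_entities(entities: List[Dict]) -> List[Tuple[str, str]]:
    """Reduce aggregated pipeline output to (surface text, predicted label) pairs."""
    return [(ent["word"], ent["entity_group"]) for ent in entities]
```

With the model downloaded, `summarize_entities(load_ner_pipeline()(text))` would return one `(text span, label)` pair per detected entity in an Assamese input string `text`.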

The tagset of [MultiCoNER2](https://huggingface.co/datasets/MultiCoNER/multiconer_v2) is fine-grained. Its fine-to-coarse tag mapping is as follows:

* Location (LOC): Facility, OtherLOC, HumanSettlement, Station
* Creative Work (CW): VisualWork, MusicalWork, WrittenWork, ArtWork, Software
* Group (GRP): MusicalGRP, PublicCORP, PrivateCORP, AerospaceManufacturer, SportsGRP, CarManufacturer, ORG
* Person (PER): Scientist, Artist, Athlete, Politician, Cleric, SportsManager, OtherPER
* Product (PROD): Clothing, Vehicle, Food, Drink, OtherPROD
* Medical (MED): Medication/Vaccine, MedicalProcedure, AnatomicalStructure, Symptom, Disease
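
The mapping above can be sketched as a lookup table for collapsing fine-grained predictions into coarse ones. This is an illustrative helper, not part of the released model; it assumes BIO-style tags such as `B-Scientist`, which the card does not confirm, and the names `COARSE_TO_FINE`, `FINE_TO_COARSE`, and `to_coarse` are hypothetical.

```python
from typing import Dict, List

# Coarse label -> fine-grained labels, copied from the list above.
COARSE_TO_FINE: Dict[str, List[str]] = {
    "LOC": ["Facility", "OtherLOC", "HumanSettlement", "Station"],
    "CW": ["VisualWork", "MusicalWork", "WrittenWork", "ArtWork", "Software"],
    "GRP": ["MusicalGRP", "PublicCORP", "PrivateCORP", "AerospaceManufacturer",
            "SportsGRP", "CarManufacturer", "ORG"],
    "PER": ["Scientist", "Artist", "Athlete", "Politician", "Cleric",
            "SportsManager", "OtherPER"],
    "PROD": ["Clothing", "Vehicle", "Food", "Drink", "OtherPROD"],
    "MED": ["Medication/Vaccine", "MedicalProcedure", "AnatomicalStructure",
            "Symptom", "Disease"],
}

# Inverted lookup: fine-grained label -> coarse label.
FINE_TO_COARSE: Dict[str, str] = {
    fine: coarse for coarse, fines in COARSE_TO_FINE.items() for fine in fines
}


def to_coarse(tag: str) -> str:
    """Map a fine-grained tag to its coarse tag, keeping any BIO prefix.

    Assumes BIO-style tags; "O" and unknown labels pass through unchanged.
    """
    prefix, sep, label = tag.partition("-")
    if sep and prefix in {"B", "I"}:
        coarse = FINE_TO_COARSE.get(label)
        return f"{prefix}-{coarse}" if coarse else tag
    return FINE_TO_COARSE.get(tag, tag)
```

For example, `to_coarse("B-Scientist")` yields `"B-PER"`, while `"O"` is returned unchanged.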

## Model Performance
Precision: 74.88 <br>
Recall: 75.62 <br>
F1: 75.25 <br>
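
As a quick consistency check, the reported F1 matches the harmonic mean of the reported precision and recall. The sketch below only reproduces that arithmetic; the card does not state the averaging scheme (micro or macro) or the evaluation split, so neither is assumed here.

```python
def f1_from(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same scale as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Scores reported above, on a 0-100 scale.
precision, recall = 74.88, 75.62
print(round(f1_from(precision, recall), 2))  # prints 75.25
```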

## Training Parameters
Epochs: 6 <br>
Optimizer: AdamW <br>
Learning Rate: 5e-5 <br>
Weight Decay: 0.01 <br>
Batch Size: 64 <br>
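
The card does not say which training framework was used. As a hedged sketch only, the values above could be expressed with the `transformers` `TrainingArguments` class as follows. Treating the batch size as per-device, selecting the PyTorch AdamW implementation, and the `output_dir` path are all assumptions, and every unlisted setting is left at the library default.

```python
# Hyperparameters listed above; everything else is left at library defaults.
HYPERPARAMS = {
    "num_train_epochs": 6,
    "learning_rate": 5e-5,
    "weight_decay": 0.01,
    # Assumption: the card does not say whether 64 is per-device or total.
    "per_device_train_batch_size": 64,
    # Assumption: PyTorch's AdamW implementation.
    "optim": "adamw_torch",
}


def make_training_args(output_dir: str = "classer-as-muril"):
    """Build TrainingArguments from HYPERPARAMS (output_dir is a hypothetical path)."""
    # Imported lazily; needs transformers and torch installed.
    from transformers import TrainingArguments

    return TrainingArguments(output_dir=output_dir, **HYPERPARAMS)
```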


## Citation

If you use this model, please cite the following papers:

```bibtex
@inproceedings{kaushik2025classer,
  title = {{CLASSER}: Cross-lingual Annotation Projection enhancement through Script Similarity for Fine-grained Named Entity Recognition},
  author = {Kaushik, Prachuryya and Anand, Ashish},
  booktitle = {Proceedings of the 14th International Joint Conference on Natural Language Processing and the 4th Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics},
  year = {2025},
  publisher = {Association for Computational Linguistics},
  note = {Main conference paper}
}

@inproceedings{kaushik2026sampurner,
  title = {SampurNER: Fine-grained Named Entity Recognition Dataset for 22 Indian Languages},
  author = {Kaushik, Prachuryya and Anand, Ashish},
  booktitle = {Proceedings of the AAAI Conference on Artificial Intelligence},
  volume = {40},
  year = {2026}
}

@inproceedings{fetahu2023multiconer,
  title = {MultiCoNER v2: a Large Multilingual dataset for Fine-grained and Noisy Named Entity Recognition},
  author = {Fetahu, Besnik and Chen, Zhiyu and Kar, Sudipta and Rokhlenko, Oleg and Malmasi, Shervin},
  booktitle = {Findings of the Association for Computational Linguistics: EMNLP 2023},
  pages = {2027--2051},
  year = {2023}
}
```