| --- |
| library_name: transformers |
| tags: [] |
| --- |
| |
# Model Card for Mountain NER
|
|
| <!-- Provide a quick summary of what the model is/does. --> |
|
|
| This project involves fine-tuning a BERT-based model (dslim/bert-large-NER) to perform Named Entity Recognition (NER) on mountain names in text. |
|
|
| The model has been trained to identify mentions of mountain names and differentiate them from other geographic entities or non-entities. |
|
|
Features:

- Fine-tuned on a custom dataset that includes sentences both with and without mountain names.
- Uses focal loss to handle class imbalance, so the model focuses on correctly classifying rare mountain-name tokens.
- Token-level classification with the B-MOUNTAIN, I-MOUNTAIN, and O (non-entity) labels.
- Balances training between sentences with mountains (80%) and without mountains (20%).
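The focal loss mentioned above can be sketched as follows. This is a minimal illustration, not the project's exact code (the function name and the `gamma=2.0` default are assumptions):

```python
import torch
import torch.nn.functional as F

def focal_loss(logits, labels, gamma=2.0, ignore_index=-100):
    """Focal loss for token classification: down-weights tokens the model
    already classifies confidently, so rare entity tokens (B-MOUNTAIN,
    I-MOUNTAIN) contribute more to the gradient than the abundant O class."""
    # Per-token cross-entropy; ignored positions are masked out below.
    ce = F.cross_entropy(logits, labels, reduction="none", ignore_index=ignore_index)
    pt = torch.exp(-ce)              # model probability of the true class
    loss = (1 - pt) ** gamma * ce    # focal modulation down-weights easy tokens
    mask = labels != ignore_index
    return loss[mask].mean()
```

With `gamma=0` this reduces to ordinary cross-entropy; larger `gamma` shifts weight toward hard, misclassified tokens.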
|
|
|
|
| ### Model Description |
|
|
| <!-- Provide a longer summary of what this model is. --> |
|
|
This is the model card of a 🤗 transformers model that has been pushed to the Hub.
|
|
| - **Developed by:** Oleksandr Kharytonov |
| - **Model type:** BERT |
- **Language(s) (NLP):** English
| - **License:** MIT |
- **Finetuned from model:** https://huggingface.co/dslim/bert-large-NER
| ### Model Sources |
|
|
| <!-- Provide the basic links for the model. --> |
|
|
| - **Repository:** https://github.com/Shah1st/mountain-ner |
|
|
| ### Direct Use |
|
|
| <!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. --> |
|
|
```python
from transformers import AutoTokenizer, AutoModelForTokenClassification

# "./saved_model" is the local directory containing the fine-tuned checkpoint
tokenizer = AutoTokenizer.from_pretrained('./saved_model')
model = AutoModelForTokenClassification.from_pretrained('./saved_model')
```
|
|
|
|
| ### Recommendations |
|
|
| <!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. --> |
|
|
| Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. |
|
|
| ## How to Get Started with the Model |
|
|
Use the GitHub repository below to get started with the model.
|
|
| https://github.com/Shah1st/mountain-ner |
|
|
| ## Training Details |
|
|
| ### Training Data |
|
|
| <!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. --> |
|
|
The training data is the "supervised" configuration of the DFKI-SLT/few-nerd dataset (https://huggingface.co/datasets/DFKI-SLT/few-nerd).

Sentences were filtered to those whose `fine_ner_tags` include the value 24 (mountains).
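The filtering step can be sketched as a predicate passed to `Dataset.filter` from 🤗 datasets (the tag id 24 comes from the note above; the function name is an assumption):

```python
MOUNTAIN_TAG = 24  # few-nerd fine-grained tag id for mountains (per the note above)

def has_mountain(example):
    """Keep sentences containing at least one mountain-tagged token."""
    return MOUNTAIN_TAG in example["fine_ner_tags"]

# With 🤗 datasets this would be applied as (requires a download, not run here):
# from datasets import load_dataset
# ds = load_dataset("DFKI-SLT/few-nerd", "supervised")
# mountain_sentences = ds["train"].filter(has_mountain)
```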
|
|
|
|
|
|
| ## Evaluation |
|
|
| <!-- This section describes the evaluation protocols and provides the results. --> |
|
|
- `eval_loss`: 0.0092
- `eval_macro_f1`: 0.8952
- `eval_accuracy`: 0.9746
|
|
| ### Testing Data, Factors & Metrics |
|
|
| #### Metrics |
|
|
| <!-- These are the evaluation metrics being used, ideally with a description of why. --> |
|
|
- Macro F1: 0.895
- Accuracy: 0.974
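Macro F1 averages the per-class F1 scores with equal weight, which matters here because the O class vastly outnumbers the mountain tags. A pure-Python sketch of the computation:

```python
def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of per-class F1 scores."""
    f1s = []
    for lab in labels:
        tp = sum(t == lab and p == lab for t, p in zip(y_true, y_pred))
        fp = sum(t != lab and p == lab for t, p in zip(y_true, y_pred))
        fn = sum(t == lab and p != lab for t, p in zip(y_true, y_pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * precision * recall / (precision + recall)
                   if precision + recall else 0.0)
    return sum(f1s) / len(f1s)
```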
|
|
|
|
| #### Summary |
|
|
| This project involves fine-tuning a BERT-based model (dslim/bert-large-NER) to perform Named Entity Recognition (NER) on mountain names in text. The model has been trained to identify mentions of mountain names and differentiate them from other geographic entities or non-entities. |
|
|
|
|