---
library_name: transformers
tags: []
---
# Model Card for mountain-ner-model
<!-- Provide a quick summary of what the model is/does. -->
This project involves fine-tuning a BERT-based model (dslim/bert-large-NER) to perform Named Entity Recognition (NER) on mountain names in text.
The model has been trained to identify mentions of mountain names and differentiate them from other geographic entities or non-entities.
Features:
- Fine-tuned on a custom dataset that includes sentences both with and without mountain names.
- Uses focal loss to handle class imbalance, so that training puts more weight on the rare mountain-name tokens than on the abundant non-entity tokens.
- Performs token-level classification with three labels: `B-MOUNTAIN`, `I-MOUNTAIN`, and `O` (non-entity).
- Balances the training set between sentences with mountains (80%) and sentences without mountains (20%).
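
To illustrate the focal loss idea mentioned above, here is a minimal per-token sketch in plain Python. It is not the training code of this project: the function names, the label order, and `gamma=2.0` are assumptions chosen for illustration. The loss is the usual cross-entropy term scaled by `(1 - p_t) ** gamma`, so tokens the model already classifies confidently contribute very little.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def focal_loss(logits, target, gamma=2.0):
    """Focal loss for a single token: -(1 - p_t) ** gamma * log(p_t).

    gamma=2.0 is an illustrative default, not a value taken from this project.
    With gamma=0 the expression reduces to ordinary cross-entropy.
    """
    p_t = softmax(logits)[target]
    return -((1.0 - p_t) ** gamma) * math.log(p_t)


# Label order assumed for illustration: 0 = O, 1 = B-MOUNTAIN, 2 = I-MOUNTAIN.
easy = focal_loss([4.0, 0.0, 0.0], target=0)  # confident, correct "O" token
hard = focal_loss([2.0, 0.5, 0.0], target=1)  # mountain token the model gets wrong
print(f"easy={easy:.5f} hard={hard:.5f}")
```

In an actual training run the same formula would typically be applied to batched logits with a tensor library and averaged over all non-padding tokens.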
### Model Description
<!-- Provide a longer summary of what this model is. -->
This is the model card of a 🤗 transformers model that has been pushed on the Hub.
- **Developed by:** Oleksandr Kharytonov
- **Model type:** BERT
- **Language(s) (NLP):** English
- **License:** MIT
- **Finetuned from model:** https://huggingface.co/dslim/bert-large-NER
### Model Sources
<!-- Provide the basic links for the model. -->
- **Repository:** https://github.com/Shah1st/mountain-ner
### Direct Use
<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
```python
from transformers import AutoTokenizer, AutoModelForTokenClassification

# './saved_model' is a local directory holding the fine-tuned tokenizer and weights.
tokenizer = AutoTokenizer.from_pretrained('./saved_model')
model = AutoModelForTokenClassification.from_pretrained('./saved_model')
```
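
Once the model has produced one label per word, the `B-MOUNTAIN` / `I-MOUNTAIN` / `O` tags still need to be grouped into mountain names. The sketch below shows one way to do that. It assumes word-level tokens that are already aligned with their labels (merging subword pieces is out of scope here), and `extract_mountains` is a hypothetical helper, not part of this repository.

```python
def extract_mountains(tokens, labels):
    """Group B-MOUNTAIN / I-MOUNTAIN tags into mountain-name strings.

    An I-MOUNTAIN tag with no open span before it is ignored.
    """
    spans, current = [], []
    for token, label in zip(tokens, labels):
        if label == "B-MOUNTAIN":
            # A new entity starts; close any span that is still open.
            if current:
                spans.append(" ".join(current))
            current = [token]
        elif label == "I-MOUNTAIN" and current:
            # Continue the span that is currently open.
            current.append(token)
        else:
            # "O" (or a stray I-MOUNTAIN) ends the open span, if any.
            if current:
                spans.append(" ".join(current))
            current = []
    if current:
        spans.append(" ".join(current))
    return spans


tokens = ["We", "climbed", "Mount", "Everest", "and", "K2", "."]
labels = ["O", "O", "B-MOUNTAIN", "I-MOUNTAIN", "O", "B-MOUNTAIN", "O"]
print(extract_mountains(tokens, labels))  # ['Mount Everest', 'K2']
```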
### Recommendations
<!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model.
## How to Get Started with the Model
See the GitHub repository to get started with the model:
https://github.com/Shah1st/mountain-ner
## Training Details
### Training Data
<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
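The training data comes from the `DFKI-SLT/few-nerd` dataset (`supervised` configuration).
Sentences with mountains are selected by filtering for sentences in which at least one token has `fine_ner_tags == 24` (mountains).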
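
The filter described above, together with one possible conversion to the `B-MOUNTAIN` / `I-MOUNTAIN` / `O` label scheme, can be sketched on plain dictionaries as follows. The helper names are hypothetical, and treating each run of consecutive mountain tags as a single entity is an assumption rather than the documented preprocessing of this project.

```python
MOUNTAIN_TAG = 24  # fine_ner_tags id for mountains, as stated above


def has_mountain(example):
    """Return True if any token in the sentence carries the mountain tag."""
    return MOUNTAIN_TAG in example["fine_ner_tags"]


def to_bio(fine_tags):
    """Map fine tag ids to O / B-MOUNTAIN / I-MOUNTAIN labels.

    Assumption: a run of consecutive mountain tags forms one entity,
    so the first token gets B-MOUNTAIN and the rest get I-MOUNTAIN.
    Every non-mountain tag, including other locations, becomes O.
    """
    labels, prev_is_mountain = [], False
    for tag in fine_tags:
        is_mountain = tag == MOUNTAIN_TAG
        if not is_mountain:
            labels.append("O")
        elif prev_is_mountain:
            labels.append("I-MOUNTAIN")
        else:
            labels.append("B-MOUNTAIN")
        prev_is_mountain = is_mountain
    return labels


# Toy example; tag 21 stands in for some non-mountain location tag.
example = {
    "tokens": ["Mont", "Blanc", "is", "visible", "from", "Geneva"],
    "fine_ner_tags": [24, 24, 0, 0, 0, 21],
}
print(has_mountain(example))
print(to_bio(example["fine_ner_tags"]))
```

With the 🤗 `datasets` library, a predicate like `has_mountain` could presumably be passed to `load_dataset("DFKI-SLT/few-nerd", "supervised").filter(...)` to select the mountain sentences.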
## Evaluation
<!-- This section describes the evaluation protocols and provides the results. -->
Reported evaluation results:
- `eval_loss`: 0.009154710918664932
- `eval_macro_f1`: 0.8952192988290304
- `eval_accuracy`: 0.9746226793108054
### Testing Data, Factors & Metrics
#### Metrics
<!-- These are the evaluation metrics being used, ideally with a description of why. -->
- Macro F1: 0.895
- Accuracy: 0.975
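
For readers unfamiliar with the two metrics, the sketch below computes them on a toy label sequence. It assumes token-level scoring over the three labels, which this card does not state explicitly, and it is not the evaluation code that produced the numbers above. Because most tokens are `O`, accuracy can stay high even when a mountain label is missed, while macro F1 averages the per-label F1 scores with equal weight and drops more sharply.

```python
def accuracy(y_true, y_pred):
    """Fraction of tokens whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of the per-label F1 scores."""
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the label never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


LABELS = ["O", "B-MOUNTAIN", "I-MOUNTAIN"]
y_true = ["O", "O", "B-MOUNTAIN", "I-MOUNTAIN", "O", "B-MOUNTAIN"]
y_pred = ["O", "O", "B-MOUNTAIN", "O", "O", "B-MOUNTAIN"]  # one I-MOUNTAIN missed

print(f"accuracy={accuracy(y_true, y_pred):.3f}")
print(f"macro_f1={macro_f1(y_true, y_pred, LABELS):.3f}")
```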