| --- |
| library_name: transformers |
| tags: [] |
| --- |
| |
# Model Card for Mountain NER
|
|
| <!-- Provide a quick summary of what the model is/does. --> |
|
|
| This project involves fine-tuning a BERT-based model (dslim/bert-large-NER) to perform Named Entity Recognition (NER) on mountain names in text. |
|
|
| The model has been trained to identify mentions of mountain names and differentiate them from other geographic entities or non-entities. |
|
|
Features:

- Fine-tuned on a custom dataset that includes sentences both with and without mountain names.
- Uses focal loss to handle class imbalance, so the model focuses on correctly classifying rare mountain-name tokens.
- Token-level classification with the B-MOUNTAIN, I-MOUNTAIN, and O (non-entity) labels.
- Balances training between sentences with mountains (80%) and without mountains (20%).
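The focal loss mentioned above can be sketched as follows. This is a minimal illustration, not the project's exact code (the function name and the `gamma=2.0` default are assumptions):

```python
import torch
import torch.nn.functional as F

def focal_loss(logits, labels, gamma=2.0, ignore_index=-100):
    """Focal loss for token classification: down-weights tokens the model
    already classifies confidently, so rare entity tokens (B-MOUNTAIN,
    I-MOUNTAIN) contribute more to the gradient than the abundant O class."""
    # Per-token cross-entropy; ignored positions are masked out below.
    ce = F.cross_entropy(logits, labels, reduction="none", ignore_index=ignore_index)
    pt = torch.exp(-ce)              # model probability of the true class
    loss = (1 - pt) ** gamma * ce    # focal modulation down-weights easy tokens
    mask = labels != ignore_index
    return loss[mask].mean()
```

With `gamma=0` this reduces to ordinary cross-entropy; larger `gamma` shifts weight toward hard, misclassified tokens.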
|
|
|
|
| ### Model Description |
|
|
| <!-- Provide a longer summary of what this model is. --> |
|
|
This is the model card of a 🤗 transformers model that has been pushed to the Hub.
|
|
| - **Developed by:** Oleksandr Kharytonov |
| - **Model type:** BERT |
- **Language(s) (NLP):** English
| - **License:** MIT |
- **Finetuned from model:** https://huggingface.co/dslim/bert-large-NER
| ### Model Sources |
|
|
| <!-- Provide the basic links for the model. --> |
|
|
| - **Repository:** https://github.com/Shah1st/mountain-ner |
|
|
| ### Direct Use |
|
|
| <!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. --> |
|
|
```python
from transformers import AutoTokenizer, AutoModelForTokenClassification

# "./saved_model" is the local directory containing the fine-tuned checkpoint
tokenizer = AutoTokenizer.from_pretrained('./saved_model')
model = AutoModelForTokenClassification.from_pretrained('./saved_model')
```
|
|
|
|
| ### Recommendations |
|
|
| <!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. --> |
|
|
| Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. |
|
|
| ## How to Get Started with the Model |
|
|
Use the GitHub repository below to get started with the model.
|
|
| https://github.com/Shah1st/mountain-ner |
|
|
| ## Training Details |
|
|
| ### Training Data |
|
|
| <!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. --> |
|
|
The training data is the "supervised" configuration of the DFKI-SLT/few-nerd dataset (https://huggingface.co/datasets/DFKI-SLT/few-nerd).

Sentences were filtered to those whose `fine_ner_tags` include the value 24 (mountains).
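The filtering step can be sketched as a predicate passed to `Dataset.filter` from 🤗 datasets (the tag id 24 comes from the note above; the function name is an assumption):

```python
MOUNTAIN_TAG = 24  # few-nerd fine-grained tag id for mountains (per the note above)

def has_mountain(example):
    """Keep sentences containing at least one mountain-tagged token."""
    return MOUNTAIN_TAG in example["fine_ner_tags"]

# With 🤗 datasets this would be applied as (requires a download, not run here):
# from datasets import load_dataset
# ds = load_dataset("DFKI-SLT/few-nerd", "supervised")
# mountain_sentences = ds["train"].filter(has_mountain)
```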
|
|
|
|
|
|
| ## Evaluation |
|
|
| <!-- This section describes the evaluation protocols and provides the results. --> |
|
|
- `eval_loss`: 0.0092
- `eval_macro_f1`: 0.8952
- `eval_accuracy`: 0.9746
|
|
| ### Testing Data, Factors & Metrics |
|
|
| #### Metrics |
|
|
| <!-- These are the evaluation metrics being used, ideally with a description of why. --> |
|
|
- Macro F1: 0.895
- Accuracy: 0.974
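Macro F1 averages the per-class F1 scores with equal weight, which matters here because the O class vastly outnumbers the mountain tags. A pure-Python sketch of the computation:

```python
def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of per-class F1 scores."""
    f1s = []
    for lab in labels:
        tp = sum(t == lab and p == lab for t, p in zip(y_true, y_pred))
        fp = sum(t != lab and p == lab for t, p in zip(y_true, y_pred))
        fn = sum(t == lab and p != lab for t, p in zip(y_true, y_pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * precision * recall / (precision + recall)
                   if precision + recall else 0.0)
    return sum(f1s) / len(f1s)
```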
|
|
|
|
| #### Summary |
|
|
| This project involves fine-tuning a BERT-based model (dslim/bert-large-NER) to perform Named Entity Recognition (NER) on mountain names in text. The model has been trained to identify mentions of mountain names and differentiate them from other geographic entities or non-entities. |
|
|
|
|