| | --- |
| | base_model: |
| | - dslim/bert-base-NER |
| | pipeline_tag: token-classification |
| | tags: |
| | - token-classification |
| | - pytorch |
| | - transformers |
| | - named-entity-recognition |
| | metrics: |
| | - seqeval |
| | --- |
| | |
| | # bert-base-mountain-NER |
| |
|
| | This model is a fine-tuned version of [dslim/bert-base-NER](https://huggingface.co/dslim/bert-base-NER), specialized for recognizing mountain names in geographical texts. It retains all 12 hidden layers of the base model and has been fine-tuned for high precision in identifying mountain-related entities across diverse texts. |
| |
|
| | It is ideal for applications that involve extracting geographic information from travel literature, research documents, or any content related to natural landscapes. |
| |
|
| | ## Dataset |
| |
|
| | The model was trained using approximately 115 samples generated specifically for mountain name recognition. These samples were created with the assistance of ChatGPT, focusing on realistic use cases for mountain-related content in the NER format. |
| |
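As a rough illustration of what "NER format" can mean here, the sketch below tags a whitespace-tokenized sentence with BIO labels. The label names `B-MOUNTAIN_NAME` and `I-MOUNTAIN_NAME` are taken from the example output in this card. The sample sentence, the `bio_tag` helper, and the whitespace tokenization are assumptions made for illustration and are not drawn from the actual training data.

```python
def bio_tag(tokens, mountain_spans):
    """Assign BIO labels to a list of tokens.

    mountain_spans is a list of (start, end) token index pairs, end exclusive.
    Hypothetical helper, not part of this model's training code.
    """
    labels = ["O"] * len(tokens)
    for start, end in mountain_spans:
        # The first token of a mountain name gets the B- tag.
        labels[start] = "B-MOUNTAIN_NAME"
        # Any remaining tokens of the same name get the I- tag.
        for i in range(start + 1, end):
            labels[i] = "I-MOUNTAIN_NAME"
    return labels


# Illustrative sentence only; not a real training sample.
tokens = "We climbed Mount Everest last spring".split()
labels = bio_tag(tokens, [(2, 4)])
for token, label in zip(tokens, labels):
    print(f"{token}\t{label}")
```

Tokens outside any mountain name keep the `O` label, which is the usual convention for token-classification datasets.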
|
| | ## How to Use |
| |
|
| | You can easily integrate this model with the Transformers library's NER pipeline: |
| |
|
| | ```python |
| | import torch |
| | from transformers import AutoTokenizer, AutoModelForTokenClassification |
| | from transformers import pipeline |
| | |
| | device = "cuda" if torch.cuda.is_available() else "cpu" |
| | |
| | # Load model and tokenizer |
| | model_name = "Lizrek/bert-base-mountain-NER" |
| | tokenizer = AutoTokenizer.from_pretrained(model_name) |
| | model = AutoModelForTokenClassification.from_pretrained(model_name) |
| | |
| | # Create a pipeline for NER on the selected device |
| | nlp = pipeline("ner", model=model, tokenizer=tokenizer, device=device) |
| | |
| | # Example usage |
| | example = "Mount Fuji in Japan are example of volcanic mountain.." |
| | ner_results = nlp(example) |
| | print(ner_results) |
| | ``` |
| |
|
| | ## Example Output |
| |
|
| | For the above input, the model provides the following output: |
| |
|
| | ```python |
| | [ |
| |     {'entity': 'B-MOUNTAIN_NAME', 'score': np.float32(0.9827131), 'index': 1, 'word': 'Mount', 'start': 0, 'end': 5}, |
| |     {'entity': 'I-MOUNTAIN_NAME', 'score': np.float32(0.98952174), 'index': 2, 'word': 'Fuji', 'start': 6, 'end': 10}, |
| | ] |
| | ``` |
| |
|
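| | Each entry describes one recognized token: `entity` is the predicted tag (`B-` marks the first token of a mountain name, `I-` a continuation), `score` is the model's confidence, `index` is the token position, and `start`/`end` are character offsets into the input string. |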
| |
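Because the pipeline returns one entry per token, a full name such as "Mount Fuji" arrives in two pieces. The sketch below shows one way to merge them using the `entity`, `index`, `start`, and `end` fields. The `merge_entities` helper is hypothetical, the scores are copied from the example output as plain floats, and taking the minimum token score as the span score is an assumption rather than anything the model card prescribes.

```python
def merge_entities(text, token_results):
    """Group adjacent B-/I- tagged tokens into full entity spans.

    Hypothetical helper; relies only on the fields shown in the example output.
    """
    spans = []
    for tok in token_results:
        prefix, _, label = tok["entity"].partition("-")
        prev = spans[-1] if spans else None
        # An I- token extends the previous span only if it carries the same
        # label and directly follows the previous token.
        if (prefix == "I" and prev is not None and prev["label"] == label
                and tok["index"] == prev["last_index"] + 1):
            prev["end"] = tok["end"]
            prev["last_index"] = tok["index"]
            prev["scores"].append(float(tok["score"]))
        else:
            spans.append({
                "label": label,
                "start": tok["start"],
                "end": tok["end"],
                "last_index": tok["index"],
                "scores": [float(tok["score"])],
            })
    # Recover the surface text from character offsets; use the weakest
    # token score as a conservative span score (an assumption).
    return [
        {"label": s["label"], "text": text[s["start"]:s["end"]], "score": min(s["scores"])}
        for s in spans
    ]


example = "Mount Fuji in Japan are example of volcanic mountain.."
token_results = [
    {"entity": "B-MOUNTAIN_NAME", "score": 0.9827131, "index": 1, "word": "Mount", "start": 0, "end": 5},
    {"entity": "I-MOUNTAIN_NAME", "score": 0.98952174, "index": 2, "word": "Fuji", "start": 6, "end": 10},
]
merged = merge_entities(example, token_results)
print(merged[0]["text"])  # Mount Fuji
```

The Transformers NER pipeline also accepts an `aggregation_strategy` argument (for example `"simple"`) that performs a similar grouping internally, which may be preferable to a hand-written merge in practice.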
|
| | ## Limitations |
| |
|
| | - The model is specialized for mountain names and may not recognize other types of geographical entities, such as rivers or lakes. |
| | - If the input text differs significantly from the training data in style or terminology, accuracy may degrade. |