---
license:
- mit
language:
- en
base_model:
- Nucha/Nucha_SkillNER_BERT
tags:
- Skills
- NER
- SkillNER
- BERT
widget:
- text: "Sample text used for testing"
pipeline_tag: token-classification
---
# Computing Skill NER
**Nucha_SkillNER_BERT** is a Named Entity Recognition (NER) model specifically fine-tuned to recognize skill-related entities from text, focusing on identifying both hard and soft skills. This model is built on top of a BERT-based architecture, allowing it to leverage contextual understanding for accurate extraction of skill-related information. It is particularly useful for analyzing job descriptions, resumes, or any text where skills are explicitly mentioned.
The model supports the recognition of multiple skill categories, including technical skills (e.g., programming languages, software tools) and soft skills (e.g., communication, leadership). It is ideal for applications in recruitment, talent management, or skill-based data analysis.
## How to Use
You can use the **Nucha/Nucha_SkillNER_BERT** model for Named Entity Recognition (NER) by loading it directly from Hugging Face's **transformers** library. Below is an example of how to use the model with the **pipeline** API for entity extraction.
### Step-by-Step Example:
```python
# Libraries
from transformers import AutoTokenizer, AutoModelForTokenClassification, pipeline
# Load the pre-trained model and tokenizer
model_name = "Nucha/Nucha_SkillNER_BERT"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForTokenClassification.from_pretrained(model_name)
# Create a NER pipeline
ner_pipeline = pipeline("ner", model=model, tokenizer=tokenizer, aggregation_strategy="simple")
# Sample text
text = "I have experience in Python, JavaScript, and cloud technologies like AWS and Azure."
# Run the pipeline on the text
ner_results = ner_pipeline(text)
# Display the results
for entity in ner_results:
    print(f"Entity: {entity['word']}, Label: {entity['entity_group']}, Score: {entity['score']:.4f}")
```
### Output Explanation:
- Entity: This is the word or phrase identified in the text that matches one of the model's recognized categories.
- Label: The classification label assigned to the entity, either **HSKILL** (hard skill) or **SSKILL** (soft skill).
- Score: The confidence score of the model for the identified entity, represented as a floating-point number.
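The fields described above are often post-processed before use, for example by dropping low-confidence entities and grouping the rest by label. The following is a minimal sketch of that idea, not part of the model itself: the helper name `group_skills`, the `min_score` threshold, and the sample data are all hypothetical and chosen for illustration.

```python
from collections import defaultdict


def group_skills(entities, min_score=0.5):
    """Group aggregated pipeline entities by label, dropping low-confidence ones.

    `entities` is expected to be a list of dicts with the keys
    `entity_group`, `word`, and `score`, as produced by a pipeline
    created with aggregation_strategy="simple".
    """
    grouped = defaultdict(list)
    for entity in entities:
        if entity["score"] >= min_score:
            grouped[entity["entity_group"]].append(entity["word"])
    return dict(grouped)


# Hypothetical results, trimmed to the fields used here.
# The words, labels, and scores are invented for illustration.
sample_results = [
    {"entity_group": "HSKILL", "word": "python", "score": 0.99},
    {"entity_group": "HSKILL", "word": "aws", "score": 0.97},
    {"entity_group": "SSKILL", "word": "leadership", "score": 0.42},
]

print(group_skills(sample_results, min_score=0.5))
```

In practice, `sample_results` would be replaced by the `ner_results` list from the example above, and the threshold would be tuned for the application.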
## Demo
The **Nucha/Nucha_SkillNER_BERT** model is designed for Named Entity Recognition (NER) specifically targeting skill-related entities in text. This demo allows users to input any text and see how well the model identifies different skills.
https://huggingface.co/spaces/Nucha/NuchaSkillNER
### How to Use:
- Input Text: Enter any text that contains information about skills or related topics. For example, you can input job descriptions, resumes, or any relevant text.
- Analyze: Click the "Analyze" button to run the model on the provided text. The model will process the input and extract named entities, specifically skills.
- Results: The output will display the recognized entities along with their labels and confidence scores. The labels indicate whether each skill is a hard skill (**HSKILL**) or a soft skill (**SSKILL**).
## Evaluation
The **Nucha/Nucha_SkillNER_BERT** model has undergone rigorous evaluation to ensure its effectiveness in Named Entity Recognition (NER) tasks, specifically in identifying and categorizing skills relevant to various domains. The evaluation was conducted on a diverse set of datasets designed to reflect real-world scenarios.
### Metrics
The model's performance was assessed using standard NER metrics:
- **Accuracy**: Measures the overall correctness of the model's predictions.
- **Precision**: Indicates the proportion of true positive results in the total predicted positives.
- **Recall**: Reflects the ability of the model to find all relevant instances in the dataset.
- **F1 Score**: The harmonic mean of precision and recall, providing a single score that balances both metrics.
```
              precision    recall  f1-score   support

      HSKILL       0.89      0.91      0.90      3708
      SSKILL       0.91      0.91      0.91      2299

   micro avg       0.90      0.91      0.90      6007
   macro avg       0.90      0.91      0.91      6007
weighted avg       0.90      0.91      0.90      6007

Accuracy: 0.9972517975663717 (Train:5083/Test:1017)
```
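The model card does not state how these scores were computed. As an illustration only, the sketch below shows one common way to compute entity-level precision, recall, and F1 from BIO tag sequences: a predicted entity counts as correct only if its label and boundaries match a gold entity exactly. The function names `extract_spans` and `prf` and the tag sequences are hypothetical.

```python
def extract_spans(tags):
    """Return a set of (label, start, end) spans from a BIO tag sequence.

    `end` is exclusive. An I- tag that does not continue a span of the
    same label is ignored (strict BIO interpretation).
    """
    spans = set()
    start, label = None, None
    for i, tag in enumerate(list(tags) + ["O"]):  # sentinel closes the last span
        inside = tag.startswith("I-") and label == tag[2:]
        if start is not None and not inside:
            spans.add((label, start, i))
            start, label = None, None
        if tag.startswith("B-"):
            start, label = i, tag[2:]
    return spans


def prf(gold_tags, pred_tags):
    """Entity-level precision, recall, and F1 with exact span matching."""
    gold, pred = extract_spans(gold_tags), extract_spans(pred_tags)
    tp = len(gold & pred)  # spans with identical label and boundaries
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Invented example: one of two gold entities is recovered exactly,
# and one of two predicted entities is spurious.
gold = ["B-HSKILL", "I-HSKILL", "O", "B-SSKILL", "O"]
pred = ["B-HSKILL", "I-HSKILL", "O", "O", "B-HSKILL"]

print(prf(gold, pred))
```

Under this scheme a partially matched entity counts as both a false positive and a false negative, which is why entity-level F1 is typically lower than token-level accuracy.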
#### Testing Data
The data was split into 5,083 training examples and 1,017 test examples. The counts below are given as test/train.
```
1017/5083
```
### Results
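You can employ this model using the Transformers library's *pipeline* for NER, or incorporate it as a conventional Transformer in the HuggingFace ecosystem. The sample below shows per-token output (no aggregation strategy), in which the phrase "machine learning" is tagged as one hard skill spanning two tokens.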
```json
[
  {
    "entity": "B-HSKILL",
    "score": 0.9990522,
    "index": 110,
    "word": "machine",
    "start": 581,
    "end": 588
  },
  {
    "entity": "I-HSKILL",
    "score": 0.9995209,
    "index": 111,
    "word": "learning",
    "start": 589,
    "end": 597
  },
  ...
]
```
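Per-token output like this usually needs to be merged into whole entities before it is useful. The sketch below shows one way to do that by joining each `B-` token with the `I-` tokens of the same label that immediately follow it, which is roughly what `aggregation_strategy="simple"` does inside the pipeline. The helper `merge_bio_tokens`, the sample sentence, the offsets, and the third token's score are hypothetical; only the first two scores are taken from the output above.

```python
def merge_bio_tokens(tokens, text):
    """Merge per-token B-/I- predictions into entity spans.

    Each token dict needs the keys `entity`, `score`, `index`, `start`,
    and `end`. The entity score is the mean of its token scores.
    """
    entities = []
    current = None
    for tok in tokens:
        prefix, _, label = tok["entity"].partition("-")
        continues = (
            prefix == "I"
            and current is not None
            and current["label"] == label
            and tok["index"] == current["last_index"] + 1  # adjacent tokens only
        )
        if continues:
            current["end"] = tok["end"]
            current["last_index"] = tok["index"]
            current["scores"].append(float(tok["score"]))
        else:
            # A B- tag (or a stray I- tag) starts a new entity.
            current = {
                "label": label,
                "start": tok["start"],
                "end": tok["end"],
                "last_index": tok["index"],
                "scores": [float(tok["score"])],
            }
            entities.append(current)
    return [
        {
            "label": e["label"],
            "text": text[e["start"]:e["end"]],
            "score": sum(e["scores"]) / len(e["scores"]),
        }
        for e in entities
    ]


# Hypothetical sentence and token offsets, invented for illustration.
text = "Experience with machine learning and teamwork."
tokens = [
    {"entity": "B-HSKILL", "score": 0.9990522, "index": 3, "word": "machine", "start": 16, "end": 23},
    {"entity": "I-HSKILL", "score": 0.9995209, "index": 4, "word": "learning", "start": 24, "end": 32},
    {"entity": "B-SSKILL", "score": 0.98, "index": 6, "word": "teamwork", "start": 37, "end": 45},
]

merged = merge_bio_tokens(tokens, text)
for entity in merged:
    print(entity)
```

Slicing the original text by character offsets, rather than joining the `word` fields, avoids having to undo subword tokenization.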
## Conclusion
The **Nucha/Nucha_SkillNER_BERT** model demonstrates strong performance in identifying skills in text data, making it a valuable tool for applications in recruitment, resume screening, and skill extraction tasks. Continuous improvements and further evaluations will enhance its accuracy and adaptability to specific use cases.