---
license: mit
datasets:
- HausaNLP/NaijaSenti-Twitter
language:
- ha
metrics:
- accuracy
- f1
- precision
- recall
base_model: google-bert/bert-base-cased
pipeline_tag: text-classification
library_name: transformers
tags:
- NLP
- sentiment-analysis
- hausa
---

**Model Name**: Hausa Sentiment Analysis

**Model ID**: `Kumshe/Hausa-sentiment-analysis`

**Language**: Hausa

---

### **Model Description**

This is a BERT-based model fine-tuned for sentiment analysis in Hausa. It classifies social media text into one of three sentiment categories: positive, negative, or neutral.

### **Intended Use**

- **Primary Use Case**: Sentiment analysis for Hausa social media content, such as tweets or Facebook posts.
- **Target Users**: NLP researchers, businesses analyzing social media, and developers building sentiment analysis tools for Hausa language content.
- **Example Usage**:

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load the model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("Kumshe/Hausa-sentiment-analysis")
model = AutoModelForSequenceClassification.from_pretrained("Kumshe/Hausa-sentiment-analysis")

# Encode the input text
inputs = tokenizer("Your Hausa text here", return_tensors="pt")

# Get model predictions (no gradients are needed for inference)
with torch.no_grad():
    outputs = model(**inputs)

# Pick the highest-scoring class; label names come from the model config
predicted_id = outputs.logits.argmax(dim=-1).item()
print(model.config.id2label[predicted_id])
```

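The `outputs` object returned by the model holds one raw score (logit) per sentiment class. The following is a minimal sketch, in plain Python, of how such logits can be turned into probabilities and a predicted label. The label order in `LABELS` and the example logits are assumptions for illustration only; the actual mapping is stored in the model's `config.id2label`.

```python
import math

# Hypothetical label order, for illustration only; check model.config.id2label.
LABELS = ["positive", "negative", "neutral"]

def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]  # subtract max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]

def predict_label(logits, labels=LABELS):
    """Return the most probable label and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]

# Made-up logits standing in for outputs.logits[0].tolist()
label, confidence = predict_label([2.0, 0.5, 0.1])
print(label, round(confidence, 3))
```

The probability can serve as a rough confidence score, for example to route low-confidence predictions to manual review.
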
### **Model Architecture**

- **Base Model**: BERT (Bidirectional Encoder Representations from Transformers)
- **Pre-trained Model**: `google-bert/bert-base-cased`, loaded with the Hugging Face Transformers library.
- **Fine-Tuned Model**: Fine-tuned for 40 epochs on a Hausa sentiment dataset.

### **Training Data**

- **Data Source**: The model was trained on a dataset containing 35,000 examples from social media platforms such as Twitter and Facebook.
- **Data Split**:
  - **Training Set**: 80% of the data
  - **Validation Set**: 20% of the data

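The card does not say how the split was produced. One common approach, sketched here under that assumption, is a shuffled 80/20 split with a fixed seed. With 35,000 examples this yields 28,000 training and 7,000 validation examples. The function name `split_indices` and the seed value are illustrative.

```python
import random

def split_indices(n_examples, train_fraction=0.8, seed=42):
    """Shuffle example indices and split them into train and validation parts."""
    indices = list(range(n_examples))
    random.Random(seed).shuffle(indices)  # fixed seed keeps the split reproducible
    cut = int(n_examples * train_fraction)
    return indices[:cut], indices[cut:]

train_idx, val_idx = split_indices(35_000)
print(len(train_idx), len(val_idx))  # 28000 7000
```
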
### **Training Details**

- **Number of Epochs**: 40
- **Batch Size**:
  - Per-device training batch size: 32
  - Per-device evaluation batch size: 64
- **Learning Rate Schedule**: 10 warm-up steps
- **Weight Decay**: 0.01
- **Optimizer**: AdamW
- **Training Hardware**: Trained on Kaggle using 2 NVIDIA T4 GPUs.

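These hyperparameters map naturally onto arguments of the `TrainingArguments` class in `transformers`. The sketch below collects them in a plain dictionary keyed by that class's parameter names and estimates the number of optimizer steps. It assumes data-parallel training across both GPUs, the 28,000-example training set implied by the 80/20 split of 35,000 examples, and no gradient accumulation; the card confirms none of these.

```python
import math

# Values from this card, keyed by transformers.TrainingArguments parameter names.
training_kwargs = {
    "num_train_epochs": 40,
    "per_device_train_batch_size": 32,
    "per_device_eval_batch_size": 64,
    "warmup_steps": 10,
    "weight_decay": 0.01,
}

# Assumptions (not stated in the card): both GPUs used in data parallel,
# 80% of 35,000 examples used for training, no gradient accumulation.
NUM_GPUS = 2
TRAIN_EXAMPLES = 28_000

effective_batch = training_kwargs["per_device_train_batch_size"] * NUM_GPUS
steps_per_epoch = math.ceil(TRAIN_EXAMPLES / effective_batch)
total_steps = steps_per_epoch * training_kwargs["num_train_epochs"]

print(effective_batch, steps_per_epoch, total_steps)  # 64 438 17520
```

Under these assumptions, the 10 warm-up steps cover well under 1% of training. The dictionary could be unpacked into `TrainingArguments` together with an `output_dir`.
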
### **Evaluation Metrics**

- **Evaluation Loss**: 0.6265
- **Accuracy**: 73.47%
- **F1 Score**: 73.47%
- **Precision**: 73.54%
- **Recall**: 73.47%

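The card does not state which averaging scheme produced these figures. Identical accuracy and recall values are consistent with support-weighted averaging, for which recall always equals accuracy in single-label classification, but this is an inference rather than something the card confirms. The sketch below computes accuracy and weighted precision, recall, and F1 in plain Python on a small made-up set of labels; the function name `weighted_metrics` and the toy data are illustrative.

```python
from collections import Counter

def weighted_metrics(y_true, y_pred):
    """Accuracy plus support-weighted precision, recall, and F1."""
    support = Counter(y_true)
    n = len(y_true)
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / n
    precision = recall = f1 = 0.0
    for label, count in support.items():
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        predicted = sum(p == label for p in y_pred)
        p_c = tp / predicted if predicted else 0.0
        r_c = tp / count
        f_c = 2 * p_c * r_c / (p_c + r_c) if (p_c + r_c) else 0.0
        weight = count / n  # weight each class by its share of the true labels
        precision += weight * p_c
        recall += weight * r_c
        f1 += weight * f_c
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Toy labels, not taken from the model's evaluation data
y_true = ["positive", "positive", "negative", "negative", "neutral", "neutral"]
y_pred = ["positive", "negative", "negative", "negative", "neutral", "positive"]
scores = weighted_metrics(y_true, y_pred)
print(scores)
```
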
### **Model Performance**

On the evaluation data, accuracy, precision, recall, and F1 score all fall between 73.4% and 73.6%, so performance is balanced across these metrics. This makes the model suitable for general sentiment analysis of Hausa text.

### **Limitations**

- The model may not generalize well to Hausa text outside of social media (e.g., formal writing or literature).
- Performance may degrade on text containing slang or regional dialects that are not well represented in the training data.
- The model reflects its training dataset, so biases in that data may affect predictions.

### **Ethical Considerations**

- Sentiment analysis models can amplify biases present in the training data.
- Use cautiously in sensitive applications to avoid unintended consequences.
- Consider privacy implications and applicable data protection laws, especially when analyzing social media content.

### **License**

- MIT (as declared in the model card metadata)

### **Citation**

If you use this model in your work, please cite it as follows:

```
@misc{Kumshe2024HausaSentimentAnalysis,
  author = {Umar Muhammad Mustapha Kumshe},
  title = {Hausa Sentiment Analysis},
  year = {2024},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/Kumshe/Hausa-sentiment-analysis}},
}
```

### **Contributions**

This model was fine-tuned by Umar Muhammad Mustapha Kumshe. Feel free to contribute, provide feedback, or raise issues on the [model repository](https://huggingface.co/Kumshe/Hausa-sentiment-analysis).