Igbo Sentiment Analysis Model (AfriBERTa)

Model Details

Model Description

  • Model type: Afro-centric BERT model for sequence classification
  • Base model: Davlan/naija-twitter-sentiment-afriberta-large
  • Task: Text Classification (3-class sentiment analysis)
  • Languages: Primarily Igbo with multilingual capabilities
  • Training data: Igbo Twitter dataset (3,682 samples)
  • Classes (verify against the training label map):
    • 0: [e.g., Positive]
    • 1: [e.g., Negative]
    • 2: [e.g., Neutral]

Model Sources

Uses

Direct Use

Sentiment analysis for Igbo text:

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained("path/to/igbo-sentiment-afriberta")
tokenizer = AutoTokenizer.from_pretrained("path/to/igbo-sentiment-afriberta")

text = "Your Igbo text here"
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)
prediction = torch.argmax(outputs.logits, dim=-1).item()
```
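To turn the model's logits into class probabilities rather than a bare argmax, apply a softmax over the class dimension. A minimal sketch with made-up logits (the values below are illustrative, not model output):

```python
import torch

# Hypothetical logits for one sentence over the 3 classes.
logits = torch.tensor([[0.2, 2.1, -0.5]])

# Softmax converts logits to probabilities that sum to 1.
probs = torch.softmax(logits, dim=-1)

# The predicted class is the index of the highest probability.
pred = torch.argmax(probs, dim=-1).item()
```

The probabilities are useful when you want a confidence threshold before acting on a prediction (e.g., in moderation pipelines).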

Downstream Tasks

Social media sentiment monitoring

Customer feedback analysis

Content moderation for Igbo platforms

Out-of-Scope Use

Low-resource language processing without validation

Legal or medical text analysis

Training Details

Preprocessing

Text cleaning: removal of URLs, @mentions, and hashtags

Emoji handling: removed using the `emoji` package

Tokenization: AfriBERTa tokenizer (max length 128)
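The text-cleaning step above can be sketched with standard-library regexes (the exact rules and function name are assumptions; emoji stripping in the actual pipeline used the `emoji` package):

```python
import re

def clean_tweet(text: str) -> str:
    """Remove URLs, @mentions, and hashtag markers, then collapse whitespace."""
    text = re.sub(r"https?://\S+|www\.\S+", "", text)  # URLs
    text = re.sub(r"@\w+", "", text)                   # mentions
    text = re.sub(r"#", "", text)                      # hashtag symbol
    return re.sub(r"\s+", " ", text).strip()
```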

Hyperparameters

| Parameter     | Value |
|---------------|-------|
| Learning rate | 2e-5  |
| Batch size    | 16    |
| Epochs        | 5     |
| Weight decay  | 0.01  |
| Warmup steps  | 500   |
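With the Hugging Face Trainer used for this run, these hyperparameters map onto `TrainingArguments` roughly as follows (a config sketch; `output_dir` is a placeholder):

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="igbo-sentiment-afriberta",  # placeholder path
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    num_train_epochs=5,
    weight_decay=0.01,
    warmup_steps=500,
)
```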

Training Configuration

Framework: Hugging Face Trainer

Hardware: Single GPU (Colab environment)

Metrics: Accuracy and Weighted F1
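The reported accuracy and weighted F1 can be computed with a `compute_metrics` callback passed to the Trainer. A sketch assuming scikit-learn (the function name and metric keys are illustrative):

```python
import numpy as np
from sklearn.metrics import accuracy_score, f1_score

def compute_metrics(eval_pred):
    """Compute accuracy and weighted F1 from (logits, labels)."""
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    return {
        "accuracy": accuracy_score(labels, preds),
        "weighted_f1": f1_score(labels, preds, average="weighted"),
    }
```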

Overall Metrics:

Accuracy: 0.80

Macro Avg F1: 0.80

Weighted Avg F1: 0.80

Limitations

Primarily optimized for Igbo Twitter data

Performance may vary with informal text or dialects

Class imbalance in training data 