Model Card for PuoBERTa Offensive Language Detection (Setswana)

Model Summary

A fine-tuned transformer model based on PuoBERTa for binary classification of Setswana text into:

  • Offensive (1)
  • Non-offensive (0)

The model is intended for digital forensic investigations and cybercrime analysis involving Setswana-language social media text.

Intended Use

  • Digital forensics
  • Cyberbullying detection
  • Hate speech detection
  • Research on low-resource African languages

Not intended for:

  • Fully automated decision-making
  • Legal action without human review

Training Data

  • 977 entries total
    • 477 Offensive
    • 500 Non-offensive
  • Sourced from public Facebook posts and comments
  • Annotated with semantic trigger tags (removed at test time)

Training Procedure

  • Model: roberta-base architecture (PuoBERTa variant)
  • Epochs: 10 (final model)
  • Batch size: 16 (train), 64 (eval)
  • Learning rate: 1e-5
  • Weight decay: 0.01
  • Loss: class-weighted cross-entropy [1.0, 2.0]
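The class-weighted cross-entropy above can be sketched in PyTorch. The weights [1.0, 2.0] up-weight the offensive class so that missed offensive examples are penalised more heavily; the logits and labels below are hypothetical, and the exact trainer wiring is an assumption:

```python
import torch
import torch.nn as nn

# Class weights from the model card: [non-offensive, offensive].
class_weights = torch.tensor([1.0, 2.0])
loss_fn = nn.CrossEntropyLoss(weight=class_weights)

# Hypothetical batch of logits (batch_size=2, num_labels=2) and gold labels.
logits = torch.tensor([[2.0, 0.5], [0.2, 1.5]])
labels = torch.tensor([0, 1])

# With class weights, the mean is taken over the summed weights,
# so the offensive example contributes twice as much to the loss.
loss = loss_fn(logits, labels)
print(f"Weighted CE loss: {loss.item():.4f}")
```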

Evaluation Results

The model was evaluated on a held-out 20% test set that was kept completely unseen during training and cross-validation.
The following metrics were computed:

Test Set Performance

  • Accuracy: 0.8673
  • Macro F1-score: 0.8662
  • Recall (Offensive = 1): 0.8444
  • Matthews Correlation Coefficient (MCC): 0.7326
  • ROC-AUC: 0.9288
  • Loss: 0.3381
  • Runtime: 0.5897 s
  • Samples per second: 332.398
  • Steps per second: 6.784
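Metrics of this kind can be reproduced with scikit-learn. The labels and scores below are hypothetical stand-ins for illustration only (the actual test-set labels are not published here):

```python
from sklearn.metrics import (accuracy_score, f1_score,
                             matthews_corrcoef, roc_auc_score)

# Hypothetical gold labels, hard predictions, and P(offensive) scores.
y_true  = [0, 0, 1, 1, 1, 0, 1, 0]
y_pred  = [0, 0, 1, 1, 0, 0, 1, 0]
y_score = [0.1, 0.2, 0.9, 0.8, 0.4, 0.3, 0.7, 0.2]

acc   = accuracy_score(y_true, y_pred)
f1    = f1_score(y_true, y_pred, average="macro")  # unweighted mean over both classes
mcc   = matthews_corrcoef(y_true, y_pred)
auc   = roc_auc_score(y_true, y_score)             # needs scores, not hard labels

print(f"Accuracy: {acc:.4f}  Macro F1: {f1:.4f}  MCC: {mcc:.4f}  ROC-AUC: {auc:.4f}")
```

Note that ROC-AUC is computed from the predicted probabilities, while accuracy, macro F1, and MCC use the thresholded class predictions.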

Interpretation

  • The ROC-AUC of 0.9288 indicates strong separation ability between offensive and non-offensive classes.
  • The MCC of 0.7326 shows robust classification performance even with slight class imbalance.
  • A recall of 0.8444 on the offensive class means the model is effective at capturing harmful or abusive language—important for forensic use cases.
  • The macro F1-score of 0.8662 confirms balanced performance across both classes.

Overall, the model demonstrates high discriminative power and reliable generalisation for Setswana offensive-language detection.

Limitations

  • Dataset is relatively small
  • Sensitive to spelling variations
  • May misclassify subtle forms of metaphorical or humorous abuse

Ethical Considerations

  • Contains sensitive language
  • Use in investigations must comply with local laws (Botswana Cybercrime Act, DPA)

How to Use

import torch
from transformers import RobertaTokenizer, RobertaForSequenceClassification

tokenizer = RobertaTokenizer.from_pretrained("mopatik/PuoBERTa-offensive-detection-v1")
model = RobertaForSequenceClassification.from_pretrained("mopatik/PuoBERTa-offensive-detection-v1")

# Ensure model is in evaluation mode
model.eval()

# Sample text (replace with your actual text)
#sample_text = "o seso tota"  # Example Setswana text
sample_text = "modimo a le segofatse"  # Example Setswana text

# Tokenize and prepare input
inputs = tokenizer(
    sample_text,
    padding='max_length',
    truncation=True,
    max_length=128,
    return_tensors="pt"
)

# Make prediction
with torch.no_grad():
    outputs = model(**inputs)
    probs = torch.softmax(outputs.logits, dim=1)
    predicted_class = torch.argmax(probs).item()

# Get class label and confidence
class_names = ["Non-offensive", "Offensive"]
confidence = probs[0][predicted_class].item()

print(f"Text: {sample_text}")
print(f"Predicted class: {class_names[predicted_class]} (confidence: {confidence:.2%})")
print(f"Class probabilities: {dict(zip(class_names, [f'{p:.2%}' for p in probs[0].tolist()]))}")
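For batch scoring, the same post-processing generalises: pass a list of texts to the tokenizer and take the argmax along the label dimension. A minimal sketch of the logits-to-labels step, using hypothetical logits in place of a real model call:

```python
import torch

class_names = ["Non-offensive", "Offensive"]

# Hypothetical batched logits (shape: [batch, 2]),
# standing in for model(**inputs).logits.
logits = torch.tensor([[1.8, -0.4],
                       [-0.9, 1.2]])

probs = torch.softmax(logits, dim=1)
predicted = torch.argmax(probs, dim=1)  # one label index per row

for i, idx in enumerate(predicted.tolist()):
    print(f"Text {i}: {class_names[idx]} ({probs[i, idx].item():.2%})")
```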
Model Tree

  • Base model: dsfsi/PuoBERTa (fine-tuned to produce mopatik/PuoBERTa-offensive-detection-v1)