Add comprehensive model card

README.md ADDED

---
language:
- en
- yo
- ha
- ig
- sw
- am
- pcm
license: apache-2.0
base_model: davlan/afro-xlmr-base
tags:
- text-classification
- human-ai-text-attribution
- hata
- african-languages
- multilingual
datasets:
- msmaje/phd-hata-african-dataset
metrics:
- accuracy
- f1
---

# AfroXLMR for Human-AI Text Attribution (HATA)

This model is a fine-tuned version of [davlan/afro-xlmr-base](https://huggingface.co/davlan/afro-xlmr-base) for **Human-AI Text Attribution** in African languages.

## Model Description

- **Model Type:** Text Classification (Binary)
- **Base Model:** AfroXLMR-base
- **Languages:** Yoruba, Hausa, Igbo, Swahili, Amharic, Nigerian Pidgin, English
- **Task:** Distinguishing between human-written and AI-generated text

## Performance

| Metric    | Score  |
|-----------|--------|
| Accuracy  | 1.0000 |
| F1 Score  | 1.0000 |
| Precision | 1.0000 |
| Recall    | 1.0000 |

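These figures can be reproduced with standard binary-classification metrics. Below is a minimal sketch using scikit-learn, where `y_true` and `y_pred` are hypothetical placeholders for gold labels and model predictions over an evaluation set (0 = human-written, 1 = AI-generated); it is illustrative, not the exact evaluation script behind the table above.

```python
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

# Hypothetical placeholders: gold labels and model predictions
# over an evaluation set (0 = human-written, 1 = AI-generated).
y_true = [0, 1, 1, 0, 1]
y_pred = [0, 1, 1, 0, 1]

accuracy = accuracy_score(y_true, y_pred)
precision, recall, f1, _ = precision_recall_fscore_support(
    y_true, y_pred, average="binary"
)
print(f"Accuracy: {accuracy:.4f}  Precision: {precision:.4f}  "
      f"Recall: {recall:.4f}  F1: {f1:.4f}")
```
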
## Usage

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model_name = "msmaje/phdhatamodel"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

text = "Your text here"
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=128)

# Run inference without tracking gradients.
with torch.no_grad():
    outputs = model(**inputs)
    predictions = torch.nn.functional.softmax(outputs.logits, dim=-1)
    predicted_class = torch.argmax(predictions, dim=-1).item()

# Map the class index to a human-readable label.
labels = {0: "Human-written", 1: "AI-generated"}
print(f"Prediction: {labels[predicted_class]}")
```
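
For quick experiments, the high-level `pipeline` API can wrap the same steps. A sketch, with the caveat that unless the model config defines an `id2label` mapping, the pipeline reports generic `LABEL_0`/`LABEL_1` names (corresponding to classes 0 and 1 above):

```python
from transformers import pipeline

classifier = pipeline("text-classification", model="msmaje/phdhatamodel")

result = classifier("Your text here")[0]
# Without an id2label mapping in the config, `label` is LABEL_0 or LABEL_1,
# corresponding to 0 = Human-written and 1 = AI-generated.
print(result["label"], f"{result['score']:.4f}")
```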

## Training Details

- **Dataset:** msmaje/phd-hata-african-dataset
- **Training samples:** 128,000
- **Validation samples:** 32,000
- **Epochs:** 3
- **Learning Rate:** 2e-5
- **Batch Size:** 16

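A comparable fine-tuning run can be set up with the standard `Trainer` loop and the hyperparameters above. This is a minimal sketch, assuming the dataset exposes `train`/`validation` splits with `text` and `label` columns (split and column names are assumptions, not confirmed by this card):

```python
from datasets import load_dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

# Assumed split and column names; adjust to the actual dataset schema.
dataset = load_dataset("msmaje/phd-hata-african-dataset")
tokenizer = AutoTokenizer.from_pretrained("davlan/afro-xlmr-base")

def tokenize(batch):
    # Fixed-length padding keeps the default data collator simple.
    return tokenizer(
        batch["text"], truncation=True, padding="max_length", max_length=128
    )

tokenized = dataset.map(tokenize, batched=True)

model = AutoModelForSequenceClassification.from_pretrained(
    "davlan/afro-xlmr-base", num_labels=2
)

# Hyperparameters taken from the Training Details list above.
args = TrainingArguments(
    output_dir="afroxlmr-hata",
    num_train_epochs=3,
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["validation"],
)
trainer.train()
```
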
## Citation

```bibtex
@misc{msmaje2025hata,
  author    = {Maje, M.S.},
  title     = {AfroXLMR for Human-AI Text Attribution},
  year      = {2025},
  publisher = {HuggingFace},
  url       = {https://huggingface.co/msmaje/phdhatamodel}
}
```