msmaje committed · Commit f0a0ecb (verified) · 1 Parent(s): b5d8ae3

Add comprehensive model card

Files changed (1)
  1. README.md +84 -0
README.md ADDED
@@ -0,0 +1,84 @@
+ ---
+ language:
+ - en
+ - yo
+ - ha
+ - ig
+ - sw
+ - am
+ - pcm
+ license: apache-2.0
+ base_model: davlan/afro-xlmr-base
+ tags:
+ - text-classification
+ - human-ai-text-attribution
+ - hata
+ - african-languages
+ - multilingual
+ datasets:
+ - msmaje/phd-hata-african-dataset
+ metrics:
+ - accuracy
+ - f1
+ ---
+
+ # AfroXLMR for Human-AI Text Attribution (HATA)
+
+ This model is a fine-tuned version of [davlan/afro-xlmr-base](https://huggingface.co/davlan/afro-xlmr-base) for **Human-AI Text Attribution** in African languages.
+
+ ## Model Description
+
+ - **Model Type:** Text Classification (Binary)
+ - **Base Model:** AfroXLMR-base
+ - **Languages:** Yoruba, Hausa, Igbo, Swahili, Amharic, Nigerian Pidgin, English
+ - **Task:** Distinguishing between human-written and AI-generated text
+
+ ## Performance
+
+ | Metric    | Score  |
+ |-----------|--------|
+ | Accuracy  | 1.0000 |
+ | F1 Score  | 1.0000 |
+ | Precision | 1.0000 |
+ | Recall    | 1.0000 |
+
+ ## Usage
+ ```python
+ from transformers import AutoTokenizer, AutoModelForSequenceClassification
+ import torch
+
+ model_name = "msmaje/phdhatamodel"
+ tokenizer = AutoTokenizer.from_pretrained(model_name)
+ model = AutoModelForSequenceClassification.from_pretrained(model_name)
+
+ text = "Your text here"
+ inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=128)
+
+ with torch.no_grad():
+     outputs = model(**inputs)
+     predictions = torch.nn.functional.softmax(outputs.logits, dim=-1)
+     predicted_class = torch.argmax(predictions, dim=-1).item()
+
+ labels = {0: "Human-written", 1: "AI-generated"}
+ print(f"Prediction: {labels[predicted_class]}")
+ ```
+
+ ## Training Details
+
+ - **Dataset:** msmaje/phd-hata-african-dataset
+ - **Training samples:** 128,000
+ - **Validation samples:** 32,000
+ - **Epochs:** 3
+ - **Learning Rate:** 2e-5
+ - **Batch Size:** 16
+
+ ## Citation
+ ```bibtex
+ @misc{msmaje2025hata,
+   author = {Maje, M.S.},
+   title = {AfroXLMR for Human-AI Text Attribution},
+   year = {2025},
+   publisher = {HuggingFace},
+   url = {https://huggingface.co/msmaje/phdhatamodel}
+ }
+ ```
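
The usage snippet in the README above prints only the predicted label. As a minimal sketch of the softmax and argmax step it performs, the pure-Python version below also returns the confidence for the winning class. The helper names `softmax` and `attribute` and the example logits are hypothetical and not part of the commit; the label mapping is copied from the README's snippet.

```python
import math

# Label mapping copied from the README's usage snippet.
LABELS = {0: "Human-written", 1: "AI-generated"}


def softmax(logits):
    """Convert raw logits to probabilities (numerically stable form)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def attribute(logits):
    """Return (label, confidence) for one pair of class logits."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return LABELS[best], probs[best]


# Hypothetical logits for illustration only, not real model output.
label, confidence = attribute([-1.2, 2.3])
print(f"Prediction: {label} ({confidence:.2%})")
```

With the real model, the list passed to `attribute` could come from `outputs.logits[0].tolist()` in the README's snippet, assuming a single input text.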