Adding `safetensors` variant of this model

#2
Files changed (1)
  1. README.md +12 -12
README.md CHANGED
@@ -2,7 +2,7 @@
  language:
  - ar
  license: apache-2.0
- base_model: thejosango/nuha-ajp-mlm
+ base_model: thejosango/nuha-jo-mlm
  tags:
  - bert
  - text-classification
@@ -12,20 +12,20 @@ tags:
  - binary-classification
  - pilot
  datasets:
- - thejosango/nuha-ajp-dataset
+ - thejosango/nuha-dataset
  metrics:
  - f1
  - precision
  - recall
  model-index:
- - name: nuha-ajp-binary
+ - name: nuha-jo-binary
    results:
    - task:
        type: text-classification
        name: Text Classification
      dataset:
        name: Jordanian NUHA Dataset
-       type: thejosango/nuha-ajp-dataset
+       type: thejosango/nuha-dataset
        config: binary
        split: validation
      metrics:
@@ -40,11 +40,11 @@ model-index:
        name: Recall
  ---

- # nuha-ajp-binary
+ # nuha-jo-binary

  ## Model Summary

- `nuha-ajp-binary` is a binary Arabic text classifier that detects hate speech in Jordanian social media comments. It fine-tunes [`nuha-ajp-mlm`](https://huggingface.co/thejosango/nuha-ajp-mlm) — a domain-adapted Arabic BERT — and outputs one of two labels:
+ `nuha-jo-binary` is a binary Arabic text classifier that detects hate speech in Jordanian social media comments. It fine-tunes [`nuha-jo-mlm`](https://huggingface.co/thejosango/nuha-jo-mlm) — a domain-adapted Arabic BERT — and outputs one of two labels:

  | Label | Meaning |
  |---|---|
@@ -53,7 +53,7 @@ model-index:

  This model was developed as part of a **pilot proof-of-concept** for the NUHA project by the [Jordan Open Source Association (JOSA)](https://josa.ngo). Performance metrics reflect the complexity of hate speech detection in colloquial Arabic and the exploratory nature of this initial effort.

- For a more granular three-class classifier, see [`nuha-ajp-trinary`](https://huggingface.co/thejosango/nuha-ajp-trinary).
+ For a more granular three-class classifier, see [`nuha-jo-trinary`](https://huggingface.co/thejosango/nuha-jo-trinary).

  ## Uses

@@ -66,8 +66,8 @@ from transformers import pipeline

  classifier = pipeline(
      "text-classification",
-     model="thejosango/nuha-ajp-binary",
-     tokenizer="thejosango/nuha-ajp-binary",
+     model="thejosango/nuha-jo-binary",
+     tokenizer="thejosango/nuha-jo-binary",
  )

  result = classifier("أنتِ امرأة رائعة")
@@ -101,7 +101,7 @@ for comment, result in zip(comments, results):

  ### Training Data

- Fine-tuned on the `binary` configuration of [`thejosango/nuha-ajp-dataset`](https://huggingface.co/datasets/thejosango/nuha-ajp-dataset), which maps:
+ Fine-tuned on the `binary` configuration of [`thejosango/nuha-dataset`](https://huggingface.co/datasets/thejosango/nuha-dataset), which maps:
  - **Not Online Violence** → `non-hate-speech`
  - **Offensive Language** → `hate-speech`
  - **Gender Based Violence** → `hate-speech`
@@ -122,7 +122,7 @@ At training and inference time, the following normalisation is applied to input

  | Parameter | Value |
  |---|---|
- | Base model | thejosango/nuha-ajp-mlm |
+ | Base model | thejosango/nuha-jo-mlm |
  | Hidden layers | 4 (reduced from base's 12) |
  | Classifier dropout | 0.50 |
  | Learning rate | 5e-5 |
@@ -137,7 +137,7 @@ At training and inference time, the following normalisation is applied to input

  ### Evaluation Results

- Evaluated on the validation split of `thejosango/nuha-ajp-dataset` (binary configuration):
+ Evaluated on the validation split of `thejosango/nuha-dataset` (binary configuration):

  | Metric | Value |
  |---|---|