Add model card for ces_Latn classifier
README.md
CHANGED
---
language:
- cs
```

## Training

The classifier was trained on 285120 pairs of web samples and their scores from 0 to 5, generated by Qwen3-235B-A22B-Instruct-2507. The samples were annotated for educational quality, with 0 being not educational and 5 being highly educational.

Below is the prompt used for Qwen3-235B-A22B-Instruct-2507 annotations:
```
- Conclude with the score using the format: "Educational score: <total points>"\
```
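Responses in this format can be reduced to a numeric label with a small parser. The sketch below is a hypothetical helper (`parse_score` is not from the card, which does not show the extraction code actually used):

```python
import re
from typing import Optional

def parse_score(response: str) -> Optional[int]:
    """Pull the 'Educational score: N' line out of an annotator response.
    Hypothetical helper; the card does not show the parser actually used
    to build the dataset."""
    match = re.search(r"Educational score:\s*(\d+)", response)
    return int(match.group(1)) if match else None

print(parse_score("The extract is basic but coherent.\nEducational score: 3"))  # 3
print(parse_score("no score emitted"))  # None
```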

We added a classification head with a single regression output to [jhu-clsp/mmBERT-base](https://huggingface.co/jhu-clsp/mmBERT-base), unfroze the last 4 layers, and trained the model for 5000 steps with a learning rate of 3e-4.
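The head-plus-partial-freezing setup can be sketched in plain PyTorch. `ToyEncoder` below is a schematic stand-in for the mmBERT-base encoder (in practice it would be loaded via `transformers`), and the sizes and class names are illustrative assumptions, not the card's actual training code:

```python
import torch
import torch.nn as nn

# Schematic stand-in for the mmBERT-base encoder; hidden size and layer
# count here are illustrative only.
class ToyEncoder(nn.Module):
    def __init__(self, hidden=64, num_layers=12):
        super().__init__()
        self.layers = nn.ModuleList(
            [nn.TransformerEncoderLayer(d_model=hidden, nhead=4, batch_first=True)
             for _ in range(num_layers)]
        )

    def forward(self, x):
        for layer in self.layers:
            x = layer(x)
        return x

class EduScoreRegressor(nn.Module):
    """Encoder plus a classification head with a single regression output."""
    def __init__(self, encoder, hidden=64, trainable_layers=4):
        super().__init__()
        self.encoder = encoder
        self.head = nn.Linear(hidden, 1)  # one output: the 0-5 score
        # Freeze the whole encoder, then unfreeze only the last blocks.
        for p in self.encoder.parameters():
            p.requires_grad = False
        for block in self.encoder.layers[-trainable_layers:]:
            for p in block.parameters():
                p.requires_grad = True

    def forward(self, x):
        h = self.encoder(x)        # (batch, seq, hidden)
        return self.head(h[:, 0])  # pool the first token -> (batch, 1)

model = EduScoreRegressor(ToyEncoder())
scores = model(torch.zeros(2, 7, 64))
print(scores.shape)  # torch.Size([2, 1])
```

Only the head and the last four blocks contribute trainable parameters, which keeps the fine-tuning cheap relative to full-model training.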

**Training Details:**

- Model: [jhu-clsp/mmBERT-base](https://huggingface.co/jhu-clsp/mmBERT-base) with a classification head
- Dataset: 285120 samples from Qwen3-235B-A22B-Instruct-2507 annotations
- Steps: 5000
- Learning Rate: 3e-4
- Class distribution: {0: 118800, 1: 118800, 2: 11880, 3: 11880, 4: 11880, 5: 11880}
- Evaluation Metric: F1 score
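A quick arithmetic check on the class distribution listed above: the per-class counts sum exactly to the 285120 training pairs, with five sixths of the data in classes 0 and 1:

```python
# Class distribution from the training details above.
dist = {0: 118800, 1: 118800, 2: 11880, 3: 11880, 4: 11880, 5: 11880}

total = sum(dist.values())
low_share = (dist[0] + dist[1]) / total  # weight of classes 0 and 1

print(total)                # 285120
print(round(low_share, 3))  # 0.833
```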

**Classification report**

We treat the regression model's predictions as discrete classes to calculate the metrics on a hold-out set of 13955 Qwen3-235B-A22B-Instruct-2507-annotated samples.
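The card does not spell out the exact discretization rule, but a natural choice is to round the continuous output and clip it to the 0-5 range. A minimal sketch, assuming that rule:

```python
def to_class(score: float, lo: int = 0, hi: int = 5) -> int:
    """Round a continuous regression output and clip it into [lo, hi].
    An assumed rule for illustration; the card does not specify the
    exact mapping used for its metrics."""
    return max(lo, min(hi, round(score)))

print(to_class(2.4))   # 2
print(to_class(5.7))   # 5 (clipped from 6)
print(to_class(-0.3))  # 0
```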

| class | precision | recall | f1-score | support |
|------:|----------:|-------:|---------:|--------:|
| 0 | 0.80 | 0.80 | 0.80 | 6818 |
| 1 | 0.76 | 0.77 | 0.77 | 6526 |
| 2 | 0.37 | 0.33 | 0.35 | 369 |
| 3 | 0.31 | 0.41 | 0.35 | 126 |
| 4 | 0.61 | 0.53 | 0.57 | 104 |
| 5 | 0.50 | 0.50 | 0.50 | 12 |

**Confusion matrix**

We verify that the predicted educational scores are indeed close to their ground truth and that the remaining errors are mostly driven by the noisy annotations.

| true \ predicted | 0 | 1 | 2 | 3 | 4 | 5 |
|---:|---:|---:|---:|---:|---:|---:|
| 0 | 5461 | 1355 | 2 | 0 | 0 | 0 |
| 1 | 1323 | 5017 | 155 | 29 | 2 | 0 |
| 2 | 0 | 184 | 120 | 57 | 8 | 0 |
| 3 | 0 | 16 | 38 | 52 | 20 | 0 |
| 4 | 0 | 3 | 9 | 31 | 55 | 6 |
| 5 | 0 | 0 | 0 | 1 | 5 | 6 |
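As a sanity check, the per-class precision and recall can be recomputed directly from the confusion matrix (rows are true classes, columns are predicted classes; the row sums reproduce the 13955 hold-out samples), and they match the classification report:

```python
# Confusion matrix from above: rows = true class, columns = predicted class.
cm = [
    [5461, 1355,   2,  0,  0, 0],
    [1323, 5017, 155, 29,  2, 0],
    [   0,  184, 120, 57,  8, 0],
    [   0,   16,  38, 52, 20, 0],
    [   0,    3,   9, 31, 55, 6],
    [   0,    0,   0,  1,  5, 6],
]

def precision(cm, c):
    predicted = sum(row[c] for row in cm)  # everything predicted as c
    return cm[c][c] / predicted if predicted else 0.0

def recall(cm, c):
    return cm[c][c] / sum(cm[c])           # correct / all truly c

for c in range(6):
    # Reproduces the report, e.g. class 0 -> 0.80 / 0.80, class 3 -> 0.31 / 0.41.
    print(c, round(precision(cm, c), 2), round(recall(cm, c), 2))
```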