Add model card for unknown classifier
README.md
CHANGED
|
@@ -1,3 +1,4 @@
|
|
|
|
|
| 1 |
---
|
| 2 |
language:
|
| 3 |
- un
|
|
@@ -82,7 +83,7 @@ print(max(scores))
|
|
| 82 |
```
|
| 83 |
|
| 84 |
## Training
|
| 85 |
-
The classifier was trained on
|
| 86 |
|
| 87 |
Below is the prompt used for Qwen3-235B-A22B-Instruct-2507 annotations:
|
| 88 |
```
|
|
@@ -117,29 +118,45 @@ After examining the extract:
|
|
| 117 |
- Conclude with the score using the format: "Educational score: <total points>"\
|
| 118 |
```
|
| 119 |
|
| 120 |
-
We added a classification head with a single regression output to
|
| 121 |
|
| 122 |
**Training Details:**
|
| 123 |
|
| 124 |
-
- Model:
|
| 125 |
-
- Dataset:
|
| 126 |
-
-
|
| 127 |
- Learning Rate: 3e-4
|
| 128 |
-
- class distribution:
|
| 129 |
- Evaluation Metric: F1 score
|
| 130 |
|
| 131 |
**Classification report**
|
| 132 |
|
| 133 |
-
We treat the regression model's predictions as discrete classes to calculate the metrics on a hold-out set of
|
| 134 |
```
|
| 135 |
-
| 136 |
```
|
| 137 |
|
| 138 |
**Confusion matrix**
|
| 139 |
|
| 140 |
We verify that the predicted educational scores are close to their ground truth and that the errors are mostly attributable to noisy annotations.
|
| 141 |
```
|
| 142 |
-
| 143 |
```
|
| 144 |
|
| 145 |
|
|
|
|
| 1 |
+
|
| 2 |
---
|
| 3 |
language:
|
| 4 |
- un
|
|
|
|
| 83 |
```
|
| 84 |
|
| 85 |
## Training
|
| 86 |
+
The classifier was trained on 49996 pairs of web samples and their scores from 0 to 5, generated by Qwen3-235B-A22B-Instruct-2507. The samples were annotated based on their educational quality with 0 being not educational and 5 being highly educational.
|
| 87 |
|
| 88 |
Below is the prompt used for Qwen3-235B-A22B-Instruct-2507 annotations:
|
| 89 |
```
|
|
|
|
| 118 |
- Conclude with the score using the format: "Educational score: <total points>"\
|
| 119 |
```
|
| 120 |
|
| 121 |
+
We added a classification head with a single regression output to [jhu-clsp/mmBERT-base](https://huggingface.co/jhu-clsp/mmBERT-base), unfroze the last 4 layers, and trained the model for 5000 steps with a learning rate of 3e-4.
|
| 122 |
|
| 123 |
**Training Details:**
|
| 124 |
|
| 125 |
+
- Model: [jhu-clsp/mmBERT-base](https://huggingface.co/jhu-clsp/mmBERT-base) with a classification head
|
| 126 |
+
- Dataset: 49996 samples from Qwen3-235B-A22B-Instruct-2507 annotations
|
| 127 |
+
- Steps: 5000
|
| 128 |
- Learning Rate: 3e-4
|
| 129 |
+
- Class distribution: {0: 20400, 1: 20400, 2: 3076, 3: 2040, 4: 2040, 5: 2040}
|
| 130 |
- Evaluation Metric: F1 score
|
| 131 |
|
| 132 |
**Classification report**
|
| 133 |
|
| 134 |
+
We treat the regression model's predictions as discrete classes to calculate the metrics on a hold-out set of 11783 Qwen3-235B-A22B-Instruct-2507-annotated samples.
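The discretization step is not spelled out in the card. A minimal sketch, assuming the continuous output is rounded to the nearest integer and clipped to the 0 to 5 annotation range (the helper name `to_class` and the sample values are hypothetical):

```python
def to_class(score, lowest=0, highest=5):
    """Map a continuous regression output to an integer class in [lowest, highest].

    Assumption: round to the nearest integer, then clip to the label range.
    The model card does not confirm that this is the exact rule used.
    """
    return max(lowest, min(highest, int(round(score))))


# Hypothetical raw regression outputs, including values outside the label range.
predictions = [-0.3, 0.4, 1.6, 2.4, 5.7]
classes = [to_class(p) for p in predictions]
print(classes)
```

Clipping matters because a regression head is unbounded, so outputs below 0 or above 5 would otherwise fall outside the six classes used in the report.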
|
| 135 |
```
+ Validation Report:
+ |   class |   precision |   recall |   f1-score |   support |
+ |--------:|------------:|---------:|-----------:|----------:|
+ |       0 |        0.78 |     0.9  |       0.83 |      7410 |
+ |       1 |        0.72 |     0.5  |       0.6  |      4137 |
+ |       2 |        0.21 |     0.37 |       0.26 |       123 |
+ |       3 |        0.25 |     0.36 |       0.3  |        58 |
+ |       4 |        0.66 |     0.62 |       0.64 |        53 |
+ |       5 |        0    |     0    |       0    |         2 |
|
| 145 |
```
|
| 146 |
|
| 147 |
**Confusion matrix**
|
| 148 |
|
| 149 |
We verify that the predicted educational scores are close to their ground truth and that the errors are mostly attributable to noisy annotations.
|
| 150 |
```
+ Confusion Matrix:
+ |    class |    0 |    1 |   2 |   3 |   4 |   5 |
+ |---------:|-----:|-----:|----:|----:|----:|----:|
+ |        0 | 6651 |  728 |  29 |   2 |   0 |   0 |
+ |        1 | 1892 | 2088 | 121 |  35 |   1 |   0 |
+ |        2 |    6 |   54 |  45 |  15 |   3 |   0 |
+ |        3 |    2 |   10 |  14 |  21 |  11 |   0 |
+ |        4 |    0 |    1 |   8 |  11 |  33 |   0 |
+ |        5 |    0 |    0 |   0 |   0 |   2 |   0 |
|
| 160 |
```
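The confusion matrix and the classification report can be cross-checked against each other. The sketch below assumes that rows are annotated scores and columns are predicted scores. The card does not label the axes, but the row sums match the `support` column of the report, which is consistent with that reading. The function names are hypothetical.

```python
# Confusion matrix copied from the table above.
# Assumption: rows are annotated (true) scores, columns are predicted scores.
CONFUSION = [
    [6651, 728, 29, 2, 0, 0],
    [1892, 2088, 121, 35, 1, 0],
    [6, 54, 45, 15, 3, 0],
    [2, 10, 14, 21, 11, 0],
    [0, 1, 8, 11, 33, 0],
    [0, 0, 0, 0, 2, 0],
]


def per_class_metrics(matrix):
    """Return {class: (precision, recall, f1, support)} for a square confusion matrix."""
    n = len(matrix)
    metrics = {}
    for k in range(n):
        true_positives = matrix[k][k]
        support = sum(matrix[k])                         # row sum: samples annotated as k
        predicted = sum(matrix[i][k] for i in range(n))  # column sum: samples predicted as k
        precision = true_positives / predicted if predicted else 0.0
        recall = true_positives / support if support else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics[k] = (precision, recall, f1, support)
    return metrics


def within_one_fraction(matrix):
    """Fraction of samples whose predicted class is at most one point from the annotation."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    close = sum(matrix[i][j] for i in range(n) for j in range(n) if abs(i - j) <= 1)
    return close / total


metrics = per_class_metrics(CONFUSION)
for label, (precision, recall, f1, support) in metrics.items():
    print(f"{label}: precision={precision:.2f} recall={recall:.2f} f1={f1:.2f} support={support}")
print(f"within one point: {within_one_fraction(CONFUSION):.3f}")
```

Under this reading the recomputed precision, recall, and F1 agree with the report to two decimals, and roughly 99% of the 11783 hold-out samples are predicted within one point of their annotation, which supports the claim that predictions stay close to the ground truth.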
|
| 161 |
|
| 162 |
|