Add model card for cmn_Hani classifier

---
language:
- zh

[...]

## Training

The classifier was trained on 387,601 pairs of web samples and their scores from 0 to 5, generated by Qwen3-235B-A22B-Instruct-2507. The samples were annotated for educational quality, with 0 being not educational and 5 being highly educational.

Below is the prompt used for Qwen3-235B-A22B-Instruct-2507 annotations:
```
[...]
- Conclude with the score using the format: "Educational score: <total points>"\
```

We added a classification head with a single regression output to [jhu-clsp/mmBERT-base](https://huggingface.co/jhu-clsp/mmBERT-base), unfroze the last 4 layers, and trained the model for 5,000 steps with a learning rate of 3e-4.
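As a hedged sketch of this setup (not the exact training script; a tiny BERT config stands in for mmBERT-base so the example runs without downloading weights):

```python
import torch
from transformers import BertConfig, BertForSequenceClassification

# Tiny stand-in config; the actual run loaded jhu-clsp/mmBERT-base instead.
config = BertConfig(hidden_size=32, num_hidden_layers=6, num_attention_heads=2,
                    intermediate_size=64, num_labels=1)  # num_labels=1 -> single regression output
model = BertForSequenceClassification(config)

# Freeze everything, then unfreeze the head (classifier + pooler) and the last 4 encoder layers.
for p in model.parameters():
    p.requires_grad = False
for p in model.classifier.parameters():
    p.requires_grad = True
for p in model.bert.pooler.parameters():
    p.requires_grad = True
for layer in model.bert.encoder.layer[-4:]:
    for p in layer.parameters():
        p.requires_grad = True

optimizer = torch.optim.AdamW((p for p in model.parameters() if p.requires_grad), lr=3e-4)

# With num_labels=1 and float labels, transformers applies an MSE regression loss.
input_ids = torch.randint(0, config.vocab_size, (2, 8))
scores = torch.tensor([[3.0], [0.0]])  # annotated educational scores
out = model(input_ids=input_ids, labels=scores)
out.loss.backward()
optimizer.step()
```

Freezing all but the last few layers keeps most of the pretrained representation fixed while the regression head adapts to the score distribution.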

**Training Details:**

- Model: [jhu-clsp/mmBERT-base](https://huggingface.co/jhu-clsp/mmBERT-base) with a classification head
- Dataset: 387,601 samples from Qwen3-235B-A22B-Instruct-2507 annotations
- Steps: 5,000
- Learning Rate: 3e-4
- Class Distribution: {0: 133122, 1: 175919, 2: 19640, 3: 19640, 4: 19640, 5: 19640}
- Evaluation Metric: F1 score

**Classification report**

We treat the regression model's predictions as discrete classes to calculate the metrics on a hold-out set of 13,020 Qwen3-235B-A22B-Instruct-2507-annotated samples.

| class | precision | recall | f1-score | support |
|------:|----------:|-------:|---------:|--------:|
| 0 | 0.74 | 0.86 | 0.80 | 5325 |
| 1 | 0.85 | 0.73 | 0.79 | 7037 |
| 2 | 0.37 | 0.43 | 0.39 | 386 |
| 3 | 0.28 | 0.39 | 0.33 | 132 |
| 4 | 0.59 | 0.45 | 0.51 | 121 |
| 5 | 0.62 | 0.42 | 0.50 | 19 |
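Concretely, the evaluation step might look like the following sketch; the rounding-and-clipping rule used to discretize the regression outputs is our assumption, and the scores below are illustrative values only:

```python
import numpy as np
from sklearn.metrics import classification_report, confusion_matrix

# Toy ground-truth scores and raw regression outputs (illustrative, not real data).
y_true = np.array([0, 1, 1, 2, 3, 4, 5])
y_raw = np.array([0.2, 0.8, 1.4, 2.6, 2.9, 4.4, 5.3])

# Assumed discretization: round to the nearest integer, clip into the 0-5 range.
y_pred = np.clip(np.round(y_raw), 0, 5).astype(int)

# zero_division=0 avoids warnings for classes that receive no predictions.
print(classification_report(y_true, y_pred, labels=list(range(6)), zero_division=0))
print(confusion_matrix(y_true, y_pred, labels=list(range(6))))
```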

**Confusion matrix**

We verify that the predicted educational scores are indeed close to their ground truth, with the remaining errors concentrated in adjacent classes and mostly attributable to the noisy annotations.

| class | 0 | 1 | 2 | 3 | 4 | 5 |
|------:|-----:|-----:|----:|----:|----:|----:|
| 0 | 4583 | 740 | 2 | 0 | 0 | 0 |
| 1 | 1620 | 5158 | 224 | 32 | 2 | 1 |
| 2 | 1 | 159 | 165 | 50 | 11 | 0 |
| 3 | 0 | 15 | 50 | 52 | 15 | 0 |
| 4 | 0 | 4 | 10 | 49 | 54 | 4 |
| 5 | 0 | 0 | 1 | 1 | 9 | 8 |