Add model card for ron_Latn classifier
README.md CHANGED
@@ -1,3 +1,4 @@
---
language:
- ro
@@ -82,7 +83,7 @@ print(max(scores))
```

## Training

The classifier was trained on 144,960 pairs of web samples and their scores from 0 to 5, generated by Qwen3-235B-A22B-Instruct-2507. The samples were annotated for their educational quality, with 0 being not educational and 5 being highly educational.

Below is the prompt used for the Qwen3-235B-A22B-Instruct-2507 annotations:
```
@@ -117,29 +118,45 @@ After examining the extract:
- Conclude with the score using the format: "Educational score: <total points>"\
```

We added a classification head with a single regression output to [jhu-clsp/mmBERT-base](https://huggingface.co/jhu-clsp/mmBERT-base), unfroze the last 4 layers, and trained the model for 5000 steps with a learning rate of 3e-4.

**Training Details:**

- Model: [jhu-clsp/mmBERT-base](https://huggingface.co/jhu-clsp/mmBERT-base) with a classification head
- Dataset: 144,960 samples from Qwen3-235B-A22B-Instruct-2507 annotations
- Steps: 5000
- Learning Rate: 3e-4
- Class distribution: {0: 60400, 1: 60400, 2: 6040, 3: 6040, 4: 6040, 5: 6040}
- Evaluation Metric: F1 score
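The freezing setup described above can be sketched in PyTorch. This is a minimal, illustrative stand-in: the real model is jhu-clsp/mmBERT-base loaded via `transformers`, while the layer count and hidden size below are toy values chosen so the sketch runs without downloading weights.

```python
import torch
from torch import nn

# Toy stand-in encoder (assumption: sizes are illustrative, not the real
# mmBERT-base architecture).
N_LAYERS, HIDDEN = 8, 64

encoder = nn.ModuleList(
    nn.TransformerEncoderLayer(d_model=HIDDEN, nhead=4, batch_first=True)
    for _ in range(N_LAYERS)
)
head = nn.Linear(HIDDEN, 1)  # classification head with a single regression output

# Freeze the whole encoder, then unfreeze only the last 4 layers;
# the regression head stays trainable.
for p in encoder.parameters():
    p.requires_grad = False
for layer in encoder[-4:]:
    for p in layer.parameters():
        p.requires_grad = True

trainable = [p for p in (*encoder.parameters(), *head.parameters()) if p.requires_grad]
optimizer = torch.optim.AdamW(trainable, lr=3e-4)  # learning rate from the details above
```

Only the unfrozen parameters are handed to the optimizer, so gradient updates never touch the frozen lower layers.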

**Classification report**

We treat the regression model's predictions as discrete classes to calculate the metrics on a hold-out set of 13,929 Qwen3-235B-A22B-Instruct-2507-annotated samples.
```
Validation Report:

| class | precision | recall | f1-score | support |
|------:|----------:|-------:|---------:|--------:|
|     0 |      0.90 |   0.74 |     0.81 |    8966 |
|     1 |      0.61 |   0.81 |     0.69 |    4662 |
|     2 |      0.28 |   0.39 |     0.32 |     183 |
|     3 |      0.34 |   0.43 |     0.38 |      58 |
|     4 |      0.60 |   0.61 |     0.61 |      54 |
|     5 |      0.20 |   0.17 |     0.18 |       6 |
```
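Treating a continuous regression output as discrete classes requires a binning rule. The card does not spell out the exact rule, so the sketch below assumes the simplest one, clip-and-round into the 0-5 range, together with a plain-Python confusion matrix:

```python
def to_class(score: float, lo: int = 0, hi: int = 5) -> int:
    # Assumption: predictions are clipped to [lo, hi] and rounded to the
    # nearest integer class.
    return min(hi, max(lo, round(score)))

def confusion_matrix(y_true, y_pred, n_classes=6):
    # m[t][p] counts samples with true class t predicted as class p.
    m = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        m[t][p] += 1
    return m

preds = [0.2, 1.7, 4.9, 5.6, -0.3, 2.4]   # raw regression outputs (made up)
labels = [0, 2, 5, 5, 0, 2]                # hypothetical annotations
classes = [to_class(s) for s in preds]
print(classes)  # [0, 2, 5, 5, 0, 2]
cm = confusion_matrix(labels, classes)
```

Note that out-of-range outputs (5.6, -0.3) are clamped to the extreme classes rather than dropped.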

**Confusion matrix**

We verify that the predicted educational scores are close to their ground truth and that the remaining errors are mostly due to the noisy annotations.
```
Confusion Matrix:

| class |    0 |    1 |   2 |  3 |  4 | 5 |
|------:|-----:|-----:|----:|---:|---:|--:|
|     0 | 6625 | 2339 |   2 |  0 |  0 | 0 |
|     1 |  732 | 3754 | 160 | 13 |  3 | 0 |
|     2 |    0 |   82 |  71 | 25 |  5 | 0 |
|     3 |    0 |    5 |  17 | 25 | 11 | 0 |
|     4 |    0 |    3 |   4 | 10 | 33 | 4 |
|     5 |    0 |    0 |   1 |  1 |  3 | 1 |
```
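The classification report can be cross-checked against this matrix: the row sums match the per-class support, so rows are true classes and columns are predicted classes, and precision/recall fall out of the columns and rows directly.

```python
# Confusion matrix from the card: rows = true class, columns = predicted class.
cm = [
    [6625, 2339,   2,  0,  0, 0],
    [ 732, 3754, 160, 13,  3, 0],
    [   0,   82,  71, 25,  5, 0],
    [   0,    5,  17, 25, 11, 0],
    [   0,    3,   4, 10, 33, 4],
    [   0,    0,   1,  1,  3, 1],
]

def precision(cm, k):
    # Fraction of class-k predictions (column k) that are correct.
    col = sum(row[k] for row in cm)
    return cm[k][k] / col if col else 0.0

def recall(cm, k):
    # Fraction of true class-k samples (row k) recovered.
    return cm[k][k] / sum(cm[k])

print(round(precision(cm, 0), 2), round(recall(cm, 0), 2))  # 0.9 0.74
```

Both values agree with the class-0 row of the validation report above.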