lgsilvaesilva committed
Commit
e6caab3
·
verified ·
1 Parent(s): 4c67f07

Upload folder using huggingface_hub

Files changed (50)
  1. .gitattributes +1 -0
  2. README.md +65 -65
  3. REPORT.md +65 -65
  4. baselines/embedding-lightgbm/embedding-lightgbm.joblib +2 -2
  5. baselines/embedding-lightgbm/test_predictions.csv +0 -0
  6. baselines/embedding-lightgbm/validation_predictions.csv +0 -0
  7. baselines/embedding-logistic/embedding-logistic.joblib +2 -2
  8. baselines/embedding-logistic/test_predictions.csv +0 -0
  9. baselines/embedding-logistic/validation_predictions.csv +0 -0
  10. baselines/embedding-svm/embedding-svm.joblib +1 -1
  11. baselines/embedding-svm/test_predictions.csv +0 -0
  12. baselines/embedding-svm/validation_predictions.csv +0 -0
  13. baselines/logistic/logistic_tfidf.joblib +2 -2
  14. baselines/logistic/test_predictions.csv +0 -0
  15. baselines/logistic/validation_predictions.csv +0 -0
  16. baselines/xgboost/test_predictions.csv +0 -0
  17. baselines/xgboost/validation_predictions.csv +0 -0
  18. baselines/xgboost/xgboost_tfidf.joblib +2 -2
  19. report.json +701 -701
  20. transformer/checkpoint-1220/config.json +1 -1
  21. transformer/checkpoint-1220/model.safetensors +1 -1
  22. transformer/checkpoint-1220/optimizer.pt +1 -1
  23. transformer/checkpoint-1220/rng_state.pth +1 -1
  24. transformer/checkpoint-1220/scaler.pt +1 -1
  25. transformer/checkpoint-1220/scheduler.pt +1 -1
  26. transformer/checkpoint-1220/trainer_state.json +218 -244
  27. transformer/checkpoint-1220/training_args.bin +2 -2
  28. transformer/checkpoint-1830/config.json +39 -0
  29. transformer/checkpoint-1830/model.safetensors +3 -0
  30. transformer/checkpoint-1830/optimizer.pt +3 -0
  31. transformer/checkpoint-1830/rng_state.pth +3 -0
  32. transformer/checkpoint-1830/scaler.pt +3 -0
  33. transformer/checkpoint-1830/scheduler.pt +3 -0
  34. transformer/checkpoint-1830/tokenizer.json +3 -0
  35. transformer/checkpoint-1830/tokenizer_config.json +15 -0
  36. transformer/checkpoint-1830/trainer_state.json +593 -0
  37. transformer/checkpoint-1830/training_args.bin +3 -0
  38. transformer/checkpoint-610/config.json +1 -1
  39. transformer/checkpoint-610/model.safetensors +1 -1
  40. transformer/checkpoint-610/optimizer.pt +1 -1
  41. transformer/checkpoint-610/rng_state.pth +1 -1
  42. transformer/checkpoint-610/scaler.pt +1 -1
  43. transformer/checkpoint-610/scheduler.pt +1 -1
  44. transformer/checkpoint-610/trainer_state.json +113 -126
  45. transformer/checkpoint-610/training_args.bin +2 -2
  46. transformer/config.json +6 -6
  47. transformer/model.safetensors +1 -1
  48. transformer/test_predictions.csv +0 -0
  49. transformer/training_args.bin +2 -2
  50. transformer/validation_predictions.csv +0 -0
.gitattributes CHANGED
@@ -39,3 +39,4 @@ transformer/checkpoint-305/tokenizer.json filter=lfs diff=lfs merge=lfs -text
  transformer/checkpoint-610/tokenizer.json filter=lfs diff=lfs merge=lfs -text
  transformer/checkpoint-915/tokenizer.json filter=lfs diff=lfs merge=lfs -text
  transformer/tokenizer.json filter=lfs diff=lfs merge=lfs -text
+ transformer/checkpoint-1830/tokenizer.json filter=lfs diff=lfs merge=lfs -text
README.md CHANGED
@@ -17,19 +17,19 @@ It includes the Transformer model, any configured TF-IDF or sentence-embedding b
 
  - Dataset: `faodl/amis-agri-utilization`
  - Dataset subset: ``
- - Dataset revision: `ada4a04088a98f8f64bc7485c57d4c7f422c2151`
+ - Dataset revision: `main`
  - Text column: `chunk_text`
  - Label column: `label`
  - Transformer: `FacebookAI/xlm-roberta-base`
- - Generated at: `2026-05-27T10:50:45.867038+00:00`
+ - Generated at: `2026-06-09T23:58:45.600559+00:00`
 
  ## Dataset Summary
 
  | Split | Rows | Label 0 | Label 1 | Unique groups | Mean text length |
  | --- | ---: | ---: | ---: | ---: | ---: |
- | train | 4877 | 4347 | 530 | 2513 | 696.6 |
- | validation | 978 | 899 | 79 | 538 | 690.6 |
- | test | 1016 | 904 | 112 | 539 | 690.7 |
+ | train | 9753 | 8950 | 803 | 4987 | 696.4 |
+ | validation | 2084 | 1885 | 199 | 1069 | 700.8 |
+ | test | 2086 | 1957 | 129 | 1069 | 701.6 |
 
  ## Threshold Comparison on Validation Split
 
@@ -37,35 +37,35 @@ Validation metrics document threshold selection and tuning behavior; test metric
 
  | Model | Threshold | Accuracy | Precision | Recall | F1 | ROC AUC | Average precision |
  | --- | ---: | ---: | ---: | ---: | ---: | ---: | ---: |
- | logistic_tfidf | 0.500 | 0.912 | 0.465 | 0.582 | 0.517 | 0.872 | 0.594 |
- | logistic_tfidf | 0.608 | 0.942 | 0.696 | 0.494 | 0.578 | 0.872 | 0.594 |
- | xgboost_tfidf | 0.500 | 0.945 | 0.931 | 0.342 | 0.500 | 0.823 | 0.588 |
- | xgboost_tfidf | 0.177 | 0.934 | 0.592 | 0.570 | 0.581 | 0.823 | 0.588 |
- | embedding-logistic_sentence_embeddings | 0.500 | 0.912 | 0.476 | 0.861 | 0.613 | 0.953 | 0.762 |
- | embedding-logistic_sentence_embeddings | 0.722 | 0.957 | 0.703 | 0.810 | 0.753 | 0.953 | 0.762 |
- | embedding-svm_sentence_embeddings | 0.500 | 0.955 | 0.807 | 0.582 | 0.676 | 0.952 | 0.754 |
- | embedding-svm_sentence_embeddings | 0.310 | 0.957 | 0.713 | 0.785 | 0.747 | 0.952 | 0.754 |
- | embedding-lightgbm_sentence_embeddings | 0.500 | 0.954 | 0.750 | 0.646 | 0.694 | 0.948 | 0.782 |
- | embedding-lightgbm_sentence_embeddings | 0.042 | 0.952 | 0.670 | 0.797 | 0.728 | 0.948 | 0.782 |
- | transformer | 0.500 | 0.970 | 0.798 | 0.848 | 0.822 | 0.966 | 0.854 |
- | transformer | 0.471 | 0.971 | 0.800 | 0.861 | 0.829 | 0.966 | 0.854 |
+ | logistic_tfidf | 0.500 | 0.901 | 0.482 | 0.462 | 0.472 | 0.867 | 0.496 |
+ | logistic_tfidf | 0.360 | 0.863 | 0.380 | 0.688 | 0.489 | 0.867 | 0.496 |
+ | xgboost_tfidf | 0.500 | 0.919 | 0.721 | 0.246 | 0.367 | 0.834 | 0.493 |
+ | xgboost_tfidf | 0.104 | 0.903 | 0.492 | 0.588 | 0.535 | 0.834 | 0.493 |
+ | embedding-logistic_sentence_embeddings | 0.500 | 0.895 | 0.474 | 0.869 | 0.613 | 0.952 | 0.652 |
+ | embedding-logistic_sentence_embeddings | 0.726 | 0.930 | 0.602 | 0.804 | 0.688 | 0.952 | 0.652 |
+ | embedding-svm_sentence_embeddings | 0.500 | 0.931 | 0.712 | 0.472 | 0.568 | 0.954 | 0.670 |
+ | embedding-svm_sentence_embeddings | 0.245 | 0.938 | 0.647 | 0.764 | 0.700 | 0.954 | 0.670 |
+ | embedding-lightgbm_sentence_embeddings | 0.500 | 0.937 | 0.681 | 0.633 | 0.656 | 0.954 | 0.669 |
+ | embedding-lightgbm_sentence_embeddings | 0.089 | 0.933 | 0.610 | 0.824 | 0.701 | 0.954 | 0.669 |
+ | transformer | 0.500 | 0.938 | 0.653 | 0.739 | 0.693 | 0.954 | 0.726 |
+ | transformer | 0.544 | 0.939 | 0.662 | 0.739 | 0.698 | 0.954 | 0.726 |
 
  ## Threshold Comparison on Test Split
 
  | Model | Threshold | Accuracy | Precision | Recall | F1 | ROC AUC | Average precision |
  | --- | ---: | ---: | ---: | ---: | ---: | ---: | ---: |
- | logistic_tfidf | 0.500 | 0.926 | 0.691 | 0.598 | 0.641 | 0.899 | 0.726 |
- | logistic_tfidf | 0.608 | 0.930 | 0.902 | 0.411 | 0.564 | 0.899 | 0.726 |
- | xgboost_tfidf | 0.500 | 0.924 | 1.000 | 0.312 | 0.476 | 0.892 | 0.692 |
- | xgboost_tfidf | 0.177 | 0.918 | 0.663 | 0.527 | 0.587 | 0.892 | 0.692 |
- | embedding-logistic_sentence_embeddings | 0.500 | 0.891 | 0.503 | 0.884 | 0.641 | 0.955 | 0.710 |
- | embedding-logistic_sentence_embeddings | 0.722 | 0.935 | 0.689 | 0.750 | 0.718 | 0.955 | 0.710 |
- | embedding-svm_sentence_embeddings | 0.500 | 0.930 | 0.741 | 0.562 | 0.640 | 0.956 | 0.704 |
- | embedding-svm_sentence_embeddings | 0.310 | 0.934 | 0.686 | 0.741 | 0.712 | 0.956 | 0.704 |
- | embedding-lightgbm_sentence_embeddings | 0.500 | 0.937 | 0.740 | 0.661 | 0.698 | 0.960 | 0.791 |
- | embedding-lightgbm_sentence_embeddings | 0.042 | 0.929 | 0.639 | 0.821 | 0.719 | 0.960 | 0.791 |
- | transformer | 0.500 | 0.951 | 0.777 | 0.777 | 0.777 | 0.968 | 0.817 |
- | transformer | 0.471 | 0.950 | 0.770 | 0.777 | 0.773 | 0.968 | 0.817 |
+ | logistic_tfidf | 0.500 | 0.918 | 0.358 | 0.419 | 0.386 | 0.856 | 0.398 |
+ | logistic_tfidf | 0.360 | 0.869 | 0.267 | 0.643 | 0.377 | 0.856 | 0.398 |
+ | xgboost_tfidf | 0.500 | 0.950 | 0.766 | 0.279 | 0.409 | 0.821 | 0.471 |
+ | xgboost_tfidf | 0.104 | 0.907 | 0.343 | 0.558 | 0.425 | 0.821 | 0.471 |
+ | embedding-logistic_sentence_embeddings | 0.500 | 0.891 | 0.350 | 0.884 | 0.501 | 0.951 | 0.543 |
+ | embedding-logistic_sentence_embeddings | 0.726 | 0.929 | 0.449 | 0.690 | 0.544 | 0.951 | 0.543 |
+ | embedding-svm_sentence_embeddings | 0.500 | 0.948 | 0.606 | 0.465 | 0.526 | 0.955 | 0.566 |
+ | embedding-svm_sentence_embeddings | 0.245 | 0.937 | 0.494 | 0.674 | 0.570 | 0.955 | 0.566 |
+ | embedding-lightgbm_sentence_embeddings | 0.500 | 0.948 | 0.579 | 0.597 | 0.588 | 0.948 | 0.585 |
+ | embedding-lightgbm_sentence_embeddings | 0.089 | 0.932 | 0.472 | 0.775 | 0.587 | 0.948 | 0.585 |
+ | transformer | 0.500 | 0.943 | 0.532 | 0.643 | 0.582 | 0.931 | 0.500 |
+ | transformer | 0.544 | 0.942 | 0.529 | 0.636 | 0.577 | 0.931 | 0.500 |
 
  ## Confusion Matrices on Test Split
 
@@ -75,95 +75,95 @@ Rows are true labels and columns are predicted labels.
 
  | True / Predicted | NOT_RELEVANT | RELEVANT |
  | --- | ---: | ---: |
- | NOT_RELEVANT | 874 | 30 |
- | RELEVANT | 45 | 67 |
+ | NOT_RELEVANT | 1860 | 97 |
+ | RELEVANT | 75 | 54 |
 
- ### logistic_tfidf at threshold 0.608
+ ### logistic_tfidf at threshold 0.360
 
  | True / Predicted | NOT_RELEVANT | RELEVANT |
  | --- | ---: | ---: |
- | NOT_RELEVANT | 899 | 5 |
- | RELEVANT | 66 | 46 |
+ | NOT_RELEVANT | 1729 | 228 |
+ | RELEVANT | 46 | 83 |
 
  ### xgboost_tfidf at threshold 0.500
 
  | True / Predicted | NOT_RELEVANT | RELEVANT |
  | --- | ---: | ---: |
- | NOT_RELEVANT | 904 | 0 |
- | RELEVANT | 77 | 35 |
+ | NOT_RELEVANT | 1946 | 11 |
+ | RELEVANT | 93 | 36 |
 
- ### xgboost_tfidf at threshold 0.177
+ ### xgboost_tfidf at threshold 0.104
 
  | True / Predicted | NOT_RELEVANT | RELEVANT |
  | --- | ---: | ---: |
- | NOT_RELEVANT | 874 | 30 |
- | RELEVANT | 53 | 59 |
+ | NOT_RELEVANT | 1819 | 138 |
+ | RELEVANT | 57 | 72 |
 
  ### embedding-logistic_sentence_embeddings at threshold 0.500
 
  | True / Predicted | NOT_RELEVANT | RELEVANT |
  | --- | ---: | ---: |
- | NOT_RELEVANT | 806 | 98 |
- | RELEVANT | 13 | 99 |
+ | NOT_RELEVANT | 1745 | 212 |
+ | RELEVANT | 15 | 114 |
 
- ### embedding-logistic_sentence_embeddings at threshold 0.722
+ ### embedding-logistic_sentence_embeddings at threshold 0.726
 
  | True / Predicted | NOT_RELEVANT | RELEVANT |
  | --- | ---: | ---: |
- | NOT_RELEVANT | 866 | 38 |
- | RELEVANT | 28 | 84 |
+ | NOT_RELEVANT | 1848 | 109 |
+ | RELEVANT | 40 | 89 |
 
  ### embedding-svm_sentence_embeddings at threshold 0.500
 
  | True / Predicted | NOT_RELEVANT | RELEVANT |
  | --- | ---: | ---: |
- | NOT_RELEVANT | 882 | 22 |
- | RELEVANT | 49 | 63 |
+ | NOT_RELEVANT | 1918 | 39 |
+ | RELEVANT | 69 | 60 |
 
- ### embedding-svm_sentence_embeddings at threshold 0.310
+ ### embedding-svm_sentence_embeddings at threshold 0.245
 
  | True / Predicted | NOT_RELEVANT | RELEVANT |
  | --- | ---: | ---: |
- | NOT_RELEVANT | 866 | 38 |
- | RELEVANT | 29 | 83 |
+ | NOT_RELEVANT | 1868 | 89 |
+ | RELEVANT | 42 | 87 |
 
  ### embedding-lightgbm_sentence_embeddings at threshold 0.500
 
  | True / Predicted | NOT_RELEVANT | RELEVANT |
  | --- | ---: | ---: |
- | NOT_RELEVANT | 878 | 26 |
- | RELEVANT | 38 | 74 |
+ | NOT_RELEVANT | 1901 | 56 |
+ | RELEVANT | 52 | 77 |
 
- ### embedding-lightgbm_sentence_embeddings at threshold 0.042
+ ### embedding-lightgbm_sentence_embeddings at threshold 0.089
 
  | True / Predicted | NOT_RELEVANT | RELEVANT |
  | --- | ---: | ---: |
- | NOT_RELEVANT | 852 | 52 |
- | RELEVANT | 20 | 92 |
+ | NOT_RELEVANT | 1845 | 112 |
+ | RELEVANT | 29 | 100 |
 
  ### transformer at threshold 0.500
 
  | True / Predicted | NOT_RELEVANT | RELEVANT |
  | --- | ---: | ---: |
- | NOT_RELEVANT | 879 | 25 |
- | RELEVANT | 25 | 87 |
+ | NOT_RELEVANT | 1884 | 73 |
+ | RELEVANT | 46 | 83 |
 
- ### transformer at threshold 0.471
+ ### transformer at threshold 0.544
 
  | True / Predicted | NOT_RELEVANT | RELEVANT |
  | --- | ---: | ---: |
- | NOT_RELEVANT | 878 | 26 |
- | RELEVANT | 25 | 87 |
+ | NOT_RELEVANT | 1884 | 73 |
+ | RELEVANT | 47 | 82 |
 
 
  ## Validation-Tuned Thresholds
 
- - `logistic_tfidf`: threshold `0.608` (validation F1 `0.578`); test F1 change vs 0.5: `-0.077`.
- - `xgboost_tfidf`: threshold `0.177` (validation F1 `0.581`); test F1 change vs 0.5: `+0.111`.
- - `embedding-logistic_sentence_embeddings`: threshold `0.722` (validation F1 `0.753`); test F1 change vs 0.5: `+0.077`.
- - `embedding-svm_sentence_embeddings`: threshold `0.310` (validation F1 `0.747`); test F1 change vs 0.5: `+0.073`.
- - `embedding-lightgbm_sentence_embeddings`: threshold `0.042` (validation F1 `0.728`); test F1 change vs 0.5: `+0.021`.
- - `transformer`: threshold `0.471` (validation F1 `0.829`); test F1 change vs 0.5: `-0.003`.
+ - `logistic_tfidf`: threshold `0.360` (validation F1 `0.489`); test F1 change vs 0.5: `-0.008`.
+ - `xgboost_tfidf`: threshold `0.104` (validation F1 `0.535`); test F1 change vs 0.5: `+0.016`.
+ - `embedding-logistic_sentence_embeddings`: threshold `0.726` (validation F1 `0.688`); test F1 change vs 0.5: `+0.043`.
+ - `embedding-svm_sentence_embeddings`: threshold `0.245` (validation F1 `0.700`); test F1 change vs 0.5: `+0.044`.
+ - `embedding-lightgbm_sentence_embeddings`: threshold `0.089` (validation F1 `0.701`); test F1 change vs 0.5: `-0.001`.
+ - `transformer`: threshold `0.544` (validation F1 `0.698`); test F1 change vs 0.5: `-0.005`.
 
  ## Artifacts
 
REPORT.md CHANGED
@@ -2,19 +2,19 @@
 
  - Dataset: `faodl/amis-agri-utilization`
  - Dataset subset: ``
- - Dataset revision: `ada4a04088a98f8f64bc7485c57d4c7f422c2151`
+ - Dataset revision: `main`
  - Text column: `chunk_text`
  - Label column: `label`
  - Transformer: `FacebookAI/xlm-roberta-base`
- - Generated at: `2026-05-27T10:50:45.867038+00:00`
+ - Generated at: `2026-06-09T23:58:45.600559+00:00`
 
  ## Dataset Summary
 
  | Split | Rows | Label 0 | Label 1 | Unique groups | Mean text length |
  | --- | ---: | ---: | ---: | ---: | ---: |
- | train | 4877 | 4347 | 530 | 2513 | 696.6 |
- | validation | 978 | 899 | 79 | 538 | 690.6 |
- | test | 1016 | 904 | 112 | 539 | 690.7 |
+ | train | 9753 | 8950 | 803 | 4987 | 696.4 |
+ | validation | 2084 | 1885 | 199 | 1069 | 700.8 |
+ | test | 2086 | 1957 | 129 | 1069 | 701.6 |
 
  ## Threshold Comparison on Validation Split
 
@@ -22,35 +22,35 @@ Validation metrics document threshold selection and tuning behavior; test metric
 
  | Model | Threshold | Accuracy | Precision | Recall | F1 | ROC AUC | Average precision |
  | --- | ---: | ---: | ---: | ---: | ---: | ---: | ---: |
- | logistic_tfidf | 0.500 | 0.912 | 0.465 | 0.582 | 0.517 | 0.872 | 0.594 |
- | logistic_tfidf | 0.608 | 0.942 | 0.696 | 0.494 | 0.578 | 0.872 | 0.594 |
- | xgboost_tfidf | 0.500 | 0.945 | 0.931 | 0.342 | 0.500 | 0.823 | 0.588 |
- | xgboost_tfidf | 0.177 | 0.934 | 0.592 | 0.570 | 0.581 | 0.823 | 0.588 |
- | embedding-logistic_sentence_embeddings | 0.500 | 0.912 | 0.476 | 0.861 | 0.613 | 0.953 | 0.762 |
- | embedding-logistic_sentence_embeddings | 0.722 | 0.957 | 0.703 | 0.810 | 0.753 | 0.953 | 0.762 |
- | embedding-svm_sentence_embeddings | 0.500 | 0.955 | 0.807 | 0.582 | 0.676 | 0.952 | 0.754 |
- | embedding-svm_sentence_embeddings | 0.310 | 0.957 | 0.713 | 0.785 | 0.747 | 0.952 | 0.754 |
- | embedding-lightgbm_sentence_embeddings | 0.500 | 0.954 | 0.750 | 0.646 | 0.694 | 0.948 | 0.782 |
- | embedding-lightgbm_sentence_embeddings | 0.042 | 0.952 | 0.670 | 0.797 | 0.728 | 0.948 | 0.782 |
- | transformer | 0.500 | 0.970 | 0.798 | 0.848 | 0.822 | 0.966 | 0.854 |
- | transformer | 0.471 | 0.971 | 0.800 | 0.861 | 0.829 | 0.966 | 0.854 |
+ | logistic_tfidf | 0.500 | 0.901 | 0.482 | 0.462 | 0.472 | 0.867 | 0.496 |
+ | logistic_tfidf | 0.360 | 0.863 | 0.380 | 0.688 | 0.489 | 0.867 | 0.496 |
+ | xgboost_tfidf | 0.500 | 0.919 | 0.721 | 0.246 | 0.367 | 0.834 | 0.493 |
+ | xgboost_tfidf | 0.104 | 0.903 | 0.492 | 0.588 | 0.535 | 0.834 | 0.493 |
+ | embedding-logistic_sentence_embeddings | 0.500 | 0.895 | 0.474 | 0.869 | 0.613 | 0.952 | 0.652 |
+ | embedding-logistic_sentence_embeddings | 0.726 | 0.930 | 0.602 | 0.804 | 0.688 | 0.952 | 0.652 |
+ | embedding-svm_sentence_embeddings | 0.500 | 0.931 | 0.712 | 0.472 | 0.568 | 0.954 | 0.670 |
+ | embedding-svm_sentence_embeddings | 0.245 | 0.938 | 0.647 | 0.764 | 0.700 | 0.954 | 0.670 |
+ | embedding-lightgbm_sentence_embeddings | 0.500 | 0.937 | 0.681 | 0.633 | 0.656 | 0.954 | 0.669 |
+ | embedding-lightgbm_sentence_embeddings | 0.089 | 0.933 | 0.610 | 0.824 | 0.701 | 0.954 | 0.669 |
+ | transformer | 0.500 | 0.938 | 0.653 | 0.739 | 0.693 | 0.954 | 0.726 |
+ | transformer | 0.544 | 0.939 | 0.662 | 0.739 | 0.698 | 0.954 | 0.726 |
 
  ## Threshold Comparison on Test Split
 
  | Model | Threshold | Accuracy | Precision | Recall | F1 | ROC AUC | Average precision |
  | --- | ---: | ---: | ---: | ---: | ---: | ---: | ---: |
- | logistic_tfidf | 0.500 | 0.926 | 0.691 | 0.598 | 0.641 | 0.899 | 0.726 |
- | logistic_tfidf | 0.608 | 0.930 | 0.902 | 0.411 | 0.564 | 0.899 | 0.726 |
- | xgboost_tfidf | 0.500 | 0.924 | 1.000 | 0.312 | 0.476 | 0.892 | 0.692 |
- | xgboost_tfidf | 0.177 | 0.918 | 0.663 | 0.527 | 0.587 | 0.892 | 0.692 |
- | embedding-logistic_sentence_embeddings | 0.500 | 0.891 | 0.503 | 0.884 | 0.641 | 0.955 | 0.710 |
- | embedding-logistic_sentence_embeddings | 0.722 | 0.935 | 0.689 | 0.750 | 0.718 | 0.955 | 0.710 |
- | embedding-svm_sentence_embeddings | 0.500 | 0.930 | 0.741 | 0.562 | 0.640 | 0.956 | 0.704 |
- | embedding-svm_sentence_embeddings | 0.310 | 0.934 | 0.686 | 0.741 | 0.712 | 0.956 | 0.704 |
- | embedding-lightgbm_sentence_embeddings | 0.500 | 0.937 | 0.740 | 0.661 | 0.698 | 0.960 | 0.791 |
- | embedding-lightgbm_sentence_embeddings | 0.042 | 0.929 | 0.639 | 0.821 | 0.719 | 0.960 | 0.791 |
- | transformer | 0.500 | 0.951 | 0.777 | 0.777 | 0.777 | 0.968 | 0.817 |
- | transformer | 0.471 | 0.950 | 0.770 | 0.777 | 0.773 | 0.968 | 0.817 |
+ | logistic_tfidf | 0.500 | 0.918 | 0.358 | 0.419 | 0.386 | 0.856 | 0.398 |
+ | logistic_tfidf | 0.360 | 0.869 | 0.267 | 0.643 | 0.377 | 0.856 | 0.398 |
+ | xgboost_tfidf | 0.500 | 0.950 | 0.766 | 0.279 | 0.409 | 0.821 | 0.471 |
+ | xgboost_tfidf | 0.104 | 0.907 | 0.343 | 0.558 | 0.425 | 0.821 | 0.471 |
+ | embedding-logistic_sentence_embeddings | 0.500 | 0.891 | 0.350 | 0.884 | 0.501 | 0.951 | 0.543 |
+ | embedding-logistic_sentence_embeddings | 0.726 | 0.929 | 0.449 | 0.690 | 0.544 | 0.951 | 0.543 |
+ | embedding-svm_sentence_embeddings | 0.500 | 0.948 | 0.606 | 0.465 | 0.526 | 0.955 | 0.566 |
+ | embedding-svm_sentence_embeddings | 0.245 | 0.937 | 0.494 | 0.674 | 0.570 | 0.955 | 0.566 |
+ | embedding-lightgbm_sentence_embeddings | 0.500 | 0.948 | 0.579 | 0.597 | 0.588 | 0.948 | 0.585 |
+ | embedding-lightgbm_sentence_embeddings | 0.089 | 0.932 | 0.472 | 0.775 | 0.587 | 0.948 | 0.585 |
+ | transformer | 0.500 | 0.943 | 0.532 | 0.643 | 0.582 | 0.931 | 0.500 |
+ | transformer | 0.544 | 0.942 | 0.529 | 0.636 | 0.577 | 0.931 | 0.500 |
 
  ## Confusion Matrices on Test Split
 
@@ -60,95 +60,95 @@ Rows are true labels and columns are predicted labels.
 
  | True / Predicted | NOT_RELEVANT | RELEVANT |
  | --- | ---: | ---: |
- | NOT_RELEVANT | 874 | 30 |
- | RELEVANT | 45 | 67 |
+ | NOT_RELEVANT | 1860 | 97 |
+ | RELEVANT | 75 | 54 |
 
- ### logistic_tfidf at threshold 0.608
+ ### logistic_tfidf at threshold 0.360
 
  | True / Predicted | NOT_RELEVANT | RELEVANT |
  | --- | ---: | ---: |
- | NOT_RELEVANT | 899 | 5 |
- | RELEVANT | 66 | 46 |
+ | NOT_RELEVANT | 1729 | 228 |
+ | RELEVANT | 46 | 83 |
 
  ### xgboost_tfidf at threshold 0.500
 
  | True / Predicted | NOT_RELEVANT | RELEVANT |
  | --- | ---: | ---: |
- | NOT_RELEVANT | 904 | 0 |
- | RELEVANT | 77 | 35 |
+ | NOT_RELEVANT | 1946 | 11 |
+ | RELEVANT | 93 | 36 |
 
- ### xgboost_tfidf at threshold 0.177
+ ### xgboost_tfidf at threshold 0.104
 
  | True / Predicted | NOT_RELEVANT | RELEVANT |
  | --- | ---: | ---: |
- | NOT_RELEVANT | 874 | 30 |
- | RELEVANT | 53 | 59 |
+ | NOT_RELEVANT | 1819 | 138 |
+ | RELEVANT | 57 | 72 |
 
  ### embedding-logistic_sentence_embeddings at threshold 0.500
 
  | True / Predicted | NOT_RELEVANT | RELEVANT |
  | --- | ---: | ---: |
- | NOT_RELEVANT | 806 | 98 |
- | RELEVANT | 13 | 99 |
+ | NOT_RELEVANT | 1745 | 212 |
+ | RELEVANT | 15 | 114 |
 
- ### embedding-logistic_sentence_embeddings at threshold 0.722
+ ### embedding-logistic_sentence_embeddings at threshold 0.726
 
  | True / Predicted | NOT_RELEVANT | RELEVANT |
  | --- | ---: | ---: |
- | NOT_RELEVANT | 866 | 38 |
- | RELEVANT | 28 | 84 |
+ | NOT_RELEVANT | 1848 | 109 |
+ | RELEVANT | 40 | 89 |
 
  ### embedding-svm_sentence_embeddings at threshold 0.500
 
  | True / Predicted | NOT_RELEVANT | RELEVANT |
  | --- | ---: | ---: |
- | NOT_RELEVANT | 882 | 22 |
- | RELEVANT | 49 | 63 |
+ | NOT_RELEVANT | 1918 | 39 |
+ | RELEVANT | 69 | 60 |
 
- ### embedding-svm_sentence_embeddings at threshold 0.310
+ ### embedding-svm_sentence_embeddings at threshold 0.245
 
  | True / Predicted | NOT_RELEVANT | RELEVANT |
  | --- | ---: | ---: |
- | NOT_RELEVANT | 866 | 38 |
- | RELEVANT | 29 | 83 |
+ | NOT_RELEVANT | 1868 | 89 |
+ | RELEVANT | 42 | 87 |
 
  ### embedding-lightgbm_sentence_embeddings at threshold 0.500
 
  | True / Predicted | NOT_RELEVANT | RELEVANT |
  | --- | ---: | ---: |
- | NOT_RELEVANT | 878 | 26 |
- | RELEVANT | 38 | 74 |
+ | NOT_RELEVANT | 1901 | 56 |
+ | RELEVANT | 52 | 77 |
 
- ### embedding-lightgbm_sentence_embeddings at threshold 0.042
+ ### embedding-lightgbm_sentence_embeddings at threshold 0.089
 
  | True / Predicted | NOT_RELEVANT | RELEVANT |
  | --- | ---: | ---: |
- | NOT_RELEVANT | 852 | 52 |
- | RELEVANT | 20 | 92 |
+ | NOT_RELEVANT | 1845 | 112 |
+ | RELEVANT | 29 | 100 |
 
  ### transformer at threshold 0.500
 
  | True / Predicted | NOT_RELEVANT | RELEVANT |
  | --- | ---: | ---: |
- | NOT_RELEVANT | 879 | 25 |
- | RELEVANT | 25 | 87 |
+ | NOT_RELEVANT | 1884 | 73 |
+ | RELEVANT | 46 | 83 |
 
- ### transformer at threshold 0.471
+ ### transformer at threshold 0.544
 
  | True / Predicted | NOT_RELEVANT | RELEVANT |
  | --- | ---: | ---: |
- | NOT_RELEVANT | 878 | 26 |
- | RELEVANT | 25 | 87 |
+ | NOT_RELEVANT | 1884 | 73 |
+ | RELEVANT | 47 | 82 |
 
 
  ## Validation-Tuned Thresholds
 
- - `logistic_tfidf`: threshold `0.608` (validation F1 `0.578`); test F1 change vs 0.5: `-0.077`.
- - `xgboost_tfidf`: threshold `0.177` (validation F1 `0.581`); test F1 change vs 0.5: `+0.111`.
- - `embedding-logistic_sentence_embeddings`: threshold `0.722` (validation F1 `0.753`); test F1 change vs 0.5: `+0.077`.
- - `embedding-svm_sentence_embeddings`: threshold `0.310` (validation F1 `0.747`); test F1 change vs 0.5: `+0.073`.
- - `embedding-lightgbm_sentence_embeddings`: threshold `0.042` (validation F1 `0.728`); test F1 change vs 0.5: `+0.021`.
- - `transformer`: threshold `0.471` (validation F1 `0.829`); test F1 change vs 0.5: `-0.003`.
+ - `logistic_tfidf`: threshold `0.360` (validation F1 `0.489`); test F1 change vs 0.5: `-0.008`.
+ - `xgboost_tfidf`: threshold `0.104` (validation F1 `0.535`); test F1 change vs 0.5: `+0.016`.
+ - `embedding-logistic_sentence_embeddings`: threshold `0.726` (validation F1 `0.688`); test F1 change vs 0.5: `+0.043`.
+ - `embedding-svm_sentence_embeddings`: threshold `0.245` (validation F1 `0.700`); test F1 change vs 0.5: `+0.044`.
+ - `embedding-lightgbm_sentence_embeddings`: threshold `0.089` (validation F1 `0.701`); test F1 change vs 0.5: `-0.001`.
+ - `transformer`: threshold `0.544` (validation F1 `0.698`); test F1 change vs 0.5: `-0.005`.
 
  ## Artifacts
 
baselines/embedding-lightgbm/embedding-lightgbm.joblib CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:02039c6ee8487042ae61343afc227ab7375bbfdb042e073232a995d2e4d57dd6
- size 1467646
+ oid sha256:ef46feb03d6d86629446a2b84e3976aad0bbf58a95a4a53a9fd370bd0fd97a5b
+ size 1454606
baselines/embedding-lightgbm/test_predictions.csv CHANGED
The diff for this file is too large to render. See raw diff
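
The prediction files are not rendered here, but the test-split rows in README.md can be cross-checked from the confusion matrices alone. A minimal sketch, assuming the standard definitions of accuracy, precision, recall and F1 for the positive (`RELEVANT`) class; the function name `metrics_from_confusion` is hypothetical and not part of this repository:

```python
def metrics_from_confusion(tn, fp, fn, tp):
    """Return accuracy, precision, recall and F1 for the positive class."""
    total = tn + fp + fn + tp
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1_denominator = 2 * tp + fp + fn
    f1 = 2 * tp / f1_denominator if f1_denominator else 0.0
    return {
        "accuracy": (tn + tp) / total,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Counts copied from the new README.md confusion matrix for
# embedding-lightgbm_sentence_embeddings at threshold 0.500 (test split).
# Rows are true labels, columns are predicted labels.
m = metrics_from_confusion(tn=1901, fp=56, fn=52, tp=77)
print({name: round(value, 3) for name, value in m.items()})
```

Rounded to three decimals, these values agree with the corresponding row of the test-split table (accuracy 0.948, precision 0.579, recall 0.597, F1 0.588).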
 
baselines/embedding-lightgbm/validation_predictions.csv CHANGED
The diff for this file is too large to render. See raw diff
 
baselines/embedding-logistic/embedding-logistic.joblib CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:433846875da231d3a97fc0f6bfa5adc3a1c4edb548d9655dc98a07523b436207
- size 4361
+ oid sha256:2368ac2c3c1cc353bb281993f87b11f6ea0b4a86abd8d935a9506f607933b1ae
+ size 2821
baselines/embedding-logistic/test_predictions.csv CHANGED
The diff for this file is too large to render. See raw diff
 
baselines/embedding-logistic/validation_predictions.csv CHANGED
The diff for this file is too large to render. See raw diff
 
baselines/embedding-svm/embedding-svm.joblib CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:df3e6eaec015a205089efe2457d89d2ecacdf1661b8607ad60905ef318adc5f4
+ oid sha256:6d0d609b2c746c6481cb61c52f997a53e8962c60aceda441970bbaeffd07223e
  size 11770
baselines/embedding-svm/test_predictions.csv CHANGED
The diff for this file is too large to render. See raw diff
 
baselines/embedding-svm/validation_predictions.csv CHANGED
The diff for this file is too large to render. See raw diff
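
The README reports thresholds tuned on the validation split, but this commit does not show how the pipeline selects them. One common approach is sketched below as an assumption, not as the repository's method: scan the observed scores and keep the threshold with the highest F1. The helper names and the toy data are invented for illustration, and the `score >= threshold` convention is also an assumption.

```python
def f1_at(labels, scores, threshold):
    """F1 of the positive class when score >= threshold is predicted positive."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


def tune_threshold(labels, scores):
    """Return (threshold, F1) maximizing F1 over the observed scores.

    On ties, max() keeps the first candidate, i.e. the lowest threshold.
    """
    candidates = sorted(set(scores))
    best = max(candidates, key=lambda t: f1_at(labels, scores, t))
    return best, f1_at(labels, scores, best)


# Toy validation data, invented for illustration only.
labels = [0, 0, 0, 0, 1, 0, 1, 1]
scores = [0.02, 0.05, 0.10, 0.20, 0.30, 0.40, 0.60, 0.90]

best_threshold, best_f1 = tune_threshold(labels, scores)
print(best_threshold, round(best_f1, 3), f1_at(labels, scores, 0.5))
```

A threshold chosen this way is fitted to the validation split, which is consistent with the small or negative test F1 changes listed under "Validation-Tuned Thresholds".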
 
baselines/logistic/logistic_tfidf.joblib CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:988b232ccc0c55fa1116c0885058e6200246e9dbe050debf6f5edfa81e0438e7
- size 2452308
+ oid sha256:0ff51ca92cbba2cd25b3bd551d90fa24bbe8217cf6d701c493727b09df330af7
+ size 2430788
baselines/logistic/test_predictions.csv CHANGED
The diff for this file is too large to render.
 
baselines/logistic/validation_predictions.csv CHANGED
The diff for this file is too large to render.
 
baselines/xgboost/test_predictions.csv CHANGED
The diff for this file is too large to render.
 
baselines/xgboost/validation_predictions.csv CHANGED
The diff for this file is too large to render.
 
baselines/xgboost/xgboost_tfidf.joblib CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:75dae90ae561b6e87b2fd736393208127db3493eb3df7a2232490a3a60238d1b
- size 2494551
+ oid sha256:a4b2edd904d295c0f6e835bcff08adff1506e9ce985ac708d392ce1c8dd97b2a
+ size 2483313
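
Review note for the report.json changes that follow: the scalar metrics in each block appear to be derivable from that block's `confusion_matrix`. The sketch below is a minimal check, not the project's own evaluation code. It assumes the matrix layout `[[TN, FP], [FN, TP]]` with `RELEVANT` as the positive class, which is inferred from the reported values rather than stated in the report. The helper name `metrics_from_confusion` and the `0.0` fallback for empty denominators are hypothetical choices made for illustration.

```python
def metrics_from_confusion(cm):
    """Derive binary metrics from a 2x2 matrix assumed to be [[TN, FP], [FN, TP]]."""
    (tn, fp), (fn, tp) = cm
    total = tn + fp + fn + tp
    # Fall back to 0.0 when a denominator is empty (an assumption, not taken from the report).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 0.0
    return {
        "accuracy": (tn + tp) / total,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Removed-side validation_default_0_5 matrix of the TF-IDF logistic baseline.
m = metrics_from_confusion([[846, 53], [33, 46]])
for name, value in m.items():
    print(f"{name}: {value:.6f}")
```

Running the same helper on any other `confusion_matrix` in the report is a quick way to confirm that a block's `accuracy`, `precision`, `recall`, and `f1` are mutually consistent after a re-run.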
report.json CHANGED
@@ -1,9 +1,9 @@
1
  {
2
- "created_at": "2026-05-27T10:50:45.867038+00:00",
3
  "config": {
4
  "hf_dataset": "faodl/amis-agri-utilization",
5
  "hf_subset": null,
6
- "hf_revision": "ada4a04088a98f8f64bc7485c57d4c7f422c2151",
7
  "train_split": "train",
8
  "validation_split": "validation",
9
  "test_split": "test",
@@ -44,33 +44,33 @@
44
  },
45
  "dataset_summary": {
46
  "train": {
47
- "rows": 4877,
48
  "labels": {
49
- "0": 4347,
50
- "1": 530
51
  },
52
- "unique_groups": 2513,
53
- "text_length_mean": 696.6221037523068,
54
  "text_length_median": 794.0
55
  },
56
  "validation": {
57
- "rows": 978,
58
  "labels": {
59
- "0": 899,
60
- "1": 79
61
  },
62
- "unique_groups": 538,
63
- "text_length_mean": 690.6196319018405,
64
  "text_length_median": 794.0
65
  },
66
  "test": {
67
- "rows": 1016,
68
  "labels": {
69
- "0": 904,
70
- "1": 112
71
  },
72
- "unique_groups": 539,
73
- "text_length_mean": 690.6929133858267,
74
  "text_length_median": 794.0
75
  }
76
  },
@@ -81,194 +81,194 @@
81
  "artifact_dir": "/content/agri-utilization-classifier/baselines/logistic",
82
  "artifact_file": "/content/agri-utilization-classifier/baselines/logistic/logistic_tfidf.joblib",
83
  "validation_best_threshold": {
84
- "threshold": 0.6076606929552563,
85
- "f1": 0.5777777777777778,
86
- "precision": 0.6964285714285714,
87
- "recall": 0.4936708860759494
88
  },
89
  "validation_default_0_5": {
90
  "threshold": 0.5,
91
- "accuracy": 0.9120654396728016,
92
- "precision": 0.46464646464646464,
93
- "recall": 0.5822784810126582,
94
- "f1": 0.5168539325842697,
95
  "confusion_matrix": [
96
  [
97
- 846,
98
- 53
99
  ],
100
  [
101
- 33,
102
- 46
103
  ]
104
  ],
105
  "classification_report": {
106
  "NOT_RELEVANT": {
107
- "precision": 0.962457337883959,
108
- "recall": 0.9410456062291435,
109
- "f1-score": 0.9516310461192351,
110
- "support": 899.0
111
  },
112
  "RELEVANT": {
113
- "precision": 0.46464646464646464,
114
- "recall": 0.5822784810126582,
115
- "f1-score": 0.5168539325842697,
116
- "support": 79.0
117
  },
118
- "accuracy": 0.9120654396728016,
119
  "macro avg": {
120
- "precision": 0.7135519012652118,
121
- "recall": 0.7616620436209008,
122
- "f1-score": 0.7342424893517524,
123
- "support": 978.0
124
  },
125
  "weighted avg": {
126
- "precision": 0.9222456211296011,
127
- "recall": 0.9120654396728016,
128
- "f1-score": 0.9165110134308279,
129
- "support": 978.0
130
  }
131
  },
132
- "roc_auc": 0.871530955632841,
133
- "average_precision": 0.5935308473185881
134
  },
135
  "validation_optimal_threshold": {
136
- "threshold": 0.6076606929552563,
137
- "accuracy": 0.941717791411043,
138
- "precision": 0.6964285714285714,
139
- "recall": 0.4936708860759494,
140
- "f1": 0.5777777777777777,
141
  "confusion_matrix": [
142
  [
143
- 882,
144
- 17
145
  ],
146
  [
147
- 40,
148
- 39
149
  ]
150
  ],
151
  "classification_report": {
152
  "NOT_RELEVANT": {
153
- "precision": 0.9566160520607375,
154
- "recall": 0.9810901001112347,
155
- "f1-score": 0.9686985172981878,
156
- "support": 899.0
157
  },
158
  "RELEVANT": {
159
- "precision": 0.6964285714285714,
160
- "recall": 0.4936708860759494,
161
- "f1-score": 0.5777777777777777,
162
- "support": 79.0
163
  },
164
- "accuracy": 0.941717791411043,
165
  "macro avg": {
166
- "precision": 0.8265223117446545,
167
- "recall": 0.737380493093592,
168
- "f1-score": 0.7732381475379828,
169
- "support": 978.0
170
  },
171
  "weighted avg": {
172
- "precision": 0.9355988629299183,
173
- "recall": 0.941717791411043,
174
- "f1-score": 0.9371210751487886,
175
- "support": 978.0
176
  }
177
  },
178
- "roc_auc": 0.871530955632841,
179
- "average_precision": 0.5935308473185881
180
  },
181
  "test_default_0_5": {
182
  "threshold": 0.5,
183
- "accuracy": 0.9261811023622047,
184
- "precision": 0.6907216494845361,
185
- "recall": 0.5982142857142857,
186
- "f1": 0.6411483253588517,
187
  "confusion_matrix": [
188
  [
189
- 874,
190
- 30
191
  ],
192
  [
193
- 45,
194
- 67
195
  ]
196
  ],
197
  "classification_report": {
198
  "NOT_RELEVANT": {
199
- "precision": 0.9510337323177367,
200
- "recall": 0.9668141592920354,
201
- "f1-score": 0.9588590235874932,
202
- "support": 904.0
203
  },
204
  "RELEVANT": {
205
- "precision": 0.6907216494845361,
206
- "recall": 0.5982142857142857,
207
- "f1-score": 0.6411483253588517,
208
- "support": 112.0
209
  },
210
- "accuracy": 0.9261811023622047,
211
  "macro avg": {
212
- "precision": 0.8208776909011364,
213
- "recall": 0.7825142225031605,
214
- "f1-score": 0.8000036744731724,
215
- "support": 1016.0
216
  },
217
  "weighted avg": {
218
- "precision": 0.9223379121628957,
219
- "recall": 0.9261811023622047,
220
- "f1-score": 0.9238357970111075,
221
- "support": 1016.0
222
  }
223
  },
224
- "roc_auc": 0.8990004740834386,
225
- "average_precision": 0.7262348311700503
226
  },
227
  "test_optimal_threshold": {
228
- "threshold": 0.6076606929552563,
229
- "accuracy": 0.9301181102362205,
230
- "precision": 0.9019607843137255,
231
- "recall": 0.4107142857142857,
232
- "f1": 0.5644171779141104,
233
  "confusion_matrix": [
234
  [
235
- 899,
236
- 5
237
  ],
238
  [
239
- 66,
240
- 46
241
  ]
242
  ],
243
  "classification_report": {
244
  "NOT_RELEVANT": {
245
- "precision": 0.9316062176165804,
246
- "recall": 0.9944690265486725,
247
- "f1-score": 0.962011771000535,
248
- "support": 904.0
249
  },
250
  "RELEVANT": {
251
- "precision": 0.9019607843137255,
252
- "recall": 0.4107142857142857,
253
- "f1-score": 0.5644171779141104,
254
- "support": 112.0
255
  },
256
- "accuracy": 0.9301181102362205,
257
  "macro avg": {
258
- "precision": 0.9167835009651529,
259
- "recall": 0.7025916561314791,
260
- "f1-score": 0.7632144744573227,
261
- "support": 1016.0
262
  },
263
  "weighted avg": {
264
- "precision": 0.9283382170950057,
265
- "recall": 0.9301181102362205,
266
- "f1-score": 0.9181824457784095,
267
- "support": 1016.0
268
  }
269
  },
270
- "roc_auc": 0.8990004740834386,
271
- "average_precision": 0.7262348311700503
272
  }
273
  },
274
  {
@@ -277,194 +277,194 @@
277
  "artifact_dir": "/content/agri-utilization-classifier/baselines/xgboost",
278
  "artifact_file": "/content/agri-utilization-classifier/baselines/xgboost/xgboost_tfidf.joblib",
279
  "validation_best_threshold": {
280
- "threshold": 0.17728303372859955,
281
- "f1": 0.5806451612903226,
282
- "precision": 0.5921052631578947,
283
- "recall": 0.569620253164557
284
  },
285
  "validation_default_0_5": {
286
  "threshold": 0.5,
287
- "accuracy": 0.9447852760736196,
288
- "precision": 0.9310344827586207,
289
- "recall": 0.34177215189873417,
290
- "f1": 0.5,
291
  "confusion_matrix": [
292
  [
293
- 897,
294
- 2
295
  ],
296
  [
297
- 52,
298
- 27
299
  ]
300
  ],
301
  "classification_report": {
302
  "NOT_RELEVANT": {
303
- "precision": 0.9452054794520548,
304
- "recall": 0.9977753058954394,
305
- "f1-score": 0.9707792207792207,
306
- "support": 899.0
307
  },
308
  "RELEVANT": {
309
- "precision": 0.9310344827586207,
310
- "recall": 0.34177215189873417,
311
- "f1-score": 0.5,
312
- "support": 79.0
313
  },
314
- "accuracy": 0.9447852760736196,
315
  "macro avg": {
316
- "precision": 0.9381199811053378,
317
- "recall": 0.6697737288970868,
318
- "f1-score": 0.7353896103896104,
319
- "support": 978.0
320
  },
321
  "weighted avg": {
322
- "precision": 0.9440607874901108,
323
- "recall": 0.9447852760736196,
324
- "f1-score": 0.9327510424136191,
325
- "support": 978.0
326
  }
327
  },
328
- "roc_auc": 0.822629926359809,
329
- "average_precision": 0.5882293042162409
330
  },
331
  "validation_optimal_threshold": {
332
- "threshold": 0.17728303372859955,
333
- "accuracy": 0.9335378323108384,
334
- "precision": 0.5921052631578947,
335
- "recall": 0.569620253164557,
336
- "f1": 0.5806451612903226,
337
  "confusion_matrix": [
338
  [
339
- 868,
340
- 31
341
  ],
342
  [
343
- 34,
344
- 45
345
  ]
346
  ],
347
  "classification_report": {
348
  "NOT_RELEVANT": {
349
- "precision": 0.9623059866962306,
350
- "recall": 0.9655172413793104,
351
- "f1-score": 0.9639089394780678,
352
- "support": 899.0
353
  },
354
  "RELEVANT": {
355
- "precision": 0.5921052631578947,
356
- "recall": 0.569620253164557,
357
- "f1-score": 0.5806451612903226,
358
- "support": 79.0
359
  },
360
- "accuracy": 0.9335378323108384,
361
  "macro avg": {
362
- "precision": 0.7772056249270627,
363
- "recall": 0.7675687472719337,
364
- "f1-score": 0.7722770503841951,
365
- "support": 978.0
366
  },
367
  "weighted avg": {
368
- "precision": 0.9324022472693098,
369
- "recall": 0.9335378323108384,
370
- "f1-score": 0.9329500044301825,
371
- "support": 978.0
372
  }
373
  },
374
- "roc_auc": 0.822629926359809,
375
- "average_precision": 0.5882293042162409
376
  },
377
  "test_default_0_5": {
378
  "threshold": 0.5,
379
- "accuracy": 0.9242125984251969,
380
- "precision": 1.0,
381
- "recall": 0.3125,
382
- "f1": 0.47619047619047616,
383
  "confusion_matrix": [
384
  [
385
- 904,
386
- 0
387
  ],
388
  [
389
- 77,
390
- 35
391
  ]
392
  ],
393
  "classification_report": {
394
  "NOT_RELEVANT": {
395
- "precision": 0.9215086646279307,
396
- "recall": 1.0,
397
- "f1-score": 0.9591511936339523,
398
- "support": 904.0
399
  },
400
  "RELEVANT": {
401
- "precision": 1.0,
402
- "recall": 0.3125,
403
- "f1-score": 0.47619047619047616,
404
- "support": 112.0
405
  },
406
- "accuracy": 0.9242125984251969,
407
  "macro avg": {
408
- "precision": 0.9607543323139653,
409
- "recall": 0.65625,
410
- "f1-score": 0.7176708349122143,
411
- "support": 1016.0
412
  },
413
  "weighted avg": {
414
- "precision": 0.9301612527791825,
415
- "recall": 0.9242125984251969,
416
- "f1-score": 0.905911429506325,
417
- "support": 1016.0
418
  }
419
  },
420
- "roc_auc": 0.8921114491150443,
421
- "average_precision": 0.6916666494483661
422
  },
423
  "test_optimal_threshold": {
424
- "threshold": 0.17728303372859955,
425
- "accuracy": 0.9183070866141733,
426
- "precision": 0.6629213483146067,
427
- "recall": 0.5267857142857143,
428
- "f1": 0.5870646766169154,
429
  "confusion_matrix": [
430
  [
431
- 874,
432
- 30
433
  ],
434
  [
435
- 53,
436
- 59
437
  ]
438
  ],
439
  "classification_report": {
440
  "NOT_RELEVANT": {
441
- "precision": 0.9428263214670982,
442
- "recall": 0.9668141592920354,
443
- "f1-score": 0.9546695794647734,
444
- "support": 904.0
445
  },
446
  "RELEVANT": {
447
- "precision": 0.6629213483146067,
448
- "recall": 0.5267857142857143,
449
- "f1-score": 0.5870646766169154,
450
- "support": 112.0
451
  },
452
- "accuracy": 0.9183070866141733,
453
  "macro avg": {
454
- "precision": 0.8028738348908524,
455
- "recall": 0.7467999367888749,
456
- "f1-score": 0.7708671280408443,
457
- "support": 1016.0
458
  },
459
  "weighted avg": {
460
- "precision": 0.9119706551353274,
461
- "recall": 0.9183070866141733,
462
- "f1-score": 0.9141462043476867,
463
- "support": 1016.0
464
  }
465
  },
466
- "roc_auc": 0.8921114491150443,
467
- "average_precision": 0.6916666494483661
468
  }
469
  },
470
  {
@@ -474,194 +474,194 @@
474
  "artifact_dir": "/content/agri-utilization-classifier/baselines/embedding-logistic",
475
  "artifact_file": "/content/agri-utilization-classifier/baselines/embedding-logistic/embedding-logistic.joblib",
476
  "validation_best_threshold": {
477
- "threshold": 0.7220406191151401,
478
- "f1": 0.7529411764705883,
479
- "precision": 0.7032967032967034,
480
- "recall": 0.810126582278481
481
  },
482
  "validation_default_0_5": {
483
  "threshold": 0.5,
484
- "accuracy": 0.9120654396728016,
485
- "precision": 0.4755244755244755,
486
- "recall": 0.8607594936708861,
487
- "f1": 0.6126126126126126,
488
  "confusion_matrix": [
489
  [
490
- 824,
491
- 75
492
  ],
493
  [
494
- 11,
495
- 68
496
  ]
497
  ],
498
  "classification_report": {
499
  "NOT_RELEVANT": {
500
- "precision": 0.9868263473053892,
501
- "recall": 0.9165739710789766,
502
- "f1-score": 0.9504036908881199,
503
- "support": 899.0
504
  },
505
  "RELEVANT": {
506
- "precision": 0.4755244755244755,
507
- "recall": 0.8607594936708861,
508
- "f1-score": 0.6126126126126126,
509
- "support": 79.0
510
  },
511
- "accuracy": 0.9120654396728016,
512
  "macro avg": {
513
- "precision": 0.7311754114149324,
514
- "recall": 0.8886667323749313,
515
- "f1-score": 0.7815081517503663,
516
- "support": 978.0
517
  },
518
  "weighted avg": {
519
- "precision": 0.9455248668650087,
520
- "recall": 0.9120654396728016,
521
- "f1-score": 0.9231179084916321,
522
- "support": 978.0
523
  }
524
  },
525
- "roc_auc": 0.9525633263400967,
526
- "average_precision": 0.7622834015915168
527
  },
528
  "validation_optimal_threshold": {
529
- "threshold": 0.7220406191151401,
530
- "accuracy": 0.9570552147239264,
531
- "precision": 0.7032967032967034,
532
- "recall": 0.810126582278481,
533
- "f1": 0.7529411764705882,
534
  "confusion_matrix": [
535
  [
536
- 872,
537
- 27
538
  ],
539
  [
540
- 15,
541
- 64
542
  ]
543
  ],
544
  "classification_report": {
545
  "NOT_RELEVANT": {
546
- "precision": 0.9830890642615558,
547
- "recall": 0.9699666295884316,
548
- "f1-score": 0.9764837625979843,
549
- "support": 899.0
550
  },
551
  "RELEVANT": {
552
- "precision": 0.7032967032967034,
553
- "recall": 0.810126582278481,
554
- "f1-score": 0.7529411764705882,
555
- "support": 79.0
556
  },
557
- "accuracy": 0.9570552147239264,
558
  "macro avg": {
559
- "precision": 0.8431928837791296,
560
- "recall": 0.8900466059334563,
561
- "f1-score": 0.8647124695342863,
562
- "support": 978.0
563
  },
564
  "weighted avg": {
565
- "precision": 0.9604882498277897,
566
- "recall": 0.9570552147239264,
567
- "f1-score": 0.9584266416326834,
568
- "support": 978.0
569
  }
570
  },
571
- "roc_auc": 0.9525633263400967,
572
- "average_precision": 0.7622834015915168
573
  },
574
  "test_default_0_5": {
575
  "threshold": 0.5,
576
- "accuracy": 0.890748031496063,
577
- "precision": 0.5025380710659898,
578
- "recall": 0.8839285714285714,
579
- "f1": 0.6407766990291263,
580
  "confusion_matrix": [
581
  [
582
- 806,
583
- 98
584
  ],
585
  [
586
- 13,
587
- 99
588
  ]
589
  ],
590
  "classification_report": {
591
  "NOT_RELEVANT": {
592
- "precision": 0.9841269841269841,
593
- "recall": 0.8915929203539823,
594
- "f1-score": 0.9355774811375508,
595
- "support": 904.0
596
  },
597
  "RELEVANT": {
598
- "precision": 0.5025380710659898,
599
- "recall": 0.8839285714285714,
600
- "f1-score": 0.6407766990291263,
601
- "support": 112.0
602
  },
603
- "accuracy": 0.890748031496063,
604
  "macro avg": {
605
- "precision": 0.7433325275964869,
606
- "recall": 0.8877607458912768,
607
- "f1-score": 0.7881770900833385,
608
- "support": 1016.0
609
  },
610
  "weighted avg": {
611
- "precision": 0.9310384425297091,
612
- "recall": 0.890748031496063,
613
- "f1-score": 0.9030797571255984,
614
- "support": 1016.0
615
  }
616
  },
617
- "roc_auc": 0.955317635903919,
618
- "average_precision": 0.7096184898069098
619
  },
620
  "test_optimal_threshold": {
621
- "threshold": 0.7220406191151401,
622
- "accuracy": 0.9350393700787402,
623
- "precision": 0.6885245901639344,
624
- "recall": 0.75,
625
- "f1": 0.717948717948718,
626
  "confusion_matrix": [
627
  [
628
- 866,
629
- 38
630
  ],
631
  [
632
- 28,
633
- 84
634
  ]
635
  ],
636
  "classification_report": {
637
  "NOT_RELEVANT": {
638
- "precision": 0.9686800894854586,
639
- "recall": 0.9579646017699115,
640
- "f1-score": 0.9632925472747497,
641
- "support": 904.0
642
  },
643
  "RELEVANT": {
644
- "precision": 0.6885245901639344,
645
- "recall": 0.75,
646
- "f1-score": 0.717948717948718,
647
- "support": 112.0
648
  },
649
- "accuracy": 0.9350393700787402,
650
  "macro avg": {
651
- "precision": 0.8286023398246964,
652
- "recall": 0.8539823008849557,
653
- "f1-score": 0.8406206326117338,
654
- "support": 1016.0
655
  },
656
  "weighted avg": {
657
- "precision": 0.9377968060956843,
658
- "recall": 0.9350393700787402,
659
- "f1-score": 0.9362467708136123,
660
- "support": 1016.0
661
  }
662
  },
663
- "roc_auc": 0.955317635903919,
664
- "average_precision": 0.7096184898069098
665
  }
666
  },
667
  {
@@ -671,194 +671,194 @@
671
  "artifact_dir": "/content/agri-utilization-classifier/baselines/embedding-svm",
672
  "artifact_file": "/content/agri-utilization-classifier/baselines/embedding-svm/embedding-svm.joblib",
673
  "validation_best_threshold": {
674
- "threshold": 0.30975184413575924,
675
- "f1": 0.746987951807229,
676
- "precision": 0.7126436781609196,
677
- "recall": 0.7848101265822784
678
  },
679
  "validation_default_0_5": {
680
  "threshold": 0.5,
681
- "accuracy": 0.9550102249488752,
682
- "precision": 0.8070175438596491,
683
- "recall": 0.5822784810126582,
684
- "f1": 0.6764705882352942,
685
  "confusion_matrix": [
686
  [
687
- 888,
688
- 11
689
  ],
690
  [
691
- 33,
692
- 46
693
  ]
694
  ],
695
  "classification_report": {
696
  "NOT_RELEVANT": {
697
- "precision": 0.9641693811074918,
698
- "recall": 0.9877641824249166,
699
- "f1-score": 0.9758241758241758,
700
- "support": 899.0
701
  },
702
  "RELEVANT": {
703
- "precision": 0.8070175438596491,
704
- "recall": 0.5822784810126582,
705
- "f1-score": 0.6764705882352942,
706
- "support": 79.0
707
  },
708
- "accuracy": 0.9550102249488752,
709
  "macro avg": {
710
- "precision": 0.8855934624835704,
711
- "recall": 0.7850213317187874,
712
- "f1-score": 0.8261473820297349,
713
- "support": 978.0
714
  },
715
  "weighted avg": {
716
- "precision": 0.9514751120455496,
717
- "recall": 0.9550102249488752,
718
- "f1-score": 0.9516432623072826,
719
- "support": 978.0
720
  }
721
  },
722
- "roc_auc": 0.9524506836006251,
723
- "average_precision": 0.7542419360138435
724
  },
725
  "validation_optimal_threshold": {
726
- "threshold": 0.30975184413575924,
727
- "accuracy": 0.9570552147239264,
728
- "precision": 0.7126436781609196,
729
- "recall": 0.7848101265822784,
730
- "f1": 0.7469879518072289,
731
  "confusion_matrix": [
732
  [
733
- 874,
734
- 25
735
  ],
736
  [
737
- 17,
738
- 62
739
  ]
740
  ],
741
  "classification_report": {
742
  "NOT_RELEVANT": {
743
- "precision": 0.9809203142536476,
744
- "recall": 0.9721913236929922,
745
- "f1-score": 0.976536312849162,
746
- "support": 899.0
747
  },
748
  "RELEVANT": {
749
- "precision": 0.7126436781609196,
750
- "recall": 0.7848101265822784,
751
- "f1-score": 0.7469879518072289,
752
- "support": 79.0
753
  },
754
- "accuracy": 0.9570552147239264,
755
  "macro avg": {
756
- "precision": 0.8467819962072836,
757
- "recall": 0.8785007251376353,
758
- "f1-score": 0.8617621323281954,
759
- "support": 978.0
760
  },
761
  "weighted avg": {
762
- "precision": 0.9592497066347054,
763
- "recall": 0.9570552147239264,
764
- "f1-score": 0.9579940628263474,
765
- "support": 978.0
766
  }
767
  },
768
- "roc_auc": 0.9524506836006251,
769
- "average_precision": 0.7542419360138435
770
  },
771
  "test_default_0_5": {
772
  "threshold": 0.5,
773
- "accuracy": 0.9301181102362205,
774
- "precision": 0.7411764705882353,
775
- "recall": 0.5625,
776
- "f1": 0.6395939086294417,
777
  "confusion_matrix": [
778
  [
779
- 882,
780
- 22
781
  ],
782
  [
783
- 49,
784
- 63
785
  ]
786
  ],
787
  "classification_report": {
788
  "NOT_RELEVANT": {
789
- "precision": 0.9473684210526315,
790
- "recall": 0.9756637168141593,
791
- "f1-score": 0.9613079019073569,
792
- "support": 904.0
793
  },
794
  "RELEVANT": {
795
- "precision": 0.7411764705882353,
796
- "recall": 0.5625,
797
- "f1-score": 0.6395939086294417,
798
- "support": 112.0
799
  },
800
- "accuracy": 0.9301181102362205,
801
  "macro avg": {
802
- "precision": 0.8442724458204334,
803
- "recall": 0.7690818584070797,
804
- "f1-score": 0.8004509052683992,
805
- "support": 1016.0
806
  },
807
  "weighted avg": {
808
- "precision": 0.9246385997415957,
809
- "recall": 0.9301181102362205,
810
- "f1-score": 0.9258433672153032,
811
- "support": 1016.0
812
  }
813
  },
814
- "roc_auc": 0.9563744469026548,
815
- "average_precision": 0.7035914186137721
816
  },
817
  "test_optimal_threshold": {
818
- "threshold": 0.30975184413575924,
819
- "accuracy": 0.9340551181102362,
820
- "precision": 0.6859504132231405,
821
- "recall": 0.7410714285714286,
822
- "f1": 0.7124463519313304,
823
  "confusion_matrix": [
824
  [
825
- 866,
826
- 38
827
  ],
828
  [
829
- 29,
830
- 83
831
  ]
832
  ],
833
  "classification_report": {
834
  "NOT_RELEVANT": {
835
- "precision": 0.9675977653631285,
836
- "recall": 0.9579646017699115,
837
- "f1-score": 0.962757087270706,
838
- "support": 904.0
839
  },
840
  "RELEVANT": {
841
- "precision": 0.6859504132231405,
842
- "recall": 0.7410714285714286,
843
- "f1-score": 0.7124463519313304,
844
- "support": 112.0
845
  },
846
- "accuracy": 0.9340551181102362,
847
  "macro avg": {
848
- "precision": 0.8267740892931346,
849
- "recall": 0.84951801517067,
850
- "f1-score": 0.8376017196010181,
851
- "support": 1016.0
852
  },
853
  "weighted avg": {
854
- "precision": 0.9365500257571455,
855
- "recall": 0.9340551181102362,
856
- "f1-score": 0.9351637778632157,
857
- "support": 1016.0
858
  }
859
  },
860
- "roc_auc": 0.9563744469026548,
861
- "average_precision": 0.7035914186137721
862
  }
863
  },
864
  {
@@ -868,194 +868,194 @@
868
  "artifact_dir": "/content/agri-utilization-classifier/baselines/embedding-lightgbm",
869
  "artifact_file": "/content/agri-utilization-classifier/baselines/embedding-lightgbm/embedding-lightgbm.joblib",
870
  "validation_best_threshold": {
871
- "threshold": 0.042041465431985434,
872
- "f1": 0.7283236994219654,
873
- "precision": 0.6702127659574468,
874
- "recall": 0.7974683544303798
875
  },
876
  "validation_default_0_5": {
877
  "threshold": 0.5,
878
- "accuracy": 0.9539877300613497,
879
- "precision": 0.75,
880
- "recall": 0.6455696202531646,
881
- "f1": 0.6938775510204082,
882
  "confusion_matrix": [
883
  [
884
- 882,
885
- 17
886
  ],
887
  [
888
- 28,
889
- 51
890
  ]
891
  ],
892
  "classification_report": {
893
  "NOT_RELEVANT": {
894
- "precision": 0.9692307692307692,
895
- "recall": 0.9810901001112347,
896
- "f1-score": 0.9751243781094527,
897
- "support": 899.0
898
  },
899
  "RELEVANT": {
900
- "precision": 0.75,
901
- "recall": 0.6455696202531646,
902
- "f1-score": 0.6938775510204082,
903
- "support": 79.0
904
  },
905
- "accuracy": 0.9539877300613497,
906
  "macro avg": {
907
- "precision": 0.8596153846153847,
908
- "recall": 0.8133298601821997,
909
- "f1-score": 0.8345009645649304,
910
- "support": 978.0
911
  },
912
  "weighted avg": {
913
- "precision": 0.9515219443133554,
914
- "recall": 0.9539877300613497,
915
- "f1-score": 0.9524060761257774,
916
- "support": 978.0
917
  }
918
  },
919
- "roc_auc": 0.9480716971036736,
920
- "average_precision": 0.7818499996214695
921
  },
922
  "validation_optimal_threshold": {
923
- "threshold": 0.042041465431985434,
924
- "accuracy": 0.9519427402862985,
925
- "precision": 0.6702127659574468,
926
- "recall": 0.7974683544303798,
927
- "f1": 0.7283236994219653,
928
  "confusion_matrix": [
929
  [
930
- 868,
931
- 31
932
  ],
933
  [
934
- 16,
935
- 63
936
  ]
937
  ],
938
  "classification_report": {
939
  "NOT_RELEVANT": {
940
- "precision": 0.9819004524886877,
941
- "recall": 0.9655172413793104,
942
- "f1-score": 0.9736399326977005,
943
- "support": 899.0
944
  },
945
  "RELEVANT": {
946
- "precision": 0.6702127659574468,
947
- "recall": 0.7974683544303798,
948
- "f1-score": 0.7283236994219653,
949
- "support": 79.0
950
  },
951
- "accuracy": 0.9519427402862985,
952
  "macro avg": {
953
- "precision": 0.8260566092230672,
954
- "recall": 0.881492797904845,
955
- "f1-score": 0.8509818160598329,
956
- "support": 978.0
957
  },
958
  "weighted avg": {
959
- "precision": 0.9567232262760416,
960
- "recall": 0.9519427402862985,
961
- "f1-score": 0.9538239997439346,
962
- "support": 978.0
963
  }
964
  },
965
- "roc_auc": 0.9480716971036736,
966
- "average_precision": 0.7818499996214695
967
  },
968
  "test_default_0_5": {
969
  "threshold": 0.5,
970
- "accuracy": 0.937007874015748,
971
- "precision": 0.74,
972
- "recall": 0.6607142857142857,
973
- "f1": 0.6981132075471698,
974
  "confusion_matrix": [
975
  [
976
- 878,
977
- 26
978
  ],
979
  [
980
- 38,
981
- 74
982
  ]
983
  ],
984
  "classification_report": {
985
  "NOT_RELEVANT": {
986
- "precision": 0.9585152838427947,
987
- "recall": 0.9712389380530974,
988
- "f1-score": 0.9648351648351648,
989
- "support": 904.0
990
  },
991
  "RELEVANT": {
992
- "precision": 0.74,
993
- "recall": 0.6607142857142857,
994
- "f1-score": 0.6981132075471698,
995
- "support": 112.0
996
  },
997
- "accuracy": 0.937007874015748,
998
  "macro avg": {
999
- "precision": 0.8492576419213973,
1000
- "recall": 0.8159766118836915,
1001
- "f1-score": 0.8314741861911673,
1002
- "support": 1016.0
1003
  },
1004
  "weighted avg": {
1005
- "precision": 0.9344269848365024,
1006
- "recall": 0.937007874015748,
1007
- "f1-score": 0.9354327443467244,
1008
- "support": 1016.0
1009
  }
1010
  },
1011
- "roc_auc": 0.9597819216182049,
1012
- "average_precision": 0.7911233572387708
1013
  },
1014
  "test_optimal_threshold": {
1015
- "threshold": 0.042041465431985434,
1016
- "accuracy": 0.9291338582677166,
1017
- "precision": 0.6388888888888888,
1018
- "recall": 0.8214285714285714,
1019
- "f1": 0.71875,
1020
  "confusion_matrix": [
1021
  [
1022
- 852,
1023
- 52
1024
  ],
1025
  [
1026
- 20,
1027
- 92
1028
  ]
1029
  ],
1030
  "classification_report": {
1031
  "NOT_RELEVANT": {
1032
- "precision": 0.9770642201834863,
1033
- "recall": 0.9424778761061947,
1034
- "f1-score": 0.9594594594594594,
1035
- "support": 904.0
1036
  },
1037
  "RELEVANT": {
1038
- "precision": 0.6388888888888888,
1039
- "recall": 0.8214285714285714,
1040
- "f1-score": 0.71875,
1041
- "support": 112.0
1042
  },
1043
- "accuracy": 0.9291338582677166,
1044
  "macro avg": {
1045
- "precision": 0.8079765545361876,
1046
- "recall": 0.881953223767383,
1047
- "f1-score": 0.8391047297297297,
1048
- "support": 1016.0
1049
  },
1050
  "weighted avg": {
1051
- "precision": 0.9397850498045542,
1052
- "recall": 0.9291338582677166,
1053
- "f1-score": 0.9329245584166844,
1054
- "support": 1016.0
1055
  }
1056
  },
1057
- "roc_auc": 0.9597819216182049,
1058
- "average_precision": 0.7911233572387708
1059
  }
1060
  },
1061
  {
@@ -1063,194 +1063,194 @@
1063
  "model_name": "FacebookAI/xlm-roberta-base",
1064
  "artifact_dir": "/content/agri-utilization-classifier/transformer",
1065
  "validation_best_threshold": {
1066
- "threshold": 0.4710787534713745,
1067
- "f1": 0.829268292682927,
1068
- "precision": 0.8,
1069
- "recall": 0.8607594936708861
1070
  },
1071
  "validation_default_0_5": {
1072
  "threshold": 0.5,
1073
- "accuracy": 0.9703476482617587,
1074
- "precision": 0.7976190476190477,
1075
- "recall": 0.8481012658227848,
1076
- "f1": 0.8220858895705522,
1077
  "confusion_matrix": [
1078
  [
1079
- 882,
1080
- 17
1081
  ],
1082
  [
1083
- 12,
1084
- 67
1085
  ]
1086
  ],
1087
  "classification_report": {
1088
  "NOT_RELEVANT": {
1089
- "precision": 0.9865771812080537,
1090
- "recall": 0.9810901001112347,
1091
- "f1-score": 0.9838259899609593,
1092
- "support": 899.0
1093
  },
1094
  "RELEVANT": {
1095
- "precision": 0.7976190476190477,
1096
- "recall": 0.8481012658227848,
1097
- "f1-score": 0.8220858895705522,
1098
- "support": 79.0
1099
  },
1100
- "accuracy": 0.9703476482617587,
1101
  "macro avg": {
1102
- "precision": 0.8920981144135507,
1103
- "recall": 0.9145956829670097,
1104
- "f1-score": 0.9029559397657557,
1105
- "support": 978.0
1106
  },
1107
  "weighted avg": {
1108
- "precision": 0.9713136918895144,
1109
- "recall": 0.9703476482617587,
1110
- "f1-score": 0.9707610943261513,
1111
- "support": 978.0
1112
  }
1113
  },
1114
- "roc_auc": 0.9661086157615353,
1115
- "average_precision": 0.8539255147550682
1116
  },
1117
  "validation_optimal_threshold": {
1118
- "threshold": 0.4710787534713745,
1119
- "accuracy": 0.9713701431492843,
1120
- "precision": 0.8,
1121
- "recall": 0.8607594936708861,
1122
- "f1": 0.8292682926829268,
1123
  "confusion_matrix": [
1124
  [
1125
- 882,
1126
- 17
1127
  ],
1128
  [
1129
- 11,
1130
- 68
1131
  ]
1132
  ],
1133
  "classification_report": {
1134
  "NOT_RELEVANT": {
1135
- "precision": 0.9876819708846585,
1136
- "recall": 0.9810901001112347,
1137
- "f1-score": 0.984375,
1138
- "support": 899.0
1139
  },
1140
  "RELEVANT": {
1141
- "precision": 0.8,
1142
- "recall": 0.8607594936708861,
1143
- "f1-score": 0.8292682926829268,
1144
- "support": 79.0
1145
  },
1146
- "accuracy": 0.9713701431492843,
1147
  "macro avg": {
1148
- "precision": 0.8938409854423293,
1149
- "recall": 0.9209247968910603,
1150
- "f1-score": 0.9068216463414633,
1151
- "support": 978.0
1152
  },
1153
  "weighted avg": {
1154
- "precision": 0.972521566283546,
1155
- "recall": 0.9713701431492843,
1156
- "f1-score": 0.9718459305950421,
1157
- "support": 978.0
1158
  }
1159
  },
1160
- "roc_auc": 0.9661086157615353,
1161
- "average_precision": 0.8539255147550682
1162
  },
1163
  "test_default_0_5": {
1164
  "threshold": 0.5,
1165
- "accuracy": 0.9507874015748031,
1166
- "precision": 0.7767857142857143,
1167
- "recall": 0.7767857142857143,
1168
- "f1": 0.7767857142857143,
1169
  "confusion_matrix": [
1170
  [
1171
- 879,
1172
- 25
1173
  ],
1174
  [
1175
- 25,
1176
- 87
1177
  ]
1178
  ],
1179
  "classification_report": {
1180
  "NOT_RELEVANT": {
1181
- "precision": 0.9723451327433629,
1182
- "recall": 0.9723451327433629,
1183
- "f1-score": 0.9723451327433629,
1184
- "support": 904.0
1185
  },
1186
  "RELEVANT": {
1187
- "precision": 0.7767857142857143,
1188
- "recall": 0.7767857142857143,
1189
- "f1-score": 0.7767857142857143,
1190
- "support": 112.0
1191
  },
1192
- "accuracy": 0.9507874015748031,
1193
  "macro avg": {
1194
- "precision": 0.8745654235145386,
1195
- "recall": 0.8745654235145386,
1196
- "f1-score": 0.8745654235145386,
1197
- "support": 1016.0
1198
  },
1199
  "weighted avg": {
1200
- "precision": 0.9507874015748031,
1201
- "recall": 0.9507874015748031,
1202
- "f1-score": 0.9507874015748031,
1203
- "support": 1016.0
1204
  }
1205
  },
1206
- "roc_auc": 0.9682512247155499,
1207
- "average_precision": 0.8171206633671375
1208
  },
1209
  "test_optimal_threshold": {
1210
- "threshold": 0.4710787534713745,
1211
- "accuracy": 0.9498031496062992,
1212
- "precision": 0.7699115044247787,
1213
- "recall": 0.7767857142857143,
1214
- "f1": 0.7733333333333333,
1215
  "confusion_matrix": [
1216
  [
1217
- 878,
1218
- 26
1219
  ],
1220
  [
1221
- 25,
1222
- 87
1223
  ]
1224
  ],
1225
  "classification_report": {
1226
  "NOT_RELEVANT": {
1227
- "precision": 0.9723145071982281,
1228
- "recall": 0.9712389380530974,
1229
- "f1-score": 0.9717764250138351,
1230
- "support": 904.0
1231
  },
1232
  "RELEVANT": {
1233
- "precision": 0.7699115044247787,
1234
- "recall": 0.7767857142857143,
1235
- "f1-score": 0.7733333333333333,
1236
- "support": 112.0
1237
  },
1238
- "accuracy": 0.9498031496062992,
1239
  "macro avg": {
1240
- "precision": 0.8711130058115034,
1241
- "recall": 0.8740123261694058,
1242
- "f1-score": 0.8725548791735842,
1243
- "support": 1016.0
1244
  },
1245
  "weighted avg": {
1246
- "precision": 0.9500023651602102,
1247
- "recall": 0.9498031496062992,
1248
- "f1-score": 0.9499008086081104,
1249
- "support": 1016.0
1250
  }
1251
  },
1252
- "roc_auc": 0.9682512247155499,
1253
- "average_precision": 0.8171206633671375
1254
  }
1255
  }
1256
  ]
 
1
  {
2
+ "created_at": "2026-06-09T23:58:45.600559+00:00",
3
  "config": {
4
  "hf_dataset": "faodl/amis-agri-utilization",
5
  "hf_subset": null,
6
+ "hf_revision": "main",
7
  "train_split": "train",
8
  "validation_split": "validation",
9
  "test_split": "test",
 
44
  },
45
  "dataset_summary": {
46
  "train": {
47
+ "rows": 9753,
48
  "labels": {
49
+ "0": 8950,
50
+ "1": 803
51
  },
52
+ "unique_groups": 4987,
53
+ "text_length_mean": 696.3964933866503,
54
  "text_length_median": 794.0
55
  },
56
  "validation": {
57
+ "rows": 2084,
58
  "labels": {
59
+ "0": 1885,
60
+ "1": 199
61
  },
62
+ "unique_groups": 1069,
63
+ "text_length_mean": 700.8267754318618,
64
  "text_length_median": 794.0
65
  },
66
  "test": {
67
+ "rows": 2086,
68
  "labels": {
69
+ "0": 1957,
70
+ "1": 129
71
  },
72
+ "unique_groups": 1069,
73
+ "text_length_mean": 701.6332694151486,
74
  "text_length_median": 794.0
75
  }
76
  },
 
81
  "artifact_dir": "/content/agri-utilization-classifier/baselines/logistic",
82
  "artifact_file": "/content/agri-utilization-classifier/baselines/logistic/logistic_tfidf.joblib",
83
  "validation_best_threshold": {
84
+ "threshold": 0.36023362771573536,
85
+ "f1": 0.48928571428571427,
86
+ "precision": 0.37950138504155123,
87
+ "recall": 0.6884422110552764
88
  },
89
  "validation_default_0_5": {
90
  "threshold": 0.5,
91
+ "accuracy": 0.9011516314779271,
92
+ "precision": 0.4816753926701571,
93
+ "recall": 0.4623115577889447,
94
+ "f1": 0.4717948717948718,
95
  "confusion_matrix": [
96
  [
97
+ 1786,
98
+ 99
99
  ],
100
  [
101
+ 107,
102
+ 92
103
  ]
104
  ],
105
  "classification_report": {
106
  "NOT_RELEVANT": {
107
+ "precision": 0.9434759640781828,
108
+ "recall": 0.9474801061007958,
109
+ "f1-score": 0.9454737956590789,
110
+ "support": 1885.0
111
  },
112
  "RELEVANT": {
113
+ "precision": 0.4816753926701571,
114
+ "recall": 0.4623115577889447,
115
+ "f1-score": 0.4717948717948718,
116
+ "support": 199.0
117
  },
118
+ "accuracy": 0.9011516314779271,
119
  "macro avg": {
120
+ "precision": 0.7125756783741699,
121
+ "recall": 0.7048958319448703,
122
+ "f1-score": 0.7086343337269754,
123
+ "support": 2084.0
124
  },
125
  "weighted avg": {
126
+ "precision": 0.8993788845627332,
127
+ "recall": 0.9011516314779271,
128
+ "f1-score": 0.9002424588793393,
129
+ "support": 2084.0
130
  }
131
  },
132
+ "roc_auc": 0.8674606454020767,
133
+ "average_precision": 0.4960554729220717
134
  },
135
  "validation_optimal_threshold": {
136
+ "threshold": 0.36023362771573536,
137
+ "accuracy": 0.8627639155470249,
138
+ "precision": 0.37950138504155123,
139
+ "recall": 0.6884422110552764,
140
+ "f1": 0.48928571428571427,
141
  "confusion_matrix": [
142
  [
143
+ 1661,
144
+ 224
145
  ],
146
  [
147
+ 62,
148
+ 137
149
  ]
150
  ],
151
  "classification_report": {
152
  "NOT_RELEVANT": {
153
+ "precision": 0.9640162507254788,
154
+ "recall": 0.8811671087533156,
155
+ "f1-score": 0.9207317073170732,
156
+ "support": 1885.0
157
  },
158
  "RELEVANT": {
159
+ "precision": 0.37950138504155123,
160
+ "recall": 0.6884422110552764,
161
+ "f1-score": 0.48928571428571427,
162
+ "support": 199.0
163
  },
164
+ "accuracy": 0.8627639155470249,
165
  "macro avg": {
166
+ "precision": 0.671758817883515,
167
+ "recall": 0.7848046599042959,
168
+ "f1-score": 0.7050087108013937,
169
+ "support": 2084.0
170
  },
171
  "weighted avg": {
172
+ "precision": 0.9082012515550846,
173
+ "recall": 0.8627639155470249,
174
+ "f1-score": 0.879533169594789,
175
+ "support": 2084.0
176
  }
177
  },
178
+ "roc_auc": 0.8674606454020767,
179
+ "average_precision": 0.4960554729220717
180
  },
181
  "test_default_0_5": {
182
  "threshold": 0.5,
183
+ "accuracy": 0.9175455417066155,
184
+ "precision": 0.3576158940397351,
185
+ "recall": 0.4186046511627907,
186
+ "f1": 0.38571428571428573,
187
  "confusion_matrix": [
188
  [
189
+ 1860,
190
+ 97
191
  ],
192
  [
193
+ 75,
194
+ 54
195
  ]
196
  ],
197
  "classification_report": {
198
  "NOT_RELEVANT": {
199
+ "precision": 0.9612403100775194,
200
+ "recall": 0.9504343382728666,
201
+ "f1-score": 0.9558067831449126,
202
+ "support": 1957.0
203
  },
204
  "RELEVANT": {
205
+ "precision": 0.3576158940397351,
206
+ "recall": 0.4186046511627907,
207
+ "f1-score": 0.38571428571428573,
208
+ "support": 129.0
209
  },
210
+ "accuracy": 0.9175455417066155,
211
  "macro avg": {
212
+ "precision": 0.6594281020586272,
213
+ "recall": 0.6845194947178287,
214
+ "f1-score": 0.6707605344295992,
215
+ "support": 2086.0
216
  },
217
  "weighted avg": {
218
+ "precision": 0.9239116668997274,
219
+ "recall": 0.9175455417066155,
220
+ "f1-score": 0.9205517821053388,
221
+ "support": 2086.0
222
  }
223
  },
224
+ "roc_auc": 0.8562663149180244,
225
+ "average_precision": 0.3981049318082447
226
  },
227
  "test_optimal_threshold": {
228
+ "threshold": 0.36023362771573536,
229
+ "accuracy": 0.8686481303930969,
230
+ "precision": 0.26688102893890675,
231
+ "recall": 0.6434108527131783,
232
+ "f1": 0.37727272727272726,
233
  "confusion_matrix": [
234
  [
235
+ 1729,
236
+ 228
237
  ],
238
  [
239
+ 46,
240
+ 83
241
  ]
242
  ],
243
  "classification_report": {
244
  "NOT_RELEVANT": {
245
+ "precision": 0.9740845070422535,
246
+ "recall": 0.883495145631068,
247
+ "f1-score": 0.9265809217577706,
248
+ "support": 1957.0
249
  },
250
  "RELEVANT": {
251
+ "precision": 0.26688102893890675,
252
+ "recall": 0.6434108527131783,
253
+ "f1-score": 0.37727272727272726,
254
+ "support": 129.0
255
  },
256
+ "accuracy": 0.8686481303930969,
257
  "macro avg": {
258
+ "precision": 0.6204827679905801,
259
+ "recall": 0.7634529991721232,
260
+ "f1-score": 0.651926824515249,
261
+ "support": 2086.0
262
  },
263
  "weighted avg": {
264
+ "precision": 0.9303504472745969,
265
+ "recall": 0.8686481303930969,
266
+ "f1-score": 0.8926112395484846,
267
+ "support": 2086.0
268
  }
269
  },
270
+ "roc_auc": 0.8562663149180244,
271
+ "average_precision": 0.3981049318082447
272
  }
273
  },
274
  {
 
277
  "artifact_dir": "/content/agri-utilization-classifier/baselines/xgboost",
278
  "artifact_file": "/content/agri-utilization-classifier/baselines/xgboost/xgboost_tfidf.joblib",
279
  "validation_best_threshold": {
280
+ "threshold": 0.10415865480899811,
281
+ "f1": 0.5354691075514874,
282
+ "precision": 0.49159663865546216,
283
+ "recall": 0.5879396984924623
284
  },
285
  "validation_default_0_5": {
286
  "threshold": 0.5,
287
+ "accuracy": 0.9189059500959693,
288
+ "precision": 0.7205882352941176,
289
+ "recall": 0.24623115577889448,
290
+ "f1": 0.36704119850187267,
291
  "confusion_matrix": [
292
  [
293
+ 1866,
294
+ 19
295
  ],
296
  [
297
+ 150,
298
+ 49
299
  ]
300
  ],
301
  "classification_report": {
302
  "NOT_RELEVANT": {
303
+ "precision": 0.9255952380952381,
304
+ "recall": 0.9899204244031831,
305
+ "f1-score": 0.9566777749295052,
306
+ "support": 1885.0
307
  },
308
  "RELEVANT": {
309
+ "precision": 0.7205882352941176,
310
+ "recall": 0.24623115577889448,
311
+ "f1-score": 0.36704119850187267,
312
+ "support": 199.0
313
  },
314
+ "accuracy": 0.9189059500959693,
315
  "macro avg": {
316
+ "precision": 0.823091736694678,
317
+ "recall": 0.6180757900910387,
318
+ "f1-score": 0.661859486715689,
319
+ "support": 2084.0
320
  },
321
  "weighted avg": {
322
+ "precision": 0.9060192335091427,
323
+ "recall": 0.9189059500959693,
324
+ "f1-score": 0.9003737064510509,
325
+ "support": 2084.0
326
  }
327
  },
328
+ "roc_auc": 0.8338682803940125,
329
+ "average_precision": 0.49303491730954946
330
  },
331
  "validation_optimal_threshold": {
332
+ "threshold": 0.10415865480899811,
333
+ "accuracy": 0.9025911708253359,
334
+ "precision": 0.49159663865546216,
335
+ "recall": 0.5879396984924623,
336
+ "f1": 0.5354691075514875,
337
  "confusion_matrix": [
338
  [
339
+ 1764,
340
+ 121
341
  ],
342
  [
343
+ 82,
344
+ 117
345
  ]
346
  ],
347
  "classification_report": {
348
  "NOT_RELEVANT": {
349
+ "precision": 0.9555796316359697,
350
+ "recall": 0.9358090185676392,
351
+ "f1-score": 0.9455909943714822,
352
+ "support": 1885.0
353
  },
354
  "RELEVANT": {
355
+ "precision": 0.49159663865546216,
356
+ "recall": 0.5879396984924623,
357
+ "f1-score": 0.5354691075514875,
358
+ "support": 199.0
359
  },
360
+ "accuracy": 0.9025911708253359,
361
  "macro avg": {
362
+ "precision": 0.723588135145716,
363
+ "recall": 0.7618743585300507,
364
+ "f1-score": 0.7405300509614848,
365
+ "support": 2084.0
366
  },
367
  "weighted avg": {
368
+ "precision": 0.9112741538993474,
369
+ "recall": 0.9025911708253359,
370
+ "f1-score": 0.906428683681857,
371
+ "support": 2084.0
372
  }
373
  },
374
+ "roc_auc": 0.8338682803940125,
375
+ "average_precision": 0.49303491730954946
376
  },
377
  "test_default_0_5": {
378
  "threshold": 0.5,
379
+ "accuracy": 0.950143815915628,
380
+ "precision": 0.7659574468085106,
381
+ "recall": 0.27906976744186046,
382
+ "f1": 0.4090909090909091,
383
  "confusion_matrix": [
384
  [
385
+ 1946,
386
+ 11
387
  ],
388
  [
389
+ 93,
390
+ 36
391
  ]
392
  ],
393
  "classification_report": {
394
  "NOT_RELEVANT": {
395
+ "precision": 0.9543894065718489,
396
+ "recall": 0.9943791517629024,
397
+ "f1-score": 0.973973973973974,
398
+ "support": 1957.0
399
  },
400
  "RELEVANT": {
401
+ "precision": 0.7659574468085106,
402
+ "recall": 0.27906976744186046,
403
+ "f1-score": 0.4090909090909091,
404
+ "support": 129.0
405
  },
406
+ "accuracy": 0.950143815915628,
407
  "macro avg": {
408
+ "precision": 0.8601734266901797,
409
+ "recall": 0.6367244596023814,
410
+ "f1-score": 0.6915324415324415,
411
+ "support": 2086.0
412
  },
413
  "weighted avg": {
414
+ "precision": 0.9427366151962637,
415
+ "recall": 0.950143815915628,
416
+ "f1-score": 0.9390411286384441,
417
+ "support": 2086.0
418
  }
419
  },
420
+ "roc_auc": 0.8210657033190336,
421
+ "average_precision": 0.47107353711983707
422
  },
423
  "test_optimal_threshold": {
424
+ "threshold": 0.10415865480899811,
425
+ "accuracy": 0.9065196548418025,
426
+ "precision": 0.34285714285714286,
427
+ "recall": 0.5581395348837209,
428
+ "f1": 0.4247787610619469,
429
  "confusion_matrix": [
430
  [
431
+ 1819,
432
+ 138
433
  ],
434
  [
435
+ 57,
436
+ 72
437
  ]
438
  ],
439
  "classification_report": {
440
  "NOT_RELEVANT": {
441
+ "precision": 0.9696162046908315,
442
+ "recall": 0.9294839039345938,
443
+ "f1-score": 0.9491260109574745,
444
+ "support": 1957.0
445
  },
446
  "RELEVANT": {
447
+ "precision": 0.34285714285714286,
448
+ "recall": 0.5581395348837209,
449
+ "f1-score": 0.4247787610619469,
450
+ "support": 129.0
451
  },
452
+ "accuracy": 0.9065196548418025,
453
  "macro avg": {
454
+ "precision": 0.6562366737739872,
455
+ "recall": 0.7438117194091574,
456
+ "f1-score": 0.6869523860097106,
457
+ "support": 2086.0
458
  },
459
  "weighted avg": {
460
+ "precision": 0.9308568954978566,
461
+ "recall": 0.9065196548418025,
462
+ "f1-score": 0.9166999346216532,
463
+ "support": 2086.0
464
  }
465
  },
466
+ "roc_auc": 0.8210657033190336,
467
+ "average_precision": 0.47107353711983707
468
  }
469
  },
470
  {
 
474
  "artifact_dir": "/content/agri-utilization-classifier/baselines/embedding-logistic",
475
  "artifact_file": "/content/agri-utilization-classifier/baselines/embedding-logistic/embedding-logistic.joblib",
476
  "validation_best_threshold": {
477
+ "threshold": 0.7262281775474548,
478
+ "f1": 0.6881720430107527,
479
+ "precision": 0.6015037593984962,
480
+ "recall": 0.8040201005025126
481
  },
482
  "validation_default_0_5": {
483
  "threshold": 0.5,
484
+ "accuracy": 0.8953934740882917,
485
+ "precision": 0.473972602739726,
486
+ "recall": 0.8693467336683417,
487
+ "f1": 0.6134751773049646,
488
  "confusion_matrix": [
489
  [
490
+ 1693,
491
+ 192
492
  ],
493
  [
494
+ 26,
495
+ 173
496
  ]
497
  ],
498
  "classification_report": {
499
  "NOT_RELEVANT": {
500
+ "precision": 0.9848749272833043,
501
+ "recall": 0.8981432360742706,
502
+ "f1-score": 0.939511653718091,
503
+ "support": 1885.0
504
  },
505
  "RELEVANT": {
506
+ "precision": 0.473972602739726,
507
+ "recall": 0.8693467336683417,
508
+ "f1-score": 0.6134751773049646,
509
+ "support": 199.0
510
  },
511
+ "accuracy": 0.8953934740882917,
512
  "macro avg": {
513
+ "precision": 0.7294237650115152,
514
+ "recall": 0.8837449848713061,
515
+ "f1-score": 0.7764934155115277,
516
+ "support": 2084.0
517
  },
518
  "weighted avg": {
519
+ "precision": 0.9360891486920508,
520
+ "recall": 0.8953934740882917,
521
+ "f1-score": 0.9083786120644383,
522
+ "support": 2084.0
523
  }
524
  },
525
+ "roc_auc": 0.951833437745758,
526
+ "average_precision": 0.6518934466014814
527
  },
528
  "validation_optimal_threshold": {
529
+ "threshold": 0.7262281775474548,
530
+ "accuracy": 0.9304222648752399,
531
+ "precision": 0.6015037593984962,
532
+ "recall": 0.8040201005025126,
533
+ "f1": 0.6881720430107527,
534
  "confusion_matrix": [
535
  [
536
+ 1779,
537
+ 106
538
  ],
539
  [
540
+ 39,
541
+ 160
542
  ]
543
  ],
544
  "classification_report": {
545
  "NOT_RELEVANT": {
546
+ "precision": 0.9785478547854786,
547
+ "recall": 0.9437665782493369,
548
+ "f1-score": 0.9608425600864164,
549
+ "support": 1885.0
550
  },
551
  "RELEVANT": {
552
+ "precision": 0.6015037593984962,
553
+ "recall": 0.8040201005025126,
554
+ "f1-score": 0.6881720430107527,
555
+ "support": 199.0
556
  },
557
+ "accuracy": 0.9304222648752399,
558
  "macro avg": {
559
+ "precision": 0.7900258070919874,
560
+ "recall": 0.8738933393759247,
561
+ "f1-score": 0.8245073015485846,
562
+ "support": 2084.0
563
  },
564
  "weighted avg": {
565
+ "precision": 0.9425441239879693,
566
+ "recall": 0.9304222648752399,
567
+ "f1-score": 0.9348054041852374,
568
+ "support": 2084.0
569
  }
570
  },
571
+ "roc_auc": 0.951833437745758,
572
+ "average_precision": 0.6518934466014814
573
  },
574
  "test_default_0_5": {
575
  "threshold": 0.5,
576
+ "accuracy": 0.8911792905081496,
577
+ "precision": 0.3496932515337423,
578
+ "recall": 0.8837209302325582,
579
+ "f1": 0.5010989010989011,
580
  "confusion_matrix": [
581
  [
582
+ 1745,
583
+ 212
584
  ],
585
  [
586
+ 15,
587
+ 114
588
  ]
589
  ],
590
  "classification_report": {
591
  "NOT_RELEVANT": {
592
+ "precision": 0.9914772727272727,
593
+ "recall": 0.8916709248850281,
594
+ "f1-score": 0.9389292440139898,
595
+ "support": 1957.0
596
  },
597
  "RELEVANT": {
598
+ "precision": 0.3496932515337423,
599
+ "recall": 0.8837209302325582,
600
+ "f1-score": 0.5010989010989011,
601
+ "support": 129.0
602
  },
603
+ "accuracy": 0.8911792905081496,
604
  "macro avg": {
605
+ "precision": 0.6705852621305075,
606
+ "recall": 0.8876959275587931,
607
+ "f1-score": 0.7200140725564455,
608
+ "support": 2086.0
609
  },
610
  "weighted avg": {
611
+ "precision": 0.9517888073706259,
612
+ "recall": 0.8911792905081496,
613
+ "f1-score": 0.911853446201887,
614
+ "support": 2086.0
615
  }
616
  },
617
+ "roc_auc": 0.9507987625419385,
618
+ "average_precision": 0.5430515103343022
619
  },
620
  "test_optimal_threshold": {
621
+ "threshold": 0.7262281775474548,
622
+ "accuracy": 0.9285714285714286,
623
+ "precision": 0.4494949494949495,
624
+ "recall": 0.689922480620155,
625
+ "f1": 0.5443425076452599,
626
  "confusion_matrix": [
627
  [
628
+ 1848,
629
+ 109
630
  ],
631
  [
632
+ 40,
633
+ 89
634
  ]
635
  ],
636
  "classification_report": {
637
  "NOT_RELEVANT": {
638
+ "precision": 0.9788135593220338,
639
+ "recall": 0.9443025038323966,
640
+ "f1-score": 0.9612483745123537,
641
+ "support": 1957.0
642
  },
643
  "RELEVANT": {
644
+ "precision": 0.4494949494949495,
645
+ "recall": 0.689922480620155,
646
+ "f1-score": 0.5443425076452599,
647
+ "support": 129.0
648
  },
649
+ "accuracy": 0.9285714285714286,
650
  "macro avg": {
651
+ "precision": 0.7141542544084917,
652
+ "recall": 0.8171124922262758,
653
+ "f1-score": 0.7527954410788068,
654
+ "support": 2086.0
655
  },
656
  "weighted avg": {
657
+ "precision": 0.9460800498936092,
658
+ "recall": 0.9285714285714286,
659
+ "f1-score": 0.9354665639534586,
660
+ "support": 2086.0
661
  }
662
  },
663
+ "roc_auc": 0.9507987625419385,
664
+ "average_precision": 0.5430515103343022
665
  }
666
  },
667
  {
 
671
  "artifact_dir": "/content/agri-utilization-classifier/baselines/embedding-svm",
672
  "artifact_file": "/content/agri-utilization-classifier/baselines/embedding-svm/embedding-svm.joblib",
673
  "validation_best_threshold": {
674
+ "threshold": 0.24490298824052611,
675
+ "f1": 0.7004608294930875,
676
+ "precision": 0.6468085106382979,
677
+ "recall": 0.7638190954773869
678
  },
679
  "validation_default_0_5": {
680
  "threshold": 0.5,
681
+ "accuracy": 0.9313819577735125,
682
+ "precision": 0.7121212121212122,
683
+ "recall": 0.4723618090452261,
684
+ "f1": 0.56797583081571,
685
  "confusion_matrix": [
686
  [
687
+ 1847,
688
+ 38
689
  ],
690
  [
691
+ 105,
692
+ 94
693
  ]
694
  ],
695
  "classification_report": {
696
  "NOT_RELEVANT": {
697
+ "precision": 0.9462090163934426,
698
+ "recall": 0.979840848806366,
699
+ "f1-score": 0.9627313004951785,
700
+ "support": 1885.0
701
  },
702
  "RELEVANT": {
703
+ "precision": 0.7121212121212122,
704
+ "recall": 0.4723618090452261,
705
+ "f1-score": 0.56797583081571,
706
+ "support": 199.0
707
  },
708
+ "accuracy": 0.9313819577735125,
709
  "macro avg": {
710
+ "precision": 0.8291651142573273,
711
+ "recall": 0.726101328925796,
712
+ "f1-score": 0.7653535656554442,
713
+ "support": 2084.0
714
  },
715
  "weighted avg": {
716
+ "precision": 0.9238561022618813,
717
+ "recall": 0.9313819577735125,
718
+ "f1-score": 0.9250363204250182,
719
+ "support": 2084.0
720
  }
721
  },
722
+ "roc_auc": 0.9535662396864961,
723
+ "average_precision": 0.6698106173212027
724
  },
725
  "validation_optimal_threshold": {
726
+ "threshold": 0.24490298824052611,
727
+ "accuracy": 0.9376199616122841,
728
+ "precision": 0.6468085106382979,
729
+ "recall": 0.7638190954773869,
730
+ "f1": 0.7004608294930875,
731
  "confusion_matrix": [
732
  [
733
+ 1802,
734
+ 83
735
  ],
736
  [
737
+ 47,
738
+ 152
739
  ]
740
  ],
741
  "classification_report": {
742
  "NOT_RELEVANT": {
743
+ "precision": 0.9745808545159546,
744
+ "recall": 0.9559681697612732,
745
+ "f1-score": 0.9651847884306374,
746
+ "support": 1885.0
747
  },
748
  "RELEVANT": {
749
+ "precision": 0.6468085106382979,
750
+ "recall": 0.7638190954773869,
751
+ "f1-score": 0.7004608294930875,
752
+ "support": 199.0
753
  },
754
+ "accuracy": 0.9376199616122841,
755
  "macro avg": {
756
+ "precision": 0.8106946825771262,
757
+ "recall": 0.85989363261933,
758
+ "f1-score": 0.8328228089618624,
759
+ "support": 2084.0
760
  },
761
  "weighted avg": {
762
+ "precision": 0.9432820558443357,
763
+ "recall": 0.9376199616122841,
764
+ "f1-score": 0.9399064449428386,
765
+ "support": 2084.0
766
  }
767
  },
768
+ "roc_auc": 0.9535662396864961,
769
+ "average_precision": 0.6698106173212027
770
  },
771
  "test_default_0_5": {
772
  "threshold": 0.5,
773
+ "accuracy": 0.9482262703739214,
774
+ "precision": 0.6060606060606061,
775
+ "recall": 0.46511627906976744,
776
+ "f1": 0.5263157894736842,
777
  "confusion_matrix": [
778
  [
779
+ 1918,
780
+ 39
781
  ],
782
  [
783
+ 69,
784
+ 60
785
  ]
786
  ],
787
  "classification_report": {
788
  "NOT_RELEVANT": {
789
+ "precision": 0.96527428283845,
790
+ "recall": 0.9800715380684721,
791
+ "f1-score": 0.9726166328600405,
792
+ "support": 1957.0
793
  },
794
  "RELEVANT": {
795
+ "precision": 0.6060606060606061,
796
+ "recall": 0.46511627906976744,
797
+ "f1-score": 0.5263157894736842,
798
+ "support": 129.0
799
  },
800
+ "accuracy": 0.9482262703739214,
801
  "macro avg": {
802
+ "precision": 0.785667444449528,
803
+ "recall": 0.7225939085691198,
804
+ "f1-score": 0.7494662111668624,
805
+ "support": 2086.0
806
  },
807
  "weighted avg": {
808
+ "precision": 0.9430602059907309,
809
+ "recall": 0.9482262703739214,
810
+ "f1-score": 0.9450170121520635,
811
+ "support": 2086.0
812
  }
813
  },
814
+ "roc_auc": 0.9554768610394807,
815
+ "average_precision": 0.5656727259491919
816
  },
817
  "test_optimal_threshold": {
818
+ "threshold": 0.24490298824052611,
819
+ "accuracy": 0.9372003835091084,
820
+ "precision": 0.4943181818181818,
821
+ "recall": 0.6744186046511628,
822
+ "f1": 0.5704918032786885,
823
  "confusion_matrix": [
824
  [
825
+ 1868,
826
+ 89
827
  ],
828
  [
829
+ 42,
830
+ 87
831
  ]
832
  ],
833
  "classification_report": {
834
  "NOT_RELEVANT": {
835
+ "precision": 0.9780104712041885,
836
+ "recall": 0.9545222278998468,
837
+ "f1-score": 0.9661236100336178,
838
+ "support": 1957.0
839
  },
840
  "RELEVANT": {
841
+ "precision": 0.4943181818181818,
842
+ "recall": 0.6744186046511628,
843
+ "f1-score": 0.5704918032786885,
844
+ "support": 129.0
845
  },
846
+ "accuracy": 0.9372003835091084,
847
  "macro avg": {
848
+ "precision": 0.7361643265111851,
849
+ "recall": 0.8144704162755048,
850
+ "f1-score": 0.7683077066561532,
851
+ "support": 2086.0
852
  },
853
  "weighted avg": {
854
+ "precision": 0.9480985319276809,
855
+ "recall": 0.9372003835091084,
856
+ "f1-score": 0.9416574053014098,
857
+ "support": 2086.0
858
  }
859
  },
860
+ "roc_auc": 0.9554768610394807,
861
+ "average_precision": 0.5656727259491919
862
  }
863
  },
864
  {
 
868
  "artifact_dir": "/content/agri-utilization-classifier/baselines/embedding-lightgbm",
869
  "artifact_file": "/content/agri-utilization-classifier/baselines/embedding-lightgbm/embedding-lightgbm.joblib",
870
  "validation_best_threshold": {
871
+ "threshold": 0.08937255699326424,
872
+ "f1": 0.7008547008547008,
873
+ "precision": 0.6096654275092936,
874
+ "recall": 0.8241206030150754
875
  },
876
  "validation_default_0_5": {
877
  "threshold": 0.5,
878
+ "accuracy": 0.9366602687140115,
879
+ "precision": 0.6810810810810811,
880
+ "recall": 0.6331658291457286,
881
+ "f1": 0.65625,
882
  "confusion_matrix": [
883
  [
884
+ 1826,
885
+ 59
886
  ],
887
  [
888
+ 73,
889
+ 126
890
  ]
891
  ],
892
  "classification_report": {
893
  "NOT_RELEVANT": {
894
+ "precision": 0.9615587151132174,
895
+ "recall": 0.9687002652519894,
896
+ "f1-score": 0.9651162790697675,
897
+ "support": 1885.0
898
  },
899
  "RELEVANT": {
900
+ "precision": 0.6810810810810811,
901
+ "recall": 0.6331658291457286,
902
+ "f1-score": 0.65625,
903
+ "support": 199.0
904
  },
905
+ "accuracy": 0.9366602687140115,
906
  "macro avg": {
907
+ "precision": 0.8213198980971492,
908
+ "recall": 0.8009330471988589,
909
+ "f1-score": 0.8106831395348837,
910
+ "support": 2084.0
911
  },
912
  "weighted avg": {
913
+ "precision": 0.9347760619594769,
914
+ "recall": 0.9366602687140115,
915
+ "f1-score": 0.9356228100031246,
916
+ "support": 2084.0
917
  }
918
  },
919
+ "roc_auc": 0.9537208589365928,
920
+ "average_precision": 0.6689450933330522
921
  },
922
  "validation_optimal_threshold": {
923
+ "threshold": 0.08937255699326424,
924
+ "accuracy": 0.9328214971209213,
925
+ "precision": 0.6096654275092936,
926
+ "recall": 0.8241206030150754,
927
+ "f1": 0.7008547008547008,
928
  "confusion_matrix": [
929
  [
930
+ 1780,
931
+ 105
932
  ],
933
  [
934
+ 35,
935
+ 164
936
  ]
937
  ],
938
  "classification_report": {
939
  "NOT_RELEVANT": {
940
+ "precision": 0.9807162534435262,
941
+ "recall": 0.9442970822281167,
942
+ "f1-score": 0.9621621621621622,
943
+ "support": 1885.0
944
  },
945
  "RELEVANT": {
946
+ "precision": 0.6096654275092936,
947
+ "recall": 0.8241206030150754,
948
+ "f1-score": 0.7008547008547008,
949
+ "support": 199.0
950
  },
951
+ "accuracy": 0.9328214971209213,
952
  "macro avg": {
953
+ "precision": 0.7951908404764099,
954
+ "recall": 0.884208842621596,
955
+ "f1-score": 0.8315084315084316,
956
+ "support": 2084.0
957
  },
958
  "weighted avg": {
959
+ "precision": 0.945284816610075,
960
+ "recall": 0.9328214971209213,
961
+ "f1-score": 0.9372100581313634,
962
+ "support": 2084.0
963
  }
964
  },
965
+ "roc_auc": 0.9537208589365928,
966
+ "average_precision": 0.6689450933330522
967
  },
968
  "test_default_0_5": {
969
  "threshold": 0.5,
970
+ "accuracy": 0.9482262703739214,
971
+ "precision": 0.5789473684210527,
972
+ "recall": 0.5968992248062015,
973
+ "f1": 0.5877862595419847,
974
  "confusion_matrix": [
975
  [
976
+ 1901,
977
+ 56
978
  ],
979
  [
980
+ 52,
981
+ 77
982
  ]
983
  ],
984
  "classification_report": {
985
  "NOT_RELEVANT": {
986
+ "precision": 0.9733742959549411,
987
+ "recall": 0.9713847726111395,
988
+ "f1-score": 0.972378516624041,
989
+ "support": 1957.0
990
  },
991
  "RELEVANT": {
992
+ "precision": 0.5789473684210527,
993
+ "recall": 0.5968992248062015,
994
+ "f1-score": 0.5877862595419847,
995
+ "support": 129.0
996
  },
997
+ "accuracy": 0.9482262703739214,
998
  "macro avg": {
999
+ "precision": 0.7761608321879969,
1000
+ "recall": 0.7841419987086705,
1001
+ "f1-score": 0.7800823880830128,
1002
+ "support": 2086.0
1003
  },
1004
  "weighted avg": {
1005
+ "precision": 0.948982601970343,
1006
+ "recall": 0.9482262703739214,
1007
+ "f1-score": 0.9485950069578927,
1008
+ "support": 2086.0
1009
  }
1010
  },
1011
+ "roc_auc": 0.9484498104597687,
1012
+ "average_precision": 0.5852438412769653
1013
  },
1014
  "test_optimal_threshold": {
1015
+ "threshold": 0.08937255699326424,
1016
+ "accuracy": 0.9324065196548418,
1017
+ "precision": 0.4716981132075472,
1018
+ "recall": 0.7751937984496124,
1019
+ "f1": 0.5865102639296188,
1020
  "confusion_matrix": [
1021
  [
1022
+ 1845,
1023
+ 112
1024
  ],
1025
  [
1026
+ 29,
1027
+ 100
1028
  ]
1029
  ],
1030
  "classification_report": {
1031
  "NOT_RELEVANT": {
1032
+ "precision": 0.9845250800426895,
1033
+ "recall": 0.942769545222279,
1034
+ "f1-score": 0.9631949882537196,
1035
+ "support": 1957.0
1036
  },
1037
  "RELEVANT": {
1038
+ "precision": 0.4716981132075472,
1039
+ "recall": 0.7751937984496124,
1040
+ "f1-score": 0.5865102639296188,
1041
+ "support": 129.0
1042
  },
1043
+ "accuracy": 0.9324065196548418,
1044
  "macro avg": {
1045
+ "precision": 0.7281115966251184,
1046
+ "recall": 0.8589816718359458,
1047
+ "f1-score": 0.7748526260916693,
1048
+ "support": 2086.0
1049
  },
1050
  "weighted avg": {
1051
+ "precision": 0.9528114277312162,
1052
+ "recall": 0.9324065196548418,
1053
+ "f1-score": 0.9399004870850672,
1054
+ "support": 2086.0
1055
  }
1056
  },
1057
+ "roc_auc": 0.9484498104597687,
1058
+ "average_precision": 0.5852438412769653
1059
  }
1060
  },
1061
  {
 
1063
  "model_name": "FacebookAI/xlm-roberta-base",
1064
  "artifact_dir": "/content/agri-utilization-classifier/transformer",
1065
  "validation_best_threshold": {
1066
+ "threshold": 0.5436205267906189,
1067
+ "f1": 0.6983372921615203,
1068
+ "precision": 0.6621621621621622,
1069
+ "recall": 0.7386934673366834
1070
  },
1071
  "validation_default_0_5": {
1072
  "threshold": 0.5,
1073
+ "accuracy": 0.9376199616122841,
1074
+ "precision": 0.6533333333333333,
1075
+ "recall": 0.7386934673366834,
1076
+ "f1": 0.6933962264150944,
1077
  "confusion_matrix": [
1078
  [
1079
+ 1807,
1080
+ 78
1081
  ],
1082
  [
1083
+ 52,
1084
+ 147
1085
  ]
1086
  ],
1087
  "classification_report": {
1088
  "NOT_RELEVANT": {
1089
+ "precision": 0.972027972027972,
1090
+ "recall": 0.9586206896551724,
1091
+ "f1-score": 0.9652777777777778,
1092
+ "support": 1885.0
1093
  },
1094
  "RELEVANT": {
1095
+ "precision": 0.6533333333333333,
1096
+ "recall": 0.7386934673366834,
1097
+ "f1-score": 0.6933962264150944,
1098
+ "support": 199.0
1099
  },
1100
+ "accuracy": 0.9376199616122841,
1101
  "macro avg": {
1102
+ "precision": 0.8126806526806527,
1103
+ "recall": 0.848657078495928,
1104
+ "f1-score": 0.8293370020964361,
1105
+ "support": 2084.0
1106
  },
1107
  "weighted avg": {
1108
+ "precision": 0.9415959983714303,
1109
+ "recall": 0.9376199616122841,
1110
+ "f1-score": 0.9393159597733756,
1111
+ "support": 2084.0
1112
  }
1113
  },
1114
+ "roc_auc": 0.9544433040534235,
1115
+ "average_precision": 0.7261778718910262
1116
  },
1117
  "validation_optimal_threshold": {
1118
+ "threshold": 0.5436205267906189,
1119
+ "accuracy": 0.9390595009596929,
1120
+ "precision": 0.6621621621621622,
1121
+ "recall": 0.7386934673366834,
1122
+ "f1": 0.6983372921615202,
1123
  "confusion_matrix": [
1124
  [
1125
+ 1810,
1126
+ 75
1127
  ],
1128
  [
1129
+ 52,
1130
+ 147
1131
  ]
1132
  ],
1133
  "classification_report": {
1134
  "NOT_RELEVANT": {
1135
+ "precision": 0.9720730397422127,
1136
+ "recall": 0.9602122015915119,
1137
+ "f1-score": 0.9661062183079797,
1138
+ "support": 1885.0
1139
  },
1140
  "RELEVANT": {
1141
+ "precision": 0.6621621621621622,
1142
+ "recall": 0.7386934673366834,
1143
+ "f1-score": 0.6983372921615202,
1144
+ "support": 199.0
1145
  },
1146
+ "accuracy": 0.9390595009596929,
1147
  "macro avg": {
1148
+ "precision": 0.8171176009521874,
1149
+ "recall": 0.8494528344640977,
1150
+ "f1-score": 0.83222175523475,
1151
+ "support": 2084.0
1152
  },
1153
  "weighted avg": {
1154
+ "precision": 0.9424798225452693,
1155
+ "recall": 0.9390595009596929,
1156
+ "f1-score": 0.9405371125962977,
1157
+ "support": 2084.0
1158
  }
1159
  },
1160
+ "roc_auc": 0.9544433040534235,
1161
+ "average_precision": 0.7261778718910262
1162
  },
1163
  "test_default_0_5": {
1164
  "threshold": 0.5,
1165
+ "accuracy": 0.9429530201342282,
1166
+ "precision": 0.532051282051282,
1167
+ "recall": 0.6434108527131783,
1168
+ "f1": 0.5824561403508772,
1169
  "confusion_matrix": [
1170
  [
1171
+ 1884,
1172
+ 73
1173
  ],
1174
  [
1175
+ 46,
1176
+ 83
1177
  ]
1178
  ],
1179
  "classification_report": {
1180
  "NOT_RELEVANT": {
1181
+ "precision": 0.9761658031088083,
1182
+ "recall": 0.9626980071538068,
1183
+ "f1-score": 0.969385129920247,
1184
+ "support": 1957.0
1185
  },
1186
  "RELEVANT": {
1187
+ "precision": 0.532051282051282,
1188
+ "recall": 0.6434108527131783,
1189
+ "f1-score": 0.5824561403508772,
1190
+ "support": 129.0
1191
  },
1192
+ "accuracy": 0.9429530201342282,
1193
  "macro avg": {
1194
+ "precision": 0.7541085425800451,
1195
+ "recall": 0.8030544299334925,
1196
+ "f1-score": 0.7759206351355621,
1197
+ "support": 2086.0
1198
  },
1199
  "weighted avg": {
1200
+ "precision": 0.9487013864182902,
1201
+ "recall": 0.9429530201342282,
1202
+ "f1-score": 0.9454571147455353,
1203
+ "support": 2086.0
1204
  }
1205
  },
1206
+ "roc_auc": 0.9314030730472604,
1207
+ "average_precision": 0.4996827419378796
1208
  },
1209
  "test_optimal_threshold": {
1210
+ "threshold": 0.5436205267906189,
1211
+ "accuracy": 0.9424736337488016,
1212
+ "precision": 0.5290322580645161,
1213
+ "recall": 0.6356589147286822,
1214
+ "f1": 0.5774647887323944,
1215
  "confusion_matrix": [
1216
  [
1217
+ 1884,
1218
+ 73
1219
  ],
1220
  [
1221
+ 47,
1222
+ 82
1223
  ]
1224
  ],
1225
  "classification_report": {
1226
  "NOT_RELEVANT": {
1227
+ "precision": 0.9756602796478508,
1228
+ "recall": 0.9626980071538068,
1229
+ "f1-score": 0.9691358024691358,
1230
+ "support": 1957.0
1231
  },
1232
  "RELEVANT": {
1233
+ "precision": 0.5290322580645161,
1234
+ "recall": 0.6356589147286822,
1235
+ "f1-score": 0.5774647887323944,
1236
+ "support": 129.0
1237
  },
1238
+ "accuracy": 0.9424736337488016,
1239
  "macro avg": {
1240
+ "precision": 0.7523462688561835,
1241
+ "recall": 0.7991784609412445,
1242
+ "f1-score": 0.7733002956007651,
1243
+ "support": 2086.0
1244
  },
1245
  "weighted avg": {
1246
+ "precision": 0.948040425964126,
1247
+ "recall": 0.9424736337488016,
1248
+ "f1-score": 0.944914536518973,
1249
+ "support": 2086.0
1250
  }
1251
  },
1252
+ "roc_auc": 0.9314030730472604,
1253
+ "average_precision": 0.4996827419378796
1254
  }
1255
  }
1256
  ]
transformer/checkpoint-1220/config.json CHANGED
@@ -32,7 +32,7 @@
  "position_embedding_type": "absolute",
  "problem_type": "single_label_classification",
  "tie_word_embeddings": true,
- "transformers_version": "5.9.0",
+ "transformers_version": "5.10.2",
  "type_vocab_size": 1,
  "use_cache": false,
  "vocab_size": 250002
transformer/checkpoint-1220/model.safetensors CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:b248f60ff3e153b28949243967a2debde809912442c1ef5fe19d89dad891f1f9
3
  size 1112205008
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:68cfdbf41602bef3dad433cc9d7341026e15b6212a2de26b87cb1b08c92fe53a
3
  size 1112205008
transformer/checkpoint-1220/optimizer.pt CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:f731780f8bff3652e23ff4cf1692c96c1068919f515c85113ffd987765be34ce
3
  size 2224532875
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:b157f469e91a6d944690dc04579dbfd1df4cb7df1828e6737dc78b7100c6c9bd
3
  size 2224532875
transformer/checkpoint-1220/rng_state.pth CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:e47023fdf7fee85f2c66207ee2960719b8bf1b11c2d946d75e0d2fe33113c7ce
3
  size 14645
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:d539b8d76c8001da486adb60fe77a06b1ad89abe54968b00e74e6fe0a65f76a4
3
  size 14645
transformer/checkpoint-1220/scaler.pt CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:f2e11cad5f2deee13f6148971cf1c6ded27d5cbdc725a37902243981a6125a17
3
  size 1383
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:f0d6ec924d72ca240bed6d2851cb98eda1484e26fe7cbfd21e9c5ca01087e280
3
  size 1383
transformer/checkpoint-1220/scheduler.pt CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:f6adfd2a8e363fb5adf050a01658d698ef3da72d5e9b197063c5e3b6a0fe9333
3
  size 1465
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:a9b7df4afd103514fe9d65c0bed1c50f1ab1e4c492ac4aaea616afc5f70bab7e
3
  size 1465
transformer/checkpoint-1220/trainer_state.json CHANGED
@@ -1,8 +1,8 @@
1
  {
2
- "best_global_step": 915,
3
- "best_metric": 0.8220858895705522,
4
- "best_model_checkpoint": "/content/agri-utilization-classifier/transformer/checkpoint-915",
5
- "epoch": 4.0,
6
  "eval_steps": 500,
7
  "global_step": 1220,
8
  "is_hyper_param_search": false,
@@ -10,396 +10,370 @@
10
  "is_world_process_zero": true,
11
  "log_history": [
12
  {
13
- "epoch": 0.08196721311475409,
14
- "grad_norm": 6.055062770843506,
15
- "learning_rate": 3.157894736842105e-06,
16
- "loss": 0.62972900390625,
17
  "step": 25
18
  },
19
  {
20
- "epoch": 0.16393442622950818,
21
- "grad_norm": 10.6914701461792,
22
- "learning_rate": 6.447368421052632e-06,
23
- "loss": 0.44850738525390627,
24
  "step": 50
25
  },
26
  {
27
- "epoch": 0.2459016393442623,
28
- "grad_norm": 6.670228481292725,
29
- "learning_rate": 9.736842105263159e-06,
30
- "loss": 0.3566379165649414,
31
  "step": 75
32
  },
33
  {
34
- "epoch": 0.32786885245901637,
35
- "grad_norm": 2.589911937713623,
36
- "learning_rate": 1.3026315789473684e-05,
37
- "loss": 0.2718839645385742,
38
  "step": 100
39
  },
40
  {
41
- "epoch": 0.4098360655737705,
42
- "grad_norm": 22.02676773071289,
43
- "learning_rate": 1.6315789473684213e-05,
44
- "loss": 0.1922766876220703,
45
  "step": 125
46
  },
47
  {
48
- "epoch": 0.4918032786885246,
49
- "grad_norm": 2.6362855434417725,
50
- "learning_rate": 1.960526315789474e-05,
51
- "loss": 0.1837622833251953,
52
  "step": 150
53
  },
54
  {
55
- "epoch": 0.5737704918032787,
56
- "grad_norm": 3.478484630584717,
57
- "learning_rate": 1.9679533867443555e-05,
58
- "loss": 0.18766048431396484,
59
  "step": 175
60
  },
61
  {
62
- "epoch": 0.6557377049180327,
63
- "grad_norm": 8.077605247497559,
64
- "learning_rate": 1.9315367807720323e-05,
65
- "loss": 0.23830581665039063,
66
  "step": 200
67
  },
68
  {
69
- "epoch": 0.7377049180327869,
70
- "grad_norm": 0.7427046298980713,
71
- "learning_rate": 1.8951201747997088e-05,
72
- "loss": 0.30742517471313474,
73
  "step": 225
74
  },
75
  {
76
- "epoch": 0.819672131147541,
77
- "grad_norm": 36.34975051879883,
78
- "learning_rate": 1.8587035688273852e-05,
79
- "loss": 0.22336017608642578,
80
  "step": 250
81
  },
82
  {
83
- "epoch": 0.9016393442622951,
84
- "grad_norm": 5.215510845184326,
85
- "learning_rate": 1.822286962855062e-05,
86
- "loss": 0.13779294967651368,
87
  "step": 275
88
  },
89
  {
90
- "epoch": 0.9836065573770492,
91
- "grad_norm": 3.551121950149536,
92
- "learning_rate": 1.7858703568827385e-05,
93
- "loss": 0.19200111389160157,
94
  "step": 300
95
  },
96
  {
97
- "epoch": 1.0,
98
- "eval_accuracy": 0.9631901840490797,
99
- "eval_f1": 0.7721518987341772,
100
- "eval_loss": 0.1292734444141388,
101
- "eval_precision": 0.7721518987341772,
102
- "eval_recall": 0.7721518987341772,
103
- "eval_roc_auc": 0.9563720589684741,
104
- "eval_runtime": 3.3396,
105
- "eval_samples_per_second": 292.853,
106
- "eval_steps_per_second": 9.283,
107
- "step": 305
108
- },
109
- {
110
- "epoch": 1.0655737704918034,
111
- "grad_norm": 0.5402449369430542,
112
- "learning_rate": 1.7494537509104153e-05,
113
- "loss": 0.1241053295135498,
114
  "step": 325
115
  },
116
  {
117
- "epoch": 1.1475409836065573,
118
- "grad_norm": 4.476892948150635,
119
- "learning_rate": 1.7130371449380918e-05,
120
- "loss": 0.20724605560302733,
121
  "step": 350
122
  },
123
  {
124
- "epoch": 1.2295081967213115,
125
- "grad_norm": 0.46729782223701477,
126
- "learning_rate": 1.6766205389657686e-05,
127
- "loss": 0.13567353248596192,
128
  "step": 375
129
  },
130
  {
131
- "epoch": 1.3114754098360657,
132
- "grad_norm": 0.1852118819952011,
133
- "learning_rate": 1.640203932993445e-05,
134
- "loss": 0.13295170783996582,
135
  "step": 400
136
  },
137
  {
138
- "epoch": 1.3934426229508197,
139
- "grad_norm": 1.2681413888931274,
140
- "learning_rate": 1.603787327021122e-05,
141
- "loss": 0.2027936363220215,
142
  "step": 425
143
  },
144
  {
145
- "epoch": 1.4754098360655736,
146
- "grad_norm": 7.484091281890869,
147
- "learning_rate": 1.5673707210487983e-05,
148
- "loss": 0.12364128112792969,
149
  "step": 450
150
  },
151
  {
152
- "epoch": 1.5573770491803278,
153
- "grad_norm": 0.46489500999450684,
154
- "learning_rate": 1.530954115076475e-05,
155
- "loss": 0.14407362937927246,
156
  "step": 475
157
  },
158
  {
159
- "epoch": 1.639344262295082,
160
- "grad_norm": 0.20967872440814972,
161
- "learning_rate": 1.4945375091041516e-05,
162
- "loss": 0.12458925247192383,
163
  "step": 500
164
  },
165
  {
166
- "epoch": 1.721311475409836,
167
- "grad_norm": 0.1643747240304947,
168
- "learning_rate": 1.4581209031318282e-05,
169
- "loss": 0.21631996154785157,
170
  "step": 525
171
  },
172
  {
173
- "epoch": 1.8032786885245902,
174
- "grad_norm": 7.073329448699951,
175
- "learning_rate": 1.4217042971595047e-05,
176
- "loss": 0.16043865203857421,
177
  "step": 550
178
  },
179
  {
180
- "epoch": 1.8852459016393444,
181
- "grad_norm": 1.744958758354187,
182
- "learning_rate": 1.3852876911871815e-05,
183
- "loss": 0.0966644287109375,
184
  "step": 575
185
  },
186
  {
187
- "epoch": 1.9672131147540983,
188
- "grad_norm": 12.79035472869873,
189
- "learning_rate": 1.3488710852148582e-05,
190
- "loss": 0.15884541511535644,
191
  "step": 600
192
  },
193
  {
194
- "epoch": 2.0,
195
- "eval_accuracy": 0.9611451942740287,
196
- "eval_f1": 0.7432432432432432,
197
- "eval_loss": 0.13287827372550964,
198
- "eval_precision": 0.7971014492753623,
199
- "eval_recall": 0.6962025316455697,
200
- "eval_roc_auc": 0.9594697343039381,
201
- "eval_runtime": 3.2739,
202
- "eval_samples_per_second": 298.727,
203
- "eval_steps_per_second": 9.469,
204
  "step": 610
205
  },
206
  {
207
- "epoch": 2.0491803278688523,
208
- "grad_norm": 17.520444869995117,
209
- "learning_rate": 1.3124544792425346e-05,
210
- "loss": 0.08896012306213379,
211
  "step": 625
212
  },
213
  {
214
- "epoch": 2.1311475409836067,
215
- "grad_norm": 0.16623224318027496,
216
- "learning_rate": 1.2760378732702113e-05,
217
- "loss": 0.11752216339111328,
218
  "step": 650
219
  },
220
  {
221
- "epoch": 2.2131147540983607,
222
- "grad_norm": 0.20762814581394196,
223
- "learning_rate": 1.239621267297888e-05,
224
- "loss": 0.1193038272857666,
225
  "step": 675
226
  },
227
  {
228
- "epoch": 2.2950819672131146,
229
- "grad_norm": 0.1500111073255539,
230
- "learning_rate": 1.2032046613255645e-05,
231
- "loss": 0.0630855655670166,
232
  "step": 700
233
  },
234
  {
235
- "epoch": 2.3770491803278686,
236
- "grad_norm": 0.17727839946746826,
237
- "learning_rate": 1.1667880553532412e-05,
238
- "loss": 0.08730959892272949,
239
  "step": 725
240
  },
241
  {
242
- "epoch": 2.459016393442623,
243
- "grad_norm": 4.3997321128845215,
244
- "learning_rate": 1.1303714493809176e-05,
245
- "loss": 0.12114215850830078,
246
  "step": 750
247
  },
248
  {
249
- "epoch": 2.540983606557377,
250
- "grad_norm": 34.47224044799805,
251
- "learning_rate": 1.0939548434085944e-05,
252
- "loss": 0.11070786476135254,
253
  "step": 775
254
  },
255
  {
256
- "epoch": 2.6229508196721314,
257
- "grad_norm": 25.977081298828125,
258
- "learning_rate": 1.057538237436271e-05,
259
- "loss": 0.10845686912536621,
260
  "step": 800
261
  },
262
  {
263
- "epoch": 2.7049180327868854,
264
- "grad_norm": 0.1657736450433731,
265
- "learning_rate": 1.0211216314639475e-05,
266
- "loss": 0.1025285530090332,
267
  "step": 825
268
  },
269
  {
270
- "epoch": 2.7868852459016393,
271
- "grad_norm": 34.05498504638672,
272
- "learning_rate": 9.847050254916243e-06,
273
- "loss": 0.07825160026550293,
274
  "step": 850
275
  },
276
  {
277
- "epoch": 2.8688524590163933,
278
- "grad_norm": 0.2868161201477051,
279
- "learning_rate": 9.482884195193008e-06,
280
- "loss": 0.12041816711425782,
281
  "step": 875
282
  },
283
  {
284
- "epoch": 2.9508196721311473,
285
- "grad_norm": 0.19192977249622345,
286
- "learning_rate": 9.118718135469774e-06,
287
- "loss": 0.08709416389465333,
288
  "step": 900
289
  },
290
  {
291
- "epoch": 3.0,
292
- "eval_accuracy": 0.9703476482617587,
293
- "eval_f1": 0.8220858895705522,
294
- "eval_loss": 0.11163181066513062,
295
- "eval_precision": 0.7976190476190477,
296
- "eval_recall": 0.8481012658227848,
297
- "eval_roc_auc": 0.9661086157615353,
298
- "eval_runtime": 3.1733,
299
- "eval_samples_per_second": 308.193,
300
- "eval_steps_per_second": 9.769,
301
- "step": 915
302
- },
303
- {
304
- "epoch": 3.0327868852459017,
305
- "grad_norm": 1.0706992149353027,
306
- "learning_rate": 8.754552075746541e-06,
307
- "loss": 0.10751664161682128,
308
  "step": 925
309
  },
310
  {
311
- "epoch": 3.1147540983606556,
312
- "grad_norm": 0.12844231724739075,
313
- "learning_rate": 8.390386016023307e-06,
314
- "loss": 0.06818144798278808,
315
  "step": 950
316
  },
317
  {
318
- "epoch": 3.19672131147541,
319
- "grad_norm": 0.07692205160856247,
320
- "learning_rate": 8.026219956300074e-06,
321
- "loss": 0.12229555130004882,
322
  "step": 975
323
  },
324
  {
325
- "epoch": 3.278688524590164,
326
- "grad_norm": 1.773990511894226,
327
- "learning_rate": 7.66205389657684e-06,
328
- "loss": 0.06936595916748046,
329
  "step": 1000
330
  },
331
  {
332
- "epoch": 3.360655737704918,
333
- "grad_norm": 0.07844381034374237,
334
- "learning_rate": 7.2978878368536055e-06,
335
- "loss": 0.05219663143157959,
336
  "step": 1025
337
  },
338
  {
339
- "epoch": 3.442622950819672,
340
- "grad_norm": 12.502548217773438,
341
- "learning_rate": 6.933721777130372e-06,
342
- "loss": 0.06849228858947753,
343
  "step": 1050
344
  },
345
  {
346
- "epoch": 3.5245901639344264,
347
- "grad_norm": 1.6993861198425293,
348
- "learning_rate": 6.569555717407138e-06,
349
- "loss": 0.08783550262451172,
350
  "step": 1075
351
  },
352
  {
353
- "epoch": 3.6065573770491803,
354
- "grad_norm": 0.06551510095596313,
355
- "learning_rate": 6.2053896576839045e-06,
356
- "loss": 0.049420347213745115,
357
  "step": 1100
358
  },
359
  {
360
- "epoch": 3.6885245901639343,
361
- "grad_norm": 0.034276798367500305,
362
- "learning_rate": 5.84122359796067e-06,
363
- "loss": 0.05244039058685303,
364
  "step": 1125
365
  },
366
  {
367
- "epoch": 3.7704918032786887,
368
- "grad_norm": 10.901683807373047,
369
- "learning_rate": 5.477057538237437e-06,
370
- "loss": 0.06656317710876465,
371
  "step": 1150
372
  },
373
  {
374
- "epoch": 3.8524590163934427,
375
- "grad_norm": 2.3856894969940186,
376
- "learning_rate": 5.112891478514203e-06,
377
- "loss": 0.06277508735656738,
378
  "step": 1175
379
  },
380
  {
381
- "epoch": 3.9344262295081966,
382
- "grad_norm": 0.018699949607253075,
383
- "learning_rate": 4.748725418790969e-06,
384
- "loss": 0.046858911514282224,
385
  "step": 1200
386
  },
387
  {
388
- "epoch": 4.0,
389
- "eval_accuracy": 0.9662576687116564,
390
- "eval_f1": 0.8047337278106509,
391
- "eval_loss": 0.14547723531723022,
392
- "eval_precision": 0.7555555555555555,
393
- "eval_recall": 0.8607594936708861,
394
- "eval_roc_auc": 0.9600822291998141,
395
- "eval_runtime": 3.1883,
396
- "eval_samples_per_second": 306.745,
397
- "eval_steps_per_second": 9.723,
398
  "step": 1220
399
  }
400
  ],
401
  "logging_steps": 25,
402
- "max_steps": 1525,
403
  "num_input_tokens_seen": 0,
404
  "num_train_epochs": 5,
405
  "save_steps": 500,
@@ -424,7 +398,7 @@
424
  "attributes": {}
425
  }
426
  },
427
- "total_flos": 2566385233981440.0,
428
  "train_batch_size": 16,
429
  "trial_name": null,
430
  "trial_params": null
 
1
  {
2
+ "best_global_step": 610,
3
+ "best_metric": 0.6933962264150944,
4
+ "best_model_checkpoint": "/content/agri-utilization-classifier/transformer/checkpoint-610",
5
+ "epoch": 2.0,
6
  "eval_steps": 500,
7
  "global_step": 1220,
8
  "is_hyper_param_search": false,
 
10
  "is_world_process_zero": true,
11
  "log_history": [
12
  {
13
+ "epoch": 0.040983606557377046,
14
+ "grad_norm": 3.1041932106018066,
15
+ "learning_rate": 1.573770491803279e-06,
16
+ "loss": 0.526744384765625,
17
  "step": 25
18
  },
19
  {
20
+ "epoch": 0.08196721311475409,
21
+ "grad_norm": 4.816498279571533,
22
+ "learning_rate": 3.213114754098361e-06,
23
+ "loss": 0.46380462646484377,
24
  "step": 50
25
  },
26
  {
27
+ "epoch": 0.12295081967213115,
28
+ "grad_norm": 5.105863571166992,
29
+ "learning_rate": 4.8524590163934435e-06,
30
+ "loss": 0.3628129577636719,
31
  "step": 75
32
  },
33
  {
34
+ "epoch": 0.16393442622950818,
35
+ "grad_norm": 3.479628324508667,
36
+ "learning_rate": 6.491803278688526e-06,
37
+ "loss": 0.3321057891845703,
38
  "step": 100
39
  },
40
  {
41
+ "epoch": 0.20491803278688525,
42
+ "grad_norm": 5.271286487579346,
43
+ "learning_rate": 8.131147540983607e-06,
44
+ "loss": 0.29200584411621094,
45
  "step": 125
46
  },
47
  {
48
+ "epoch": 0.2459016393442623,
49
+ "grad_norm": 2.9103591442108154,
50
+ "learning_rate": 9.770491803278689e-06,
51
+ "loss": 0.20765865325927735,
52
  "step": 150
53
  },
54
  {
55
+ "epoch": 0.28688524590163933,
56
+ "grad_norm": 5.441774845123291,
57
+ "learning_rate": 1.1409836065573771e-05,
58
+ "loss": 0.22107566833496095,
59
  "step": 175
60
  },
61
  {
62
+ "epoch": 0.32786885245901637,
63
+ "grad_norm": 17.181110382080078,
64
+ "learning_rate": 1.3049180327868853e-05,
65
+ "loss": 0.22224016189575196,
66
  "step": 200
67
  },
68
  {
69
+ "epoch": 0.36885245901639346,
70
+ "grad_norm": 5.543371677398682,
71
+ "learning_rate": 1.4688524590163935e-05,
72
+ "loss": 0.20201002120971678,
73
  "step": 225
74
  },
75
  {
76
+ "epoch": 0.4098360655737705,
77
+ "grad_norm": 5.812751770019531,
78
+ "learning_rate": 1.6327868852459016e-05,
79
+ "loss": 0.2299608612060547,
80
  "step": 250
81
  },
82
  {
83
+ "epoch": 0.45081967213114754,
84
+ "grad_norm": 2.845670461654663,
85
+ "learning_rate": 1.79672131147541e-05,
86
+ "loss": 0.21673511505126952,
87
  "step": 275
88
  },
89
  {
90
+ "epoch": 0.4918032786885246,
91
+ "grad_norm": 18.98938751220703,
92
+ "learning_rate": 1.9606557377049183e-05,
93
+ "loss": 0.21523273468017579,
94
  "step": 300
95
  },
96
  {
97
+ "epoch": 0.5327868852459017,
98
+ "grad_norm": 5.82402229309082,
99
+ "learning_rate": 1.9861566484517306e-05,
100
+ "loss": 0.15432929039001464,
101
  "step": 325
102
  },
103
  {
104
+ "epoch": 0.5737704918032787,
105
+ "grad_norm": 0.48278993368148804,
106
+ "learning_rate": 1.9679417122040073e-05,
107
+ "loss": 0.16572830200195313,
108
  "step": 350
109
  },
110
  {
111
+ "epoch": 0.6147540983606558,
112
+ "grad_norm": 5.946134567260742,
113
+ "learning_rate": 1.9497267759562843e-05,
114
+ "loss": 0.17543071746826172,
115
  "step": 375
116
  },
117
  {
118
+ "epoch": 0.6557377049180327,
119
+ "grad_norm": 13.494661331176758,
120
+ "learning_rate": 1.9315118397085614e-05,
121
+ "loss": 0.15327459335327148,
122
  "step": 400
123
  },
124
  {
125
+ "epoch": 0.6967213114754098,
126
+ "grad_norm": 9.058329582214355,
127
+ "learning_rate": 1.913296903460838e-05,
128
+ "loss": 0.19322505950927735,
129
  "step": 425
130
  },
131
  {
132
+ "epoch": 0.7377049180327869,
133
+ "grad_norm": 9.420262336730957,
134
+ "learning_rate": 1.895081967213115e-05,
135
+ "loss": 0.18702877044677735,
136
  "step": 450
137
  },
138
  {
139
+ "epoch": 0.7786885245901639,
140
+ "grad_norm": 0.29203546047210693,
141
+ "learning_rate": 1.8768670309653917e-05,
142
+ "loss": 0.13424430847167967,
143
  "step": 475
144
  },
145
  {
146
+ "epoch": 0.819672131147541,
147
+ "grad_norm": 5.464226245880127,
148
+ "learning_rate": 1.8586520947176687e-05,
149
+ "loss": 0.1403522300720215,
150
  "step": 500
151
  },
152
  {
153
+ "epoch": 0.860655737704918,
154
+ "grad_norm": 0.3305734395980835,
155
+ "learning_rate": 1.8404371584699454e-05,
156
+ "loss": 0.1768626403808594,
157
  "step": 525
158
  },
159
  {
160
+ "epoch": 0.9016393442622951,
161
+ "grad_norm": 6.791965007781982,
162
+ "learning_rate": 1.8222222222222224e-05,
163
+ "loss": 0.22005313873291016,
164
  "step": 550
165
  },
166
  {
167
+ "epoch": 0.9426229508196722,
168
+ "grad_norm": 3.971740484237671,
169
+ "learning_rate": 1.804007285974499e-05,
170
+ "loss": 0.17844413757324218,
171
  "step": 575
172
  },
173
  {
174
+ "epoch": 0.9836065573770492,
175
+ "grad_norm": 1.6787421703338623,
176
+ "learning_rate": 1.785792349726776e-05,
177
+ "loss": 0.16104537963867188,
178
  "step": 600
179
  },
180
  {
181
+ "epoch": 1.0,
182
+ "eval_accuracy": 0.9376199616122841,
183
+ "eval_f1": 0.6933962264150944,
184
+ "eval_loss": 0.17767289280891418,
185
+ "eval_precision": 0.6533333333333333,
186
+ "eval_recall": 0.7386934673366834,
187
+ "eval_roc_auc": 0.9544433040534235,
188
+ "eval_runtime": 8.7237,
189
+ "eval_samples_per_second": 238.89,
190
+ "eval_steps_per_second": 7.566,
191
  "step": 610
192
  },
193
  {
194
+ "epoch": 1.0245901639344261,
195
+ "grad_norm": 5.573081016540527,
196
+ "learning_rate": 1.7675774134790528e-05,
197
+ "loss": 0.10441858291625977,
198
  "step": 625
199
  },
200
  {
201
+ "epoch": 1.0655737704918034,
202
+ "grad_norm": 0.11036327481269836,
203
+ "learning_rate": 1.7493624772313298e-05,
204
+ "loss": 0.16641632080078125,
205
  "step": 650
206
  },
207
  {
208
+ "epoch": 1.1065573770491803,
209
+ "grad_norm": 6.172032833099365,
210
+ "learning_rate": 1.731147540983607e-05,
211
+ "loss": 0.14472474098205568,
212
  "step": 675
213
  },
214
  {
215
+ "epoch": 1.1475409836065573,
216
+ "grad_norm": 5.863953113555908,
217
+ "learning_rate": 1.7129326047358835e-05,
218
+ "loss": 0.1113892936706543,
219
  "step": 700
220
  },
221
  {
222
+ "epoch": 1.1885245901639343,
223
+ "grad_norm": 0.25357893109321594,
224
+ "learning_rate": 1.6947176684881602e-05,
225
+ "loss": 0.12171897888183594,
226
  "step": 725
227
  },
228
  {
229
+ "epoch": 1.2295081967213115,
230
+ "grad_norm": 4.559972763061523,
231
+ "learning_rate": 1.6765027322404372e-05,
232
+ "loss": 0.08190732002258301,
233
  "step": 750
234
  },
235
  {
236
+ "epoch": 1.2704918032786885,
237
+ "grad_norm": 5.443191051483154,
238
+ "learning_rate": 1.6582877959927142e-05,
239
+ "loss": 0.06097976684570312,
240
  "step": 775
241
  },
242
  {
243
+ "epoch": 1.3114754098360657,
244
+ "grad_norm": 7.672996997833252,
245
+ "learning_rate": 1.6400728597449912e-05,
246
+ "loss": 0.09989359855651855,
247
  "step": 800
248
  },
249
  {
250
+ "epoch": 1.3524590163934427,
251
+ "grad_norm": 0.30564695596694946,
252
+ "learning_rate": 1.621857923497268e-05,
253
+ "loss": 0.08488804817199708,
254
  "step": 825
255
  },
256
  {
257
+ "epoch": 1.3934426229508197,
258
+ "grad_norm": 0.5056689977645874,
259
+ "learning_rate": 1.6036429872495446e-05,
260
+ "loss": 0.15551289558410644,
261
  "step": 850
262
  },
263
  {
264
+ "epoch": 1.4344262295081966,
265
+ "grad_norm": 0.06402106583118439,
266
+ "learning_rate": 1.5854280510018216e-05,
267
+ "loss": 0.1194021987915039,
268
  "step": 875
269
  },
270
  {
271
+ "epoch": 1.4754098360655736,
272
+ "grad_norm": 0.8240995407104492,
273
+ "learning_rate": 1.5672131147540986e-05,
274
+ "loss": 0.11723342895507813,
275
  "step": 900
276
  },
277
  {
278
+ "epoch": 1.5163934426229508,
279
+ "grad_norm": 0.07588805258274078,
280
+ "learning_rate": 1.5489981785063753e-05,
281
+ "loss": 0.12805092811584473,
282
  "step": 925
283
  },
284
  {
285
+ "epoch": 1.5573770491803278,
286
+ "grad_norm": 29.323896408081055,
287
+ "learning_rate": 1.5307832422586523e-05,
288
+ "loss": 0.15534672737121583,
289
  "step": 950
290
  },
291
  {
292
+ "epoch": 1.598360655737705,
293
+ "grad_norm": 11.02952766418457,
294
+ "learning_rate": 1.512568306010929e-05,
295
+ "loss": 0.1811429214477539,
296
  "step": 975
297
  },
298
  {
299
+ "epoch": 1.639344262295082,
300
+ "grad_norm": 0.9738644957542419,
301
+ "learning_rate": 1.494353369763206e-05,
302
+ "loss": 0.1270007610321045,
303
  "step": 1000
304
  },
305
  {
306
+ "epoch": 1.680327868852459,
307
+ "grad_norm": 2.311349630355835,
308
+ "learning_rate": 1.4761384335154829e-05,
309
+ "loss": 0.1843573570251465,
310
  "step": 1025
311
  },
312
  {
313
+ "epoch": 1.721311475409836,
314
+ "grad_norm": 7.105762004852295,
315
+ "learning_rate": 1.4579234972677595e-05,
316
+ "loss": 0.15470240592956544,
317
  "step": 1050
318
  },
319
  {
320
+ "epoch": 1.762295081967213,
321
+ "grad_norm": 9.120081901550293,
322
+ "learning_rate": 1.4397085610200366e-05,
323
+ "loss": 0.1170622444152832,
324
  "step": 1075
325
  },
326
  {
327
+ "epoch": 1.8032786885245902,
328
+ "grad_norm": 0.09794076532125473,
329
+ "learning_rate": 1.4214936247723134e-05,
330
+ "loss": 0.14307721138000487,
331
  "step": 1100
332
  },
333
  {
334
+ "epoch": 1.8442622950819674,
335
+ "grad_norm": 10.330300331115723,
336
+ "learning_rate": 1.4032786885245904e-05,
337
+ "loss": 0.15037315368652343,
338
  "step": 1125
339
  },
340
  {
341
+ "epoch": 1.8852459016393444,
342
+ "grad_norm": 0.7867186069488525,
343
+ "learning_rate": 1.3850637522768671e-05,
344
+ "loss": 0.10941274642944336,
345
  "step": 1150
346
  },
347
  {
348
+ "epoch": 1.9262295081967213,
349
+ "grad_norm": 7.952847003936768,
350
+ "learning_rate": 1.366848816029144e-05,
351
+ "loss": 0.1380799674987793,
352
  "step": 1175
353
  },
354
  {
355
+ "epoch": 1.9672131147540983,
356
+ "grad_norm": 11.602045059204102,
357
+ "learning_rate": 1.348633879781421e-05,
358
+ "loss": 0.07221244812011719,
359
  "step": 1200
360
  },
361
  {
362
+ "epoch": 2.0,
363
+ "eval_accuracy": 0.935700575815739,
364
+ "eval_f1": 0.6854460093896714,
365
+ "eval_loss": 0.24323046207427979,
366
+ "eval_precision": 0.6431718061674009,
367
+ "eval_recall": 0.7336683417085427,
368
+ "eval_roc_auc": 0.9480599282886581,
369
+ "eval_runtime": 8.6243,
370
+ "eval_samples_per_second": 241.642,
371
+ "eval_steps_per_second": 7.653,
372
  "step": 1220
373
  }
374
  ],
375
  "logging_steps": 25,
376
+ "max_steps": 3050,
377
  "num_input_tokens_seen": 0,
378
  "num_train_epochs": 5,
379
  "save_steps": 500,
 
398
  "attributes": {}
399
  }
400
  },
401
+ "total_flos": 2566122122926080.0,
402
  "train_batch_size": 16,
403
  "trial_name": null,
404
  "trial_params": null
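Reviewer note: the new `log_history` is consistent with 610 optimizer steps per epoch (`max_steps` 3050 divided by `num_train_epochs` 5), which is why the evaluation entries now fall at steps 610 and 1220. The sketch below reproduces the logged `epoch` and `learning_rate` values. The peak learning rate (2e-5), the 305 warmup steps, the linear warmup and decay shape, and the one-step offset between the logged step and the scheduler step are all assumptions inferred from the numbers in this diff; none of them is stated in the file.

```python
MAX_STEPS = 3050         # "max_steps" in the new trainer_state.json
NUM_TRAIN_EPOCHS = 5     # "num_train_epochs"
STEPS_PER_EPOCH = MAX_STEPS // NUM_TRAIN_EPOCHS  # 610

# Assumptions inferred from the logged values, not read from the file.
PEAK_LR = 2e-5
WARMUP_STEPS = 305


def epoch_at(step):
    """Fractional epoch reached after `step` optimizer steps."""
    return step / STEPS_PER_EPOCH


def learning_rate_at(logged_step):
    """Linear warmup followed by linear decay to zero at MAX_STEPS.

    Assumption: the value logged at step N is the rate used for that
    step, i.e. the scheduler had been advanced N - 1 times.
    """
    s = logged_step - 1
    if s < WARMUP_STEPS:
        return PEAK_LR * s / WARMUP_STEPS
    return PEAK_LR * (MAX_STEPS - s) / (MAX_STEPS - WARMUP_STEPS)


for step in (25, 325, 1200):
    print(step, epoch_at(step), learning_rate_at(step))
```

If these assumptions hold, the change in `max_steps` from 1525 to 3050 with the same `train_batch_size` suggests the training set roughly doubled between the two runs, which would also explain why step 1220 now corresponds to epoch 2.0 instead of 4.0.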
transformer/checkpoint-1220/training_args.bin CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:3c60366894b25ead0379e8d97e61f1123e1ad4786f5e41a8bc70f2d7bc8901f5
3
- size 5329
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:26170ff2d75562c83f88182ac9301ad3566752c384b38d15219d8c3352efbebb
3
+ size 5201
transformer/checkpoint-1830/config.json ADDED
@@ -0,0 +1,39 @@
1
+ {
2
+ "add_cross_attention": false,
3
+ "architectures": [
4
+ "XLMRobertaForSequenceClassification"
5
+ ],
6
+ "attention_probs_dropout_prob": 0.1,
7
+ "bos_token_id": 0,
8
+ "classifier_dropout": null,
9
+ "dtype": "float32",
10
+ "eos_token_id": 2,
11
+ "hidden_act": "gelu",
12
+ "hidden_dropout_prob": 0.1,
13
+ "hidden_size": 768,
14
+ "id2label": {
15
+ "0": "NOT_RELEVANT",
16
+ "1": "RELEVANT"
17
+ },
18
+ "initializer_range": 0.02,
19
+ "intermediate_size": 3072,
20
+ "is_decoder": false,
21
+ "label2id": {
22
+ "NOT_RELEVANT": 0,
23
+ "RELEVANT": 1
24
+ },
25
+ "layer_norm_eps": 1e-05,
26
+ "max_position_embeddings": 514,
27
+ "model_type": "xlm-roberta",
28
+ "num_attention_heads": 12,
29
+ "num_hidden_layers": 12,
30
+ "output_past": true,
31
+ "pad_token_id": 1,
32
+ "position_embedding_type": "absolute",
33
+ "problem_type": "single_label_classification",
34
+ "tie_word_embeddings": true,
35
+ "transformers_version": "5.10.2",
36
+ "type_vocab_size": 1,
37
+ "use_cache": false,
38
+ "vocab_size": 250002
39
+ }
transformer/checkpoint-1830/model.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:f2d36571bab40691e34928937e3b6273dc9f93cd7ff5063c0b0f56f6c81c617e
3
+ size 1112205008
transformer/checkpoint-1830/optimizer.pt ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:cd40f5b113902e4d4d3cafce7879adf4a0cd4adc135b994b1df34ea3f00be36d
3
+ size 2224532875
transformer/checkpoint-1830/rng_state.pth ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:396753321c3a3c714b0c16546663b596b2b7dc320914daeff4f4502205877de5
3
+ size 14645
transformer/checkpoint-1830/scaler.pt ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:8e491b7c283e1accd9af8c8149230ddc3a5c9734dc7607b730682b67947f9d5a
3
+ size 1383
transformer/checkpoint-1830/scheduler.pt ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:54b2dfae694f5d9051517fe489620541ef50f0d13ff4da58515c39d36e6dbf34
3
+ size 1465
transformer/checkpoint-1830/tokenizer.json ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:cc02d42fb2a10276563109e2287cc0dbe6b595d5b3b3401c7cfeffc0b7e20270
3
+ size 17098351
transformer/checkpoint-1830/tokenizer_config.json ADDED
@@ -0,0 +1,15 @@
1
+ {
2
+ "add_prefix_space": true,
3
+ "backend": "tokenizers",
4
+ "bos_token": "<s>",
5
+ "cls_token": "<s>",
6
+ "eos_token": "</s>",
7
+ "is_local": false,
8
+ "local_files_only": false,
9
+ "mask_token": "<mask>",
10
+ "model_max_length": 512,
11
+ "pad_token": "<pad>",
12
+ "sep_token": "</s>",
13
+ "tokenizer_class": "XLMRobertaTokenizer",
14
+ "unk_token": "<unk>"
15
+ }
transformer/checkpoint-1830/trainer_state.json ADDED
@@ -0,0 +1,593 @@
1
+ {
2
+ "best_global_step": 610,
3
+ "best_metric": 0.6933962264150944,
4
+ "best_model_checkpoint": "/content/agri-utilization-classifier/transformer/checkpoint-610",
5
+ "epoch": 3.0,
6
+ "eval_steps": 500,
7
+ "global_step": 1830,
8
+ "is_hyper_param_search": false,
9
+ "is_local_process_zero": true,
10
+ "is_world_process_zero": true,
11
+ "log_history": [
12
+ {
13
+ "epoch": 0.040983606557377046,
14
+ "grad_norm": 3.1041932106018066,
15
+ "learning_rate": 1.573770491803279e-06,
16
+ "loss": 0.526744384765625,
17
+ "step": 25
18
+ },
19
+ {
20
+ "epoch": 0.08196721311475409,
21
+ "grad_norm": 4.816498279571533,
22
+ "learning_rate": 3.213114754098361e-06,
23
+ "loss": 0.46380462646484377,
24
+ "step": 50
25
+ },
26
+ {
27
+ "epoch": 0.12295081967213115,
28
+ "grad_norm": 5.105863571166992,
29
+ "learning_rate": 4.8524590163934435e-06,
30
+ "loss": 0.3628129577636719,
31
+ "step": 75
32
+ },
33
+ {
34
+ "epoch": 0.16393442622950818,
35
+ "grad_norm": 3.479628324508667,
36
+ "learning_rate": 6.491803278688526e-06,
37
+ "loss": 0.3321057891845703,
38
+ "step": 100
39
+ },
40
+ {
41
+ "epoch": 0.20491803278688525,
42
+ "grad_norm": 5.271286487579346,
43
+ "learning_rate": 8.131147540983607e-06,
44
+ "loss": 0.29200584411621094,
45
+ "step": 125
46
+ },
47
+ {
48
+ "epoch": 0.2459016393442623,
49
+ "grad_norm": 2.9103591442108154,
50
+ "learning_rate": 9.770491803278689e-06,
51
+ "loss": 0.20765865325927735,
52
+ "step": 150
53
+ },
54
+ {
55
+ "epoch": 0.28688524590163933,
56
+ "grad_norm": 5.441774845123291,
57
+ "learning_rate": 1.1409836065573771e-05,
58
+ "loss": 0.22107566833496095,
59
+ "step": 175
60
+ },
61
+ {
62
+ "epoch": 0.32786885245901637,
63
+ "grad_norm": 17.181110382080078,
64
+ "learning_rate": 1.3049180327868853e-05,
65
+ "loss": 0.22224016189575196,
66
+ "step": 200
67
+ },
68
+ {
69
+ "epoch": 0.36885245901639346,
70
+ "grad_norm": 5.543371677398682,
71
+ "learning_rate": 1.4688524590163935e-05,
72
+ "loss": 0.20201002120971678,
73
+ "step": 225
74
+ },
75
+ {
76
+ "epoch": 0.4098360655737705,
77
+ "grad_norm": 5.812751770019531,
78
+ "learning_rate": 1.6327868852459016e-05,
79
+ "loss": 0.2299608612060547,
80
+ "step": 250
81
+ },
82
+ {
83
+ "epoch": 0.45081967213114754,
84
+ "grad_norm": 2.845670461654663,
85
+ "learning_rate": 1.79672131147541e-05,
86
+ "loss": 0.21673511505126952,
87
+ "step": 275
88
+ },
89
+ {
90
+ "epoch": 0.4918032786885246,
91
+ "grad_norm": 18.98938751220703,
92
+ "learning_rate": 1.9606557377049183e-05,
93
+ "loss": 0.21523273468017579,
94
+ "step": 300
95
+ },
96
+ {
97
+ "epoch": 0.5327868852459017,
98
+ "grad_norm": 5.82402229309082,
99
+ "learning_rate": 1.9861566484517306e-05,
100
+ "loss": 0.15432929039001464,
101
+ "step": 325
102
+ },
103
+ {
104
+ "epoch": 0.5737704918032787,
105
+ "grad_norm": 0.48278993368148804,
106
+ "learning_rate": 1.9679417122040073e-05,
107
+ "loss": 0.16572830200195313,
108
+ "step": 350
109
+ },
110
+ {
111
+ "epoch": 0.6147540983606558,
112
+ "grad_norm": 5.946134567260742,
113
+ "learning_rate": 1.9497267759562843e-05,
114
+ "loss": 0.17543071746826172,
115
+ "step": 375
116
+ },
117
+ {
118
+ "epoch": 0.6557377049180327,
119
+ "grad_norm": 13.494661331176758,
120
+ "learning_rate": 1.9315118397085614e-05,
121
+ "loss": 0.15327459335327148,
122
+ "step": 400
123
+ },
124
+ {
125
+ "epoch": 0.6967213114754098,
126
+ "grad_norm": 9.058329582214355,
127
+ "learning_rate": 1.913296903460838e-05,
128
+ "loss": 0.19322505950927735,
129
+ "step": 425
130
+ },
131
+ {
132
+ "epoch": 0.7377049180327869,
133
+ "grad_norm": 9.420262336730957,
134
+ "learning_rate": 1.895081967213115e-05,
135
+ "loss": 0.18702877044677735,
136
+ "step": 450
137
+ },
138
+ {
139
+ "epoch": 0.7786885245901639,
140
+ "grad_norm": 0.29203546047210693,
141
+ "learning_rate": 1.8768670309653917e-05,
142
+ "loss": 0.13424430847167967,
143
+ "step": 475
144
+ },
145
+ {
146
+ "epoch": 0.819672131147541,
147
+ "grad_norm": 5.464226245880127,
148
+ "learning_rate": 1.8586520947176687e-05,
149
+ "loss": 0.1403522300720215,
150
+ "step": 500
151
+ },
152
+ {
153
+ "epoch": 0.860655737704918,
154
+ "grad_norm": 0.3305734395980835,
155
+ "learning_rate": 1.8404371584699454e-05,
156
+ "loss": 0.1768626403808594,
157
+ "step": 525
158
+ },
159
+ {
160
+ "epoch": 0.9016393442622951,
161
+ "grad_norm": 6.791965007781982,
162
+ "learning_rate": 1.8222222222222224e-05,
163
+ "loss": 0.22005313873291016,
164
+ "step": 550
165
+ },
166
+ {
167
+ "epoch": 0.9426229508196722,
168
+ "grad_norm": 3.971740484237671,
169
+ "learning_rate": 1.804007285974499e-05,
170
+ "loss": 0.17844413757324218,
171
+ "step": 575
172
+ },
173
+ {
174
+ "epoch": 0.9836065573770492,
175
+ "grad_norm": 1.6787421703338623,
176
+ "learning_rate": 1.785792349726776e-05,
177
+ "loss": 0.16104537963867188,
178
+ "step": 600
179
+ },
180
+ {
181
+ "epoch": 1.0,
182
+ "eval_accuracy": 0.9376199616122841,
183
+ "eval_f1": 0.6933962264150944,
184
+ "eval_loss": 0.17767289280891418,
185
+ "eval_precision": 0.6533333333333333,
186
+ "eval_recall": 0.7386934673366834,
187
+ "eval_roc_auc": 0.9544433040534235,
188
+ "eval_runtime": 8.7237,
189
+ "eval_samples_per_second": 238.89,
190
+ "eval_steps_per_second": 7.566,
191
+ "step": 610
192
+ },
193
+ {
194
+ "epoch": 1.0245901639344261,
195
+ "grad_norm": 5.573081016540527,
196
+ "learning_rate": 1.7675774134790528e-05,
197
+ "loss": 0.10441858291625977,
198
+ "step": 625
199
+ },
200
+ {
201
+ "epoch": 1.0655737704918034,
202
+ "grad_norm": 0.11036327481269836,
203
+ "learning_rate": 1.7493624772313298e-05,
204
+ "loss": 0.16641632080078125,
205
+ "step": 650
206
+ },
207
+ {
208
+ "epoch": 1.1065573770491803,
209
+ "grad_norm": 6.172032833099365,
210
+ "learning_rate": 1.731147540983607e-05,
211
+ "loss": 0.14472474098205568,
212
+ "step": 675
213
+ },
214
+ {
215
+ "epoch": 1.1475409836065573,
216
+ "grad_norm": 5.863953113555908,
217
+ "learning_rate": 1.7129326047358835e-05,
218
+ "loss": 0.1113892936706543,
219
+ "step": 700
220
+ },
221
+ {
222
+ "epoch": 1.1885245901639343,
223
+ "grad_norm": 0.25357893109321594,
224
+ "learning_rate": 1.6947176684881602e-05,
225
+ "loss": 0.12171897888183594,
226
+ "step": 725
227
+ },
228
+ {
229
+ "epoch": 1.2295081967213115,
230
+ "grad_norm": 4.559972763061523,
231
+ "learning_rate": 1.6765027322404372e-05,
232
+ "loss": 0.08190732002258301,
233
+ "step": 750
234
+ },
235
+ {
236
+ "epoch": 1.2704918032786885,
237
+ "grad_norm": 5.443191051483154,
238
+ "learning_rate": 1.6582877959927142e-05,
239
+ "loss": 0.06097976684570312,
240
+ "step": 775
241
+ },
242
+ {
243
+ "epoch": 1.3114754098360657,
244
+ "grad_norm": 7.672996997833252,
245
+ "learning_rate": 1.6400728597449912e-05,
246
+ "loss": 0.09989359855651855,
247
+ "step": 800
248
+ },
249
+ {
250
+ "epoch": 1.3524590163934427,
251
+ "grad_norm": 0.30564695596694946,
252
+ "learning_rate": 1.621857923497268e-05,
253
+ "loss": 0.08488804817199708,
254
+ "step": 825
255
+ },
256
+ {
257
+ "epoch": 1.3934426229508197,
258
+ "grad_norm": 0.5056689977645874,
259
+ "learning_rate": 1.6036429872495446e-05,
260
+ "loss": 0.15551289558410644,
261
+ "step": 850
262
+ },
263
+ {
264
+ "epoch": 1.4344262295081966,
265
+ "grad_norm": 0.06402106583118439,
266
+ "learning_rate": 1.5854280510018216e-05,
267
+ "loss": 0.1194021987915039,
268
+ "step": 875
269
+ },
270
+ {
271
+ "epoch": 1.4754098360655736,
272
+ "grad_norm": 0.8240995407104492,
273
+ "learning_rate": 1.5672131147540986e-05,
274
+ "loss": 0.11723342895507813,
275
+ "step": 900
276
+ },
277
+ {
278
+ "epoch": 1.5163934426229508,
279
+ "grad_norm": 0.07588805258274078,
280
+ "learning_rate": 1.5489981785063753e-05,
281
+ "loss": 0.12805092811584473,
282
+ "step": 925
283
+ },
284
+ {
285
+ "epoch": 1.5573770491803278,
286
+ "grad_norm": 29.323896408081055,
287
+ "learning_rate": 1.5307832422586523e-05,
288
+ "loss": 0.15534672737121583,
289
+ "step": 950
290
+ },
291
+ {
292
+ "epoch": 1.598360655737705,
293
+ "grad_norm": 11.02952766418457,
294
+ "learning_rate": 1.512568306010929e-05,
295
+ "loss": 0.1811429214477539,
296
+ "step": 975
297
+ },
298
+ {
299
+ "epoch": 1.639344262295082,
300
+ "grad_norm": 0.9738644957542419,
301
+ "learning_rate": 1.494353369763206e-05,
302
+ "loss": 0.1270007610321045,
303
+ "step": 1000
304
+ },
305
+ {
306
+ "epoch": 1.680327868852459,
307
+ "grad_norm": 2.311349630355835,
308
+ "learning_rate": 1.4761384335154829e-05,
309
+ "loss": 0.1843573570251465,
310
+ "step": 1025
311
+ },
312
+ {
313
+ "epoch": 1.721311475409836,
314
+ "grad_norm": 7.105762004852295,
315
+ "learning_rate": 1.4579234972677595e-05,
316
+ "loss": 0.15470240592956544,
317
+ "step": 1050
318
+ },
319
+ {
320
+ "epoch": 1.762295081967213,
321
+ "grad_norm": 9.120081901550293,
322
+ "learning_rate": 1.4397085610200366e-05,
323
+ "loss": 0.1170622444152832,
324
+ "step": 1075
325
+ },
326
+ {
327
+ "epoch": 1.8032786885245902,
328
+ "grad_norm": 0.09794076532125473,
329
+ "learning_rate": 1.4214936247723134e-05,
330
+ "loss": 0.14307721138000487,
331
+ "step": 1100
332
+ },
333
+ {
334
+ "epoch": 1.8442622950819674,
335
+ "grad_norm": 10.330300331115723,
336
+ "learning_rate": 1.4032786885245904e-05,
337
+ "loss": 0.15037315368652343,
338
+ "step": 1125
339
+ },
340
+ {
341
+ "epoch": 1.8852459016393444,
342
+ "grad_norm": 0.7867186069488525,
343
+ "learning_rate": 1.3850637522768671e-05,
344
+ "loss": 0.10941274642944336,
345
+ "step": 1150
346
+ },
347
+ {
348
+ "epoch": 1.9262295081967213,
349
+ "grad_norm": 7.952847003936768,
350
+ "learning_rate": 1.366848816029144e-05,
351
+ "loss": 0.1380799674987793,
352
+ "step": 1175
353
+ },
354
+ {
355
+ "epoch": 1.9672131147540983,
356
+ "grad_norm": 11.602045059204102,
357
+ "learning_rate": 1.348633879781421e-05,
358
+ "loss": 0.07221244812011719,
359
+ "step": 1200
360
+ },
361
+ {
362
+ "epoch": 2.0,
363
+ "eval_accuracy": 0.935700575815739,
364
+ "eval_f1": 0.6854460093896714,
365
+ "eval_loss": 0.24323046207427979,
366
+ "eval_precision": 0.6431718061674009,
367
+ "eval_recall": 0.7336683417085427,
368
+ "eval_roc_auc": 0.9480599282886581,
369
+ "eval_runtime": 8.6243,
370
+ "eval_samples_per_second": 241.642,
371
+ "eval_steps_per_second": 7.653,
372
+ "step": 1220
373
+ },
374
+ {
375
+ "epoch": 2.0081967213114753,
376
+ "grad_norm": 4.202751636505127,
377
+ "learning_rate": 1.3304189435336978e-05,
378
+ "loss": 0.16997112274169923,
379
+ "step": 1225
380
+ },
381
+ {
382
+ "epoch": 2.0491803278688523,
383
+ "grad_norm": 0.40695664286613464,
384
+ "learning_rate": 1.3122040072859745e-05,
385
+ "loss": 0.0950162410736084,
386
+ "step": 1250
387
+ },
388
+ {
389
+ "epoch": 2.0901639344262297,
390
+ "grad_norm": 2.8617703914642334,
391
+ "learning_rate": 1.2939890710382515e-05,
392
+ "loss": 0.10194917678833008,
393
+ "step": 1275
394
+ },
395
+ {
396
+ "epoch": 2.1311475409836067,
397
+ "grad_norm": 0.08977202326059341,
398
+ "learning_rate": 1.2757741347905283e-05,
399
+ "loss": 0.0662102746963501,
400
+ "step": 1300
401
+ },
402
+ {
403
+ "epoch": 2.1721311475409837,
404
+ "grad_norm": 0.10930905491113663,
405
+ "learning_rate": 1.2575591985428054e-05,
406
+ "loss": 0.07460547924041748,
407
+ "step": 1325
408
+ },
409
+ {
410
+ "epoch": 2.2131147540983607,
411
+ "grad_norm": 11.703680038452148,
412
+ "learning_rate": 1.239344262295082e-05,
413
+ "loss": 0.11810153961181641,
414
+ "step": 1350
415
+ },
416
+ {
417
+ "epoch": 2.2540983606557377,
418
+ "grad_norm": 3.0462427139282227,
419
+ "learning_rate": 1.2211293260473589e-05,
420
+ "loss": 0.11896968841552734,
421
+ "step": 1375
422
+ },
423
+ {
424
+ "epoch": 2.2950819672131146,
425
+ "grad_norm": 11.854302406311035,
426
+ "learning_rate": 1.2029143897996359e-05,
427
+ "loss": 0.10016871452331542,
428
+ "step": 1400
429
+ },
430
+ {
431
+ "epoch": 2.3360655737704916,
432
+ "grad_norm": 5.737449645996094,
433
+ "learning_rate": 1.1846994535519127e-05,
434
+ "loss": 0.12308047294616699,
435
+ "step": 1425
436
+ },
437
+ {
438
+ "epoch": 2.3770491803278686,
439
+ "grad_norm": 3.3703978061676025,
440
+ "learning_rate": 1.1664845173041894e-05,
441
+ "loss": 0.10508214950561523,
442
+ "step": 1450
443
+ },
444
+ {
445
+ "epoch": 2.418032786885246,
446
+ "grad_norm": 15.429414749145508,
447
+ "learning_rate": 1.1482695810564664e-05,
448
+ "loss": 0.05040365219116211,
449
+ "step": 1475
450
+ },
451
+ {
452
+ "epoch": 2.459016393442623,
453
+ "grad_norm": 14.725594520568848,
454
+ "learning_rate": 1.1300546448087433e-05,
455
+ "loss": 0.13564690589904785,
456
+ "step": 1500
457
+ },
458
+ {
459
+ "epoch": 2.5,
460
+ "grad_norm": 0.18430812656879425,
461
+ "learning_rate": 1.1118397085610201e-05,
462
+ "loss": 0.08556642532348632,
463
+ "step": 1525
464
+ },
465
+ {
466
+ "epoch": 2.540983606557377,
467
+ "grad_norm": 0.062474410980939865,
468
+ "learning_rate": 1.0936247723132968e-05,
469
+ "loss": 0.08463016510009766,
470
+ "step": 1550
471
+ },
472
+ {
473
+ "epoch": 2.581967213114754,
474
+ "grad_norm": 0.20876172184944153,
475
+ "learning_rate": 1.0754098360655738e-05,
476
+ "loss": 0.1799280548095703,
477
+ "step": 1575
478
+ },
479
+ {
480
+ "epoch": 2.6229508196721314,
481
+ "grad_norm": 6.137902736663818,
482
+ "learning_rate": 1.0571948998178507e-05,
483
+ "loss": 0.12623875617980956,
484
+ "step": 1600
485
+ },
486
+ {
487
+ "epoch": 2.663934426229508,
488
+ "grad_norm": 17.05773162841797,
489
+ "learning_rate": 1.0389799635701277e-05,
490
+ "loss": 0.09255536079406738,
491
+ "step": 1625
492
+ },
493
+ {
494
+ "epoch": 2.7049180327868854,
495
+ "grad_norm": 1.2578394412994385,
496
+ "learning_rate": 1.0207650273224044e-05,
497
+ "loss": 0.0888406753540039,
498
+ "step": 1650
499
+ },
500
+ {
501
+ "epoch": 2.7459016393442623,
502
+ "grad_norm": 15.845796585083008,
503
+ "learning_rate": 1.0025500910746812e-05,
504
+ "loss": 0.10764387130737305,
505
+ "step": 1675
506
+ },
507
+ {
508
+ "epoch": 2.7868852459016393,
509
+ "grad_norm": 12.588105201721191,
510
+ "learning_rate": 9.843351548269582e-06,
511
+ "loss": 0.12151247024536133,
512
+ "step": 1700
513
+ },
514
+ {
515
+ "epoch": 2.8278688524590163,
516
+ "grad_norm": 7.840113162994385,
517
+ "learning_rate": 9.66120218579235e-06,
518
+ "loss": 0.1787535858154297,
519
+ "step": 1725
520
+ },
521
+ {
522
+ "epoch": 2.8688524590163933,
523
+ "grad_norm": 5.082641124725342,
524
+ "learning_rate": 9.47905282331512e-06,
525
+ "loss": 0.07576138019561768,
526
+ "step": 1750
527
+ },
528
+ {
529
+ "epoch": 2.9098360655737707,
530
+ "grad_norm": 14.185094833374023,
531
+ "learning_rate": 9.296903460837888e-06,
532
+ "loss": 0.1186930274963379,
533
+ "step": 1775
534
+ },
535
+ {
536
+ "epoch": 2.9508196721311473,
537
+ "grad_norm": 31.60809898376465,
538
+ "learning_rate": 9.114754098360656e-06,
539
+ "loss": 0.10553586959838868,
540
+ "step": 1800
541
+ },
542
+ {
543
+ "epoch": 2.9918032786885247,
544
+ "grad_norm": 2.9622690677642822,
545
+ "learning_rate": 8.932604735883426e-06,
546
+ "loss": 0.054716997146606446,
547
+ "step": 1825
548
+ },
549
+ {
550
+ "epoch": 3.0,
551
+ "eval_accuracy": 0.9414587332053743,
552
+ "eval_f1": 0.6903553299492385,
553
+ "eval_loss": 0.24261149764060974,
554
+ "eval_precision": 0.6974358974358974,
555
+ "eval_recall": 0.6834170854271356,
556
+ "eval_roc_auc": 0.9521560054916492,
557
+ "eval_runtime": 8.6527,
558
+ "eval_samples_per_second": 240.849,
559
+ "eval_steps_per_second": 7.628,
560
+ "step": 1830
561
+ }
562
+ ],
563
+ "logging_steps": 25,
564
+ "max_steps": 3050,
565
+ "num_input_tokens_seen": 0,
566
+ "num_train_epochs": 5,
567
+ "save_steps": 500,
568
+ "stateful_callbacks": {
569
+ "EarlyStoppingCallback": {
570
+ "args": {
571
+ "early_stopping_patience": 2,
572
+ "early_stopping_threshold": 0.0
573
+ },
574
+ "attributes": {
575
+ "early_stopping_patience_counter": 2
576
+ }
577
+ },
578
+ "TrainerControl": {
579
+ "args": {
580
+ "should_epoch_stop": false,
581
+ "should_evaluate": false,
582
+ "should_log": false,
583
+ "should_save": true,
584
+ "should_training_stop": true
585
+ },
586
+ "attributes": {}
587
+ }
588
+ },
589
+ "total_flos": 3849183184389120.0,
590
+ "train_batch_size": 16,
591
+ "trial_name": null,
592
+ "trial_params": null
593
+ }
transformer/checkpoint-1830/training_args.bin ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:26170ff2d75562c83f88182ac9301ad3566752c384b38d15219d8c3352efbebb
3
+ size 5201
transformer/checkpoint-610/config.json CHANGED
@@ -32,7 +32,7 @@
32
  "position_embedding_type": "absolute",
33
  "problem_type": "single_label_classification",
34
  "tie_word_embeddings": true,
35
- "transformers_version": "5.9.0",
36
  "type_vocab_size": 1,
37
  "use_cache": false,
38
  "vocab_size": 250002
 
32
  "position_embedding_type": "absolute",
33
  "problem_type": "single_label_classification",
34
  "tie_word_embeddings": true,
35
+ "transformers_version": "5.10.2",
36
  "type_vocab_size": 1,
37
  "use_cache": false,
38
  "vocab_size": 250002
transformer/checkpoint-610/model.safetensors CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:8c2e8002a8b39d6b2b729d256b3d4cff3d522204ecb453b2bd5c433f9bd4944f
3
  size 1112205008
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:150736722d2137e368c215a6ded5ca83348b547235eed16e8939af8b89077765
3
  size 1112205008
transformer/checkpoint-610/optimizer.pt CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:36fd23804b528193a6fc5999a821ef8809fb31a6efd5c61fe763007795ad7dff
3
  size 2224532875
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:0e116ee1524d3a0af2639c7796fdc33da882323ef9133d8950b493efc3fdc234
3
  size 2224532875
transformer/checkpoint-610/rng_state.pth CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:fafd9c24dc9711309db2e4113a63ff7120e22ca104346dfb40523f00ae210f76
3
  size 14645
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:f79ebe41a952d58a414acbb1200490974edd9f70a35228ee802fd1f53139fa66
3
  size 14645
transformer/checkpoint-610/scaler.pt CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:50d9d499a5525a1f496c3b9a272dbba833f43becb5d780497724ade85d68372c
3
  size 1383
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:88e49d8c13a6eacfd8373ea54c57924180b7dfff65f37e60feb7ea51f503d158
3
  size 1383
transformer/checkpoint-610/scheduler.pt CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:d364349216cf58d042027a258508346e7afb967d9966c7e61a3b5de011c04767
3
  size 1465
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:3342a44ed126fda993f7f2d332ee7ff352916f58e467cf3c50397c3413bfcc8a
3
  size 1465
transformer/checkpoint-610/trainer_state.json CHANGED
@@ -1,8 +1,8 @@
1
  {
2
- "best_global_step": 305,
3
- "best_metric": 0.7721518987341772,
4
- "best_model_checkpoint": "/content/agri-utilization-classifier/transformer/checkpoint-305",
5
- "epoch": 2.0,
6
  "eval_steps": 500,
7
  "global_step": 610,
8
  "is_hyper_param_search": false,
@@ -10,202 +10,189 @@
10
  "is_world_process_zero": true,
11
  "log_history": [
12
  {
13
- "epoch": 0.08196721311475409,
14
- "grad_norm": 6.055062770843506,
15
- "learning_rate": 3.157894736842105e-06,
16
- "loss": 0.62972900390625,
17
  "step": 25
18
  },
19
  {
20
- "epoch": 0.16393442622950818,
21
- "grad_norm": 10.6914701461792,
22
- "learning_rate": 6.447368421052632e-06,
23
- "loss": 0.44850738525390627,
24
  "step": 50
25
  },
26
  {
27
- "epoch": 0.2459016393442623,
28
- "grad_norm": 6.670228481292725,
29
- "learning_rate": 9.736842105263159e-06,
30
- "loss": 0.3566379165649414,
31
  "step": 75
32
  },
33
  {
34
- "epoch": 0.32786885245901637,
35
- "grad_norm": 2.589911937713623,
36
- "learning_rate": 1.3026315789473684e-05,
37
- "loss": 0.2718839645385742,
38
  "step": 100
39
  },
40
  {
41
- "epoch": 0.4098360655737705,
42
- "grad_norm": 22.02676773071289,
43
- "learning_rate": 1.6315789473684213e-05,
44
- "loss": 0.1922766876220703,
45
  "step": 125
46
  },
47
  {
48
- "epoch": 0.4918032786885246,
49
- "grad_norm": 2.6362855434417725,
50
- "learning_rate": 1.960526315789474e-05,
51
- "loss": 0.1837622833251953,
52
  "step": 150
53
  },
54
  {
55
- "epoch": 0.5737704918032787,
56
- "grad_norm": 3.478484630584717,
57
- "learning_rate": 1.9679533867443555e-05,
58
- "loss": 0.18766048431396484,
59
  "step": 175
60
  },
61
  {
62
- "epoch": 0.6557377049180327,
63
- "grad_norm": 8.077605247497559,
64
- "learning_rate": 1.9315367807720323e-05,
65
- "loss": 0.23830581665039063,
66
  "step": 200
67
  },
68
  {
69
- "epoch": 0.7377049180327869,
70
- "grad_norm": 0.7427046298980713,
71
- "learning_rate": 1.8951201747997088e-05,
72
- "loss": 0.30742517471313474,
73
  "step": 225
74
  },
75
  {
76
- "epoch": 0.819672131147541,
77
- "grad_norm": 36.34975051879883,
78
- "learning_rate": 1.8587035688273852e-05,
79
- "loss": 0.22336017608642578,
80
  "step": 250
81
  },
82
  {
83
- "epoch": 0.9016393442622951,
84
- "grad_norm": 5.215510845184326,
85
- "learning_rate": 1.822286962855062e-05,
86
- "loss": 0.13779294967651368,
87
  "step": 275
88
  },
89
  {
90
- "epoch": 0.9836065573770492,
91
- "grad_norm": 3.551121950149536,
92
- "learning_rate": 1.7858703568827385e-05,
93
- "loss": 0.19200111389160157,
94
  "step": 300
95
  },
96
  {
97
- "epoch": 1.0,
98
- "eval_accuracy": 0.9631901840490797,
99
- "eval_f1": 0.7721518987341772,
100
- "eval_loss": 0.1292734444141388,
101
- "eval_precision": 0.7721518987341772,
102
- "eval_recall": 0.7721518987341772,
103
- "eval_roc_auc": 0.9563720589684741,
104
- "eval_runtime": 3.3396,
105
- "eval_samples_per_second": 292.853,
106
- "eval_steps_per_second": 9.283,
107
- "step": 305
108
- },
109
- {
110
- "epoch": 1.0655737704918034,
111
- "grad_norm": 0.5402449369430542,
112
- "learning_rate": 1.7494537509104153e-05,
113
- "loss": 0.1241053295135498,
114
  "step": 325
115
  },
116
  {
117
- "epoch": 1.1475409836065573,
118
- "grad_norm": 4.476892948150635,
119
- "learning_rate": 1.7130371449380918e-05,
120
- "loss": 0.20724605560302733,
121
  "step": 350
122
  },
123
  {
124
- "epoch": 1.2295081967213115,
125
- "grad_norm": 0.46729782223701477,
126
- "learning_rate": 1.6766205389657686e-05,
127
- "loss": 0.13567353248596192,
128
  "step": 375
129
  },
130
  {
131
- "epoch": 1.3114754098360657,
132
- "grad_norm": 0.1852118819952011,
133
- "learning_rate": 1.640203932993445e-05,
134
- "loss": 0.13295170783996582,
135
  "step": 400
136
  },
137
  {
138
- "epoch": 1.3934426229508197,
139
- "grad_norm": 1.2681413888931274,
140
- "learning_rate": 1.603787327021122e-05,
141
- "loss": 0.2027936363220215,
142
  "step": 425
143
  },
144
  {
145
- "epoch": 1.4754098360655736,
146
- "grad_norm": 7.484091281890869,
147
- "learning_rate": 1.5673707210487983e-05,
148
- "loss": 0.12364128112792969,
149
  "step": 450
150
  },
151
  {
152
- "epoch": 1.5573770491803278,
153
- "grad_norm": 0.46489500999450684,
154
- "learning_rate": 1.530954115076475e-05,
155
- "loss": 0.14407362937927246,
156
  "step": 475
157
  },
158
  {
159
- "epoch": 1.639344262295082,
160
- "grad_norm": 0.20967872440814972,
161
- "learning_rate": 1.4945375091041516e-05,
162
- "loss": 0.12458925247192383,
163
  "step": 500
164
  },
165
  {
166
- "epoch": 1.721311475409836,
167
- "grad_norm": 0.1643747240304947,
168
- "learning_rate": 1.4581209031318282e-05,
169
- "loss": 0.21631996154785157,
170
  "step": 525
171
  },
172
  {
173
- "epoch": 1.8032786885245902,
174
- "grad_norm": 7.073329448699951,
175
- "learning_rate": 1.4217042971595047e-05,
176
- "loss": 0.16043865203857421,
177
  "step": 550
178
  },
179
  {
180
- "epoch": 1.8852459016393444,
181
- "grad_norm": 1.744958758354187,
182
- "learning_rate": 1.3852876911871815e-05,
183
- "loss": 0.0966644287109375,
184
  "step": 575
185
  },
186
  {
187
- "epoch": 1.9672131147540983,
188
- "grad_norm": 12.79035472869873,
189
- "learning_rate": 1.3488710852148582e-05,
190
- "loss": 0.15884541511535644,
191
  "step": 600
192
  },
193
  {
194
- "epoch": 2.0,
195
- "eval_accuracy": 0.9611451942740287,
196
- "eval_f1": 0.7432432432432432,
197
- "eval_loss": 0.13287827372550964,
198
- "eval_precision": 0.7971014492753623,
199
- "eval_recall": 0.6962025316455697,
200
- "eval_roc_auc": 0.9594697343039381,
201
- "eval_runtime": 3.2739,
202
- "eval_samples_per_second": 298.727,
203
- "eval_steps_per_second": 9.469,
204
  "step": 610
205
  }
206
  ],
207
  "logging_steps": 25,
208
- "max_steps": 1525,
209
  "num_input_tokens_seen": 0,
210
  "num_train_epochs": 5,
211
  "save_steps": 500,
@@ -216,7 +203,7 @@
216
  "early_stopping_threshold": 0.0
217
  },
218
  "attributes": {
219
- "early_stopping_patience_counter": 1
220
  }
221
  },
222
  "TrainerControl": {
@@ -230,7 +217,7 @@
230
  "attributes": {}
231
  }
232
  },
233
- "total_flos": 1283192616990720.0,
234
  "train_batch_size": 16,
235
  "trial_name": null,
236
  "trial_params": null
 
1
  {
2
+ "best_global_step": 610,
3
+ "best_metric": 0.6933962264150944,
4
+ "best_model_checkpoint": "/content/agri-utilization-classifier/transformer/checkpoint-610",
5
+ "epoch": 1.0,
6
  "eval_steps": 500,
7
  "global_step": 610,
8
  "is_hyper_param_search": false,
 
10
  "is_world_process_zero": true,
11
  "log_history": [
12
  {
13
+ "epoch": 0.040983606557377046,
14
+ "grad_norm": 3.1041932106018066,
15
+ "learning_rate": 1.573770491803279e-06,
16
+ "loss": 0.526744384765625,
17
  "step": 25
18
  },
19
  {
20
+ "epoch": 0.08196721311475409,
21
+ "grad_norm": 4.816498279571533,
22
+ "learning_rate": 3.213114754098361e-06,
23
+ "loss": 0.46380462646484377,
24
  "step": 50
25
  },
26
  {
27
+ "epoch": 0.12295081967213115,
28
+ "grad_norm": 5.105863571166992,
29
+ "learning_rate": 4.8524590163934435e-06,
30
+ "loss": 0.3628129577636719,
31
  "step": 75
32
  },
33
  {
34
+ "epoch": 0.16393442622950818,
35
+ "grad_norm": 3.479628324508667,
36
+ "learning_rate": 6.491803278688526e-06,
37
+ "loss": 0.3321057891845703,
38
  "step": 100
39
  },
40
  {
41
+ "epoch": 0.20491803278688525,
42
+ "grad_norm": 5.271286487579346,
43
+ "learning_rate": 8.131147540983607e-06,
44
+ "loss": 0.29200584411621094,
45
  "step": 125
46
  },
47
  {
48
+ "epoch": 0.2459016393442623,
49
+ "grad_norm": 2.9103591442108154,
50
+ "learning_rate": 9.770491803278689e-06,
51
+ "loss": 0.20765865325927735,
52
  "step": 150
53
  },
54
  {
55
+ "epoch": 0.28688524590163933,
56
+ "grad_norm": 5.441774845123291,
57
+ "learning_rate": 1.1409836065573771e-05,
58
+ "loss": 0.22107566833496095,
59
  "step": 175
60
  },
61
  {
62
+ "epoch": 0.32786885245901637,
63
+ "grad_norm": 17.181110382080078,
64
+ "learning_rate": 1.3049180327868853e-05,
65
+ "loss": 0.22224016189575196,
66
  "step": 200
67
  },
68
  {
69
+ "epoch": 0.36885245901639346,
70
+ "grad_norm": 5.543371677398682,
71
+ "learning_rate": 1.4688524590163935e-05,
72
+ "loss": 0.20201002120971678,
73
  "step": 225
74
  },
75
  {
76
+ "epoch": 0.4098360655737705,
77
+ "grad_norm": 5.812751770019531,
78
+ "learning_rate": 1.6327868852459016e-05,
79
+ "loss": 0.2299608612060547,
80
  "step": 250
81
  },
82
  {
83
+ "epoch": 0.45081967213114754,
84
+ "grad_norm": 2.845670461654663,
85
+ "learning_rate": 1.79672131147541e-05,
86
+ "loss": 0.21673511505126952,
87
  "step": 275
88
  },
89
  {
90
+ "epoch": 0.4918032786885246,
91
+ "grad_norm": 18.98938751220703,
92
+ "learning_rate": 1.9606557377049183e-05,
93
+ "loss": 0.21523273468017579,
94
  "step": 300
95
  },
96
  {
97
+ "epoch": 0.5327868852459017,
98
+ "grad_norm": 5.82402229309082,
99
+ "learning_rate": 1.9861566484517306e-05,
100
+ "loss": 0.15432929039001464,
101
  "step": 325
102
  },
103
  {
104
+ "epoch": 0.5737704918032787,
105
+ "grad_norm": 0.48278993368148804,
106
+ "learning_rate": 1.9679417122040073e-05,
107
+ "loss": 0.16572830200195313,
108
  "step": 350
109
  },
110
  {
111
+ "epoch": 0.6147540983606558,
112
+ "grad_norm": 5.946134567260742,
113
+ "learning_rate": 1.9497267759562843e-05,
114
+ "loss": 0.17543071746826172,
115
  "step": 375
116
  },
117
  {
118
+ "epoch": 0.6557377049180327,
119
+ "grad_norm": 13.494661331176758,
120
+ "learning_rate": 1.9315118397085614e-05,
121
+ "loss": 0.15327459335327148,
122
  "step": 400
123
  },
124
  {
125
+ "epoch": 0.6967213114754098,
126
+ "grad_norm": 9.058329582214355,
127
+ "learning_rate": 1.913296903460838e-05,
128
+ "loss": 0.19322505950927735,
129
  "step": 425
130
  },
131
  {
132
+ "epoch": 0.7377049180327869,
133
+ "grad_norm": 9.420262336730957,
134
+ "learning_rate": 1.895081967213115e-05,
135
+ "loss": 0.18702877044677735,
136
  "step": 450
137
  },
138
  {
139
+ "epoch": 0.7786885245901639,
140
+ "grad_norm": 0.29203546047210693,
141
+ "learning_rate": 1.8768670309653917e-05,
142
+ "loss": 0.13424430847167967,
143
  "step": 475
144
  },
145
  {
146
+ "epoch": 0.819672131147541,
147
+ "grad_norm": 5.464226245880127,
148
+ "learning_rate": 1.8586520947176687e-05,
149
+ "loss": 0.1403522300720215,
150
  "step": 500
151
  },
152
  {
153
+ "epoch": 0.860655737704918,
154
+ "grad_norm": 0.3305734395980835,
155
+ "learning_rate": 1.8404371584699454e-05,
156
+ "loss": 0.1768626403808594,
157
  "step": 525
158
  },
159
  {
160
+ "epoch": 0.9016393442622951,
161
+ "grad_norm": 6.791965007781982,
162
+ "learning_rate": 1.8222222222222224e-05,
163
+ "loss": 0.22005313873291016,
164
  "step": 550
165
  },
166
  {
167
+ "epoch": 0.9426229508196722,
168
+ "grad_norm": 3.971740484237671,
169
+ "learning_rate": 1.804007285974499e-05,
170
+ "loss": 0.17844413757324218,
171
  "step": 575
172
  },
173
  {
174
+ "epoch": 0.9836065573770492,
175
+ "grad_norm": 1.6787421703338623,
176
+ "learning_rate": 1.785792349726776e-05,
177
+ "loss": 0.16104537963867188,
178
  "step": 600
179
  },
180
  {
181
+ "epoch": 1.0,
182
+ "eval_accuracy": 0.9376199616122841,
183
+ "eval_f1": 0.6933962264150944,
184
+ "eval_loss": 0.17767289280891418,
185
+ "eval_precision": 0.6533333333333333,
186
+ "eval_recall": 0.7386934673366834,
187
+ "eval_roc_auc": 0.9544433040534235,
188
+ "eval_runtime": 8.7237,
189
+ "eval_samples_per_second": 238.89,
190
+ "eval_steps_per_second": 7.566,
191
  "step": 610
192
  }
193
  ],
194
  "logging_steps": 25,
195
+ "max_steps": 3050,
196
  "num_input_tokens_seen": 0,
197
  "num_train_epochs": 5,
198
  "save_steps": 500,
 
203
  "early_stopping_threshold": 0.0
204
  },
205
  "attributes": {
206
+ "early_stopping_patience_counter": 0
207
  }
208
  },
209
  "TrainerControl": {
 
217
  "attributes": {}
218
  }
219
  },
220
+ "total_flos": 1283061061463040.0,
221
  "train_batch_size": 16,
222
  "trial_name": null,
223
  "trial_params": null
transformer/checkpoint-610/training_args.bin CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:3c60366894b25ead0379e8d97e61f1123e1ad4786f5e41a8bc70f2d7bc8901f5
3
- size 5329
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:26170ff2d75562c83f88182ac9301ad3566752c384b38d15219d8c3352efbebb
3
+ size 5201
transformer/config.json CHANGED
@@ -31,16 +31,16 @@
31
  "pad_token_id": 1,
32
  "position_embedding_type": "absolute",
33
  "problem_type": "single_label_classification",
34
- "threshold": 0.4710787534713745,
35
  "tie_word_embeddings": true,
36
- "transformers_version": "5.9.0",
37
  "type_vocab_size": 1,
38
  "use_cache": false,
39
  "validation_threshold_report": {
40
- "f1": 0.829268292682927,
41
- "precision": 0.8,
42
- "recall": 0.8607594936708861,
43
- "threshold": 0.4710787534713745
44
  },
45
  "vocab_size": 250002
46
  }
 
31
  "pad_token_id": 1,
32
  "position_embedding_type": "absolute",
33
  "problem_type": "single_label_classification",
34
+ "threshold": 0.5436205267906189,
35
  "tie_word_embeddings": true,
36
+ "transformers_version": "5.10.2",
37
  "type_vocab_size": 1,
38
  "use_cache": false,
39
  "validation_threshold_report": {
40
+ "f1": 0.6983372921615203,
41
+ "precision": 0.6621621621621622,
42
+ "recall": 0.7386934673366834,
43
+ "threshold": 0.5436205267906189
44
  },
45
  "vocab_size": 250002
46
  }
transformer/model.safetensors CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:49a18c813f49f0f53eef5e1646a8e80f88eb366c956b09301312f1a23e9fe977
3
  size 1112205008
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:150736722d2137e368c215a6ded5ca83348b547235eed16e8939af8b89077765
3
  size 1112205008
transformer/test_predictions.csv CHANGED
The diff for this file is too large to render. See raw diff
 
transformer/training_args.bin CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:3c60366894b25ead0379e8d97e61f1123e1ad4786f5e41a8bc70f2d7bc8901f5
3
- size 5329
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:26170ff2d75562c83f88182ac9301ad3566752c384b38d15219d8c3352efbebb
3
+ size 5201
transformer/validation_predictions.csv CHANGED
The diff for this file is too large to render. See raw diff