	# AMIS Commodity Classifier Training Report

	- Dataset: `faodl/amis-agri-utilization`
	- Dataset subset: (none specified)
	- Dataset revision: `ada4a04088a98f8f64bc7485c57d4c7f422c2151`
	- Text column: `chunk_text`
	- Label column: `label`
	- Transformer: `FacebookAI/xlm-roberta-base`
	- Generated at: `2026-06-10T20:30:54.345579+00:00`
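For reference, the pinned dataset revision listed above could be loaded with the Hugging Face `datasets` library roughly as follows. This is a minimal sketch, not the report's own training code: it assumes the dataset is accessible from your environment and that the splits are named `train`, `validation`, and `test` as in the summary table. The helper name `load_split` is hypothetical.

```python
# Identifiers copied from the report header.
DATASET_ID = "faodl/amis-agri-utilization"
DATASET_REVISION = "ada4a04088a98f8f64bc7485c57d4c7f422c2151"
TEXT_COLUMN = "chunk_text"
LABEL_COLUMN = "label"


def load_split(split: str):
    """Load one split at the pinned revision (hypothetical helper).

    Requires the `datasets` package and network access. The split names
    are assumed to match the report's Dataset Summary table.
    """
    # Imported lazily so the constants above can be used without the package.
    from datasets import load_dataset

    return load_dataset(DATASET_ID, split=split, revision=DATASET_REVISION)
```

Calling `load_split("train")` should return a dataset whose columns include `chunk_text` and `label`, assuming the column names in the header are accurate. Pinning `revision` keeps the row counts comparable with the ones reported here.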

	## Dataset Summary

	| Split | Rows | Label 0 | Label 1 | Unique groups | Mean text length |
	| --- | ---: | ---: | ---: | ---: | ---: |
	| train | 4877 | 4347 | 530 | 2513 | 696.6 |
	| validation | 978 | 899 | 79 | 538 | 690.6 |
	| test | 1016 | 904 | 112 | 539 | 690.7 |
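The label counts above imply a strong class imbalance, which matters when reading the accuracy columns later in the report. The sketch below recomputes the positive rate per split and the accuracy of a classifier that always predicts label 0. The counts are copied from the table; the function name `class_balance` is hypothetical.

```python
# (rows, label 0, label 1) per split, copied from the Dataset Summary table.
SPLITS = {
    "train": (4877, 4347, 530),
    "validation": (978, 899, 79),
    "test": (1016, 904, 112),
}


def class_balance(rows: int, negatives: int, positives: int) -> dict:
    """Return the positive rate and the accuracy of always predicting label 0."""
    assert negatives + positives == rows, "label counts must sum to the row count"
    return {
        "positive_rate": positives / rows,
        "majority_baseline_accuracy": negatives / rows,
    }


for name, counts in SPLITS.items():
    stats = class_balance(*counts)
    print(
        f"{name}: positive rate {stats['positive_rate']:.3f}, "
        f"always-negative accuracy {stats['majority_baseline_accuracy']:.3f}"
    )
```

Label 1 makes up roughly 11% of train, 8% of validation, and 11% of test. An always-negative classifier would already reach about 0.890 accuracy on the test split, so precision, recall, and F1 are more informative than accuracy here.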

	## Threshold Comparison on Validation Split

	Validation metrics document threshold selection and tuning behavior; test metrics remain the primary estimate of out-of-sample performance.

	| Model | Threshold | Accuracy | Precision | Recall | F1 | ROC AUC | Average precision |
	| --- | ---: | ---: | ---: | ---: | ---: | ---: | ---: |
	| logistic_tfidf | 0.500 | 0.912 | 0.465 | 0.582 | 0.517 | 0.872 | 0.594 |
	| logistic_tfidf | 0.608 | 0.942 | 0.696 | 0.494 | 0.578 | 0.872 | 0.594 |
	| xgboost_tfidf | 0.500 | 0.945 | 0.931 | 0.342 | 0.500 | 0.823 | 0.588 |
	| xgboost_tfidf | 0.177 | 0.934 | 0.592 | 0.570 | 0.581 | 0.823 | 0.588 |
	| embedding-logistic_sentence_embeddings | 0.500 | 0.912 | 0.476 | 0.861 | 0.613 | 0.953 | 0.762 |
	| embedding-logistic_sentence_embeddings | 0.722 | 0.957 | 0.703 | 0.810 | 0.753 | 0.953 | 0.762 |
	| embedding-svm_sentence_embeddings | 0.500 | 0.955 | 0.807 | 0.582 | 0.676 | 0.952 | 0.754 |
	| embedding-svm_sentence_embeddings | 0.310 | 0.957 | 0.713 | 0.785 | 0.747 | 0.952 | 0.754 |
	| embedding-lightgbm_sentence_embeddings | 0.500 | 0.954 | 0.750 | 0.646 | 0.694 | 0.948 | 0.782 |
	| embedding-lightgbm_sentence_embeddings | 0.042 | 0.952 | 0.670 | 0.797 | 0.728 | 0.948 | 0.782 |
	| transformer | 0.500 | 0.964 | 0.739 | 0.861 | 0.795 | 0.970 | 0.874 |
	| transformer | 0.853 | 0.970 | 0.812 | 0.823 | 0.818 | 0.970 | 0.874 |
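The tuned-threshold rows above could come from a sweep like the one below. This is a minimal sketch under stated assumptions, not the report's actual tuning code: it assumes the threshold is chosen to maximize F1 on the validation split, that a row is predicted RELEVANT when its score is `>=` the threshold, and that the candidate thresholds are the distinct validation scores. The function names and the toy labels and scores are hypothetical.

```python
from typing import Sequence, Tuple


def f1_at(labels: Sequence[int], scores: Sequence[float], threshold: float) -> float:
    """F1 for the positive class when predicting 1 if score >= threshold."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def tune_threshold(labels: Sequence[int], scores: Sequence[float]) -> Tuple[float, float]:
    """Return (threshold, F1) maximizing F1; ties keep the lowest threshold."""
    best_t, best_f1 = 0.5, -1.0
    for t in sorted(set(scores)):
        candidate_f1 = f1_at(labels, scores, t)
        if candidate_f1 > best_f1:
            best_t, best_f1 = t, candidate_f1
    return best_t, best_f1


# Toy validation data (hypothetical, not taken from the report).
val_labels = [0, 0, 0, 0, 1, 0, 1, 1]
val_scores = [0.05, 0.10, 0.20, 0.55, 0.60, 0.70, 0.80, 0.90]

threshold, f1 = tune_threshold(val_labels, val_scores)
print(f"default 0.50 -> F1 {f1_at(val_labels, val_scores, 0.5):.3f}")
print(f"tuned {threshold:.2f} -> F1 {f1:.3f}")
```

On the toy data, raising the cutoff from 0.50 to 0.60 removes one false positive without losing a true positive, so F1 rises from 0.750 to 0.857. The threshold found this way is fit to the validation split, which is why the report evaluates it again on the test split.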

	## Threshold Comparison on Test Split

	| Model | Threshold | Accuracy | Precision | Recall | F1 | ROC AUC | Average precision |
	| --- | ---: | ---: | ---: | ---: | ---: | ---: | ---: |
	| logistic_tfidf | 0.500 | 0.926 | 0.691 | 0.598 | 0.641 | 0.899 | 0.726 |
	| logistic_tfidf | 0.608 | 0.930 | 0.902 | 0.411 | 0.564 | 0.899 | 0.726 |
	| xgboost_tfidf | 0.500 | 0.924 | 1.000 | 0.312 | 0.476 | 0.892 | 0.692 |
	| xgboost_tfidf | 0.177 | 0.918 | 0.663 | 0.527 | 0.587 | 0.892 | 0.692 |
	| embedding-logistic_sentence_embeddings | 0.500 | 0.891 | 0.503 | 0.884 | 0.641 | 0.955 | 0.710 |
	| embedding-logistic_sentence_embeddings | 0.722 | 0.935 | 0.689 | 0.750 | 0.718 | 0.955 | 0.710 |
	| embedding-svm_sentence_embeddings | 0.500 | 0.930 | 0.741 | 0.562 | 0.640 | 0.956 | 0.704 |
	| embedding-svm_sentence_embeddings | 0.310 | 0.934 | 0.686 | 0.741 | 0.712 | 0.956 | 0.704 |
	| embedding-lightgbm_sentence_embeddings | 0.500 | 0.937 | 0.740 | 0.661 | 0.698 | 0.960 | 0.791 |
	| embedding-lightgbm_sentence_embeddings | 0.042 | 0.929 | 0.639 | 0.821 | 0.719 | 0.960 | 0.791 |
	| transformer | 0.500 | 0.939 | 0.689 | 0.812 | 0.746 | 0.968 | 0.794 |
	| transformer | 0.853 | 0.947 | 0.754 | 0.768 | 0.761 | 0.968 | 0.794 |
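Within each model, the two rows share the same ROC AUC and average precision because both metrics depend only on how the scores rank the examples, not on where the cutoff sits. The sketch below shows this for ROC AUC using the standard pairwise formulation: the fraction of (positive, negative) pairs in which the positive example scores higher, with ties counted as one half. The toy data is hypothetical, and the report's own metric code is not shown here.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Pairwise ROC AUC: P(score of a positive > score of a negative)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("ROC AUC needs at least one example of each class")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))


# Toy data (hypothetical, not taken from the report).
toy_labels = [0, 0, 0, 0, 1, 0, 1, 1]
toy_scores = [0.05, 0.10, 0.20, 0.55, 0.60, 0.70, 0.80, 0.90]

# No threshold argument appears anywhere: the value is a property of the ranking.
print(f"ROC AUC = {roc_auc(toy_labels, toy_scores):.3f}")
```

Here 14 of the 15 positive-negative pairs are ordered correctly, giving about 0.933. The quadratic loop is fine for illustration; a library routine would be the practical choice for full splits.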

	## Confusion Matrices on Test Split

	Rows are true labels and columns are predicted labels.

	### logistic_tfidf at threshold 0.500

	| True / Predicted | NOT_RELEVANT | RELEVANT |
	| --- | ---: | ---: |
	| NOT_RELEVANT | 874 | 30 |
	| RELEVANT | 45 | 67 |

	### logistic_tfidf at threshold 0.608

	| True / Predicted | NOT_RELEVANT | RELEVANT |
	| --- | ---: | ---: |
	| NOT_RELEVANT | 899 | 5 |
	| RELEVANT | 66 | 46 |

	### xgboost_tfidf at threshold 0.500

	| True / Predicted | NOT_RELEVANT | RELEVANT |
	| --- | ---: | ---: |
	| NOT_RELEVANT | 904 | 0 |
	| RELEVANT | 77 | 35 |

	### xgboost_tfidf at threshold 0.177

	| True / Predicted | NOT_RELEVANT | RELEVANT |
	| --- | ---: | ---: |
	| NOT_RELEVANT | 874 | 30 |
	| RELEVANT | 53 | 59 |

	### embedding-logistic_sentence_embeddings at threshold 0.500

	| True / Predicted | NOT_RELEVANT | RELEVANT |
	| --- | ---: | ---: |
	| NOT_RELEVANT | 806 | 98 |
	| RELEVANT | 13 | 99 |

	### embedding-logistic_sentence_embeddings at threshold 0.722

	| True / Predicted | NOT_RELEVANT | RELEVANT |
	| --- | ---: | ---: |
	| NOT_RELEVANT | 866 | 38 |
	| RELEVANT | 28 | 84 |

	### embedding-svm_sentence_embeddings at threshold 0.500

	| True / Predicted | NOT_RELEVANT | RELEVANT |
	| --- | ---: | ---: |
	| NOT_RELEVANT | 882 | 22 |
	| RELEVANT | 49 | 63 |

	### embedding-svm_sentence_embeddings at threshold 0.310

	| True / Predicted | NOT_RELEVANT | RELEVANT |
	| --- | ---: | ---: |
	| NOT_RELEVANT | 866 | 38 |
	| RELEVANT | 29 | 83 |

	### embedding-lightgbm_sentence_embeddings at threshold 0.500

	| True / Predicted | NOT_RELEVANT | RELEVANT |
	| --- | ---: | ---: |
	| NOT_RELEVANT | 878 | 26 |
	| RELEVANT | 38 | 74 |

	### embedding-lightgbm_sentence_embeddings at threshold 0.042

	| True / Predicted | NOT_RELEVANT | RELEVANT |
	| --- | ---: | ---: |
	| NOT_RELEVANT | 852 | 52 |
	| RELEVANT | 20 | 92 |

	### transformer at threshold 0.500

	| True / Predicted | NOT_RELEVANT | RELEVANT |
	| --- | ---: | ---: |
	| NOT_RELEVANT | 863 | 41 |
	| RELEVANT | 21 | 91 |

	### transformer at threshold 0.853

	| True / Predicted | NOT_RELEVANT | RELEVANT |
	| --- | ---: | ---: |
	| NOT_RELEVANT | 876 | 28 |
	| RELEVANT | 26 | 86 |
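Each confusion matrix above determines the corresponding accuracy, precision, recall, and F1 in the test-split table. As a worked check, the sketch below recomputes those four metrics for `transformer` at threshold 0.853 from its matrix, treating RELEVANT as the positive class. The function name `metrics_from_confusion` is hypothetical.

```python
def metrics_from_confusion(tn: int, fp: int, fn: int, tp: int) -> dict:
    """Compute accuracy, precision, recall, and F1 for the positive class."""
    total = tn + fp + fn + tp
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 0.0
    return {
        "accuracy": (tn + tp) / total,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# transformer at threshold 0.853, read row by row from the matrix:
# true NOT_RELEVANT -> (TN=876, FP=28); true RELEVANT -> (FN=26, TP=86).
result = metrics_from_confusion(tn=876, fp=28, fn=26, tp=86)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

The rounded values (accuracy 0.947, precision 0.754, recall 0.768, F1 0.761) agree with the `transformer` row at threshold 0.853 in the test-split table.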


	## Validation-Tuned Thresholds

	- `logistic_tfidf`: threshold `0.608` (validation F1 `0.578`); test F1 change vs 0.5: `-0.077`.
	- `xgboost_tfidf`: threshold `0.177` (validation F1 `0.581`); test F1 change vs 0.5: `+0.111`.
	- `embedding-logistic_sentence_embeddings`: threshold `0.722` (validation F1 `0.753`); test F1 change vs 0.5: `+0.077`.
	- `embedding-svm_sentence_embeddings`: threshold `0.310` (validation F1 `0.747`); test F1 change vs 0.5: `+0.073`.
	- `embedding-lightgbm_sentence_embeddings`: threshold `0.042` (validation F1 `0.728`); test F1 change vs 0.5: `+0.021`.
	- `transformer`: threshold `0.853` (validation F1 `0.818`); test F1 change vs 0.5: `+0.015`.
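The test F1 changes listed above can be reproduced from the test-split confusion matrices. The sketch below does so with unrounded F1 values, which explains why `embedding-svm_sentence_embeddings` shows `+0.073` even though subtracting the rounded table entries (0.712 minus 0.640) gives 0.072. The counts are copied from the report; the names `TEST_COUNTS`, `f1`, and `f1_change` are hypothetical.

```python
# (FP, FN, TP) on the test split, copied from the confusion matrices.
# First tuple: threshold 0.5. Second tuple: validation-tuned threshold.
TEST_COUNTS = {
    "logistic_tfidf": ((30, 45, 67), (5, 66, 46)),
    "xgboost_tfidf": ((0, 77, 35), (30, 53, 59)),
    "embedding-logistic_sentence_embeddings": ((98, 13, 99), (38, 28, 84)),
    "embedding-svm_sentence_embeddings": ((22, 49, 63), (38, 29, 83)),
    "embedding-lightgbm_sentence_embeddings": ((26, 38, 74), (52, 20, 92)),
    "transformer": ((41, 21, 91), (28, 26, 86)),
}


def f1(fp: int, fn: int, tp: int) -> float:
    """F1 for the positive class from false positives, false negatives, true positives."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def f1_change(model: str) -> float:
    """Test F1 at the tuned threshold minus test F1 at threshold 0.5."""
    default_counts, tuned_counts = TEST_COUNTS[model]
    return f1(*tuned_counts) - f1(*default_counts)


for model in TEST_COUNTS:
    print(f"{model}: {f1_change(model):+.3f}")
```

Only `logistic_tfidf` loses test F1 after tuning: its tuned threshold cuts false positives from 30 to 5 but raises false negatives from 45 to 66.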

	## Artifacts

	- `logistic_tfidf`: `/content/agri-utilization-classifier/baselines/logistic`
	- `xgboost_tfidf`: `/content/agri-utilization-classifier/baselines/xgboost`
	- `embedding-logistic_sentence_embeddings`: `/content/agri-utilization-classifier/baselines/embedding-logistic`
	- `embedding-svm_sentence_embeddings`: `/content/agri-utilization-classifier/baselines/embedding-svm`
	- `embedding-lightgbm_sentence_embeddings`: `/content/agri-utilization-classifier/baselines/embedding-lightgbm`
	- `transformer`: `/content/agri-utilization-classifier/transformer`
