language: en
tags:
- crop-yield
- agriculture
- regression
- classification
- xgboost
- tabular
license: mit
datasets:
- fao
pipeline_tag: tabular-regression
CropYQ: Crop Yield & Quality Prediction Models
Repository: BirendraSharma/cropyq
This repository hosts two trained machine learning models for predicting agricultural crop yield and quality across India, Nepal, and the Netherlands, based on FAO crop production data.
Models Included
| File | Type | Description |
|---|---|---|
| regression_model.pkl | Scikit-learn Regressor | Predicts crop yield in kg/ha (log-transformed target, inverse-transformed on output) |
| xgboostClassification_model.pkl | XGBoost Classifier | Predicts crop quality as Low / Medium / High |
Input Features
Both models share the same 5-feature input vector:
| Feature | Type | Description |
|---|---|---|
| Area | Encoded int | Country (India=0, Nepal=1, Netherlands=2) |
| Item | Encoded int | Crop type (38 categories, e.g. Wheat=36, Rice=27) |
| Crop Group | Encoded int | Cereal=0, Fruit=1, Oilseed=2, Pulse=3, Root=4, Vegetable=5 |
| Flag | Encoded int | FAO data flag: A=0, E=1 |
| Year | int | Year offset from 1961 (e.g. 2020 → 59) |
Supported Areas
- India
- Nepal
- Netherlands (Kingdom of the)
Supported Crops (38 total)
Apples, Bananas, Barley, Beans (dry), Broad beans, Cabbages, Carrots & turnips, Cassava, Cauliflowers & broccoli, Chick peas, Chillies & peppers, Eggplants, Grapes, Groundnuts, Lentils, Linseed, Maize (corn), Mangoes, Millet, Mustard seed, Oats, Onions & shallots, Oranges, Peas (dry), Pigeon peas, Potatoes, Rape/colza seed, Rice, Rye, Sesame seed, Sorghum, Soya beans, Sunflower seed, Sweet potatoes, Tomatoes, Triticale, Wheat, Yams
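The two item codes quoted in the feature table (Wheat=36, Rice=27) match the zero-based position of each crop in the alphabetical list above. Assuming the remaining codes follow the same ordering (this README does not state so, and the names `CROPS`, `ITEM_CODES`, and `item_code` are illustrative only), a lookup could be sketched as:

```python
# Assumption: Item codes are zero-based positions in the alphabetical crop list.
# This agrees with the two codes documented in this README (Rice=27, Wheat=36);
# the other 36 codes are inferred and should be checked against the training pipeline.
CROPS = [
    "Apples", "Bananas", "Barley", "Beans (dry)", "Broad beans", "Cabbages",
    "Carrots & turnips", "Cassava", "Cauliflowers & broccoli", "Chick peas",
    "Chillies & peppers", "Eggplants", "Grapes", "Groundnuts", "Lentils",
    "Linseed", "Maize (corn)", "Mangoes", "Millet", "Mustard seed", "Oats",
    "Onions & shallots", "Oranges", "Peas (dry)", "Pigeon peas", "Potatoes",
    "Rape/colza seed", "Rice", "Rye", "Sesame seed", "Sorghum", "Soya beans",
    "Sunflower seed", "Sweet potatoes", "Tomatoes", "Triticale", "Wheat", "Yams",
]

# Map each crop name to its assumed integer code.
ITEM_CODES = {name: code for code, name in enumerate(CROPS)}


def item_code(name: str) -> int:
    """Return the assumed integer code for a crop name."""
    try:
        return ITEM_CODES[name]
    except KeyError:
        raise ValueError(f"Unsupported crop: {name!r}") from None


print(item_code("Wheat"))  # 36
print(item_code("Rice"))   # 27
```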
Quickstart
Install dependencies
pip install huggingface_hub scikit-learn xgboost numpy
Load and use the models
import pickle
import numpy as np
from huggingface_hub import hf_hub_download
REPO_ID = "BirendraSharma/cropyq"
# Download and load regression model
reg_path = hf_hub_download(repo_id=REPO_ID, filename="regression_model.pkl")
with open(reg_path, "rb") as f:
    reg_model = pickle.load(f)
# Download and load classification model
clf_path = hf_hub_download(repo_id=REPO_ID, filename="xgboostClassification_model.pkl")
with open(clf_path, "rb") as f:
    clf_model = pickle.load(f)
# Example: Wheat in India, Cereal group, Flag A, Year 2020
# area=0 (India), item=36 (Wheat), cropgroup=0 (Cereal), flag=0 (A), year=2020-1961=59
inputs = np.array([[0, 36, 0, 0, 59]], dtype=np.float32)
# Predict yield (kg/ha): the model was trained on a log1p target
log_yield = reg_model.predict(inputs)[0]
yield_kgha = np.expm1(log_yield)
print(f"Predicted Yield: {yield_kgha:.2f} kg/ha")
# Predict quality
quality_map = {0: "Low", 1: "Medium", 2: "High"}
quality_pred = clf_model.predict(inputs)[0]
print(f"Predicted Quality: {quality_map[int(quality_pred)]}")
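Because Year is just another input column, the same regressor can be queried for several years in one call. The helper below is a minimal sketch of that idea; `yield_trend` and `BASE_YEAR` are hypothetical names, and it assumes the loaded model accepts a nested list of rows, as scikit-learn estimators generally do.

```python
import math

BASE_YEAR = 1961  # Year feature is an offset from 1961, per the feature table


def yield_trend(model, area, item, crop_group, flag, years):
    """Predict yield in kg/ha for one area/crop combination over several years.

    `model` is any object whose predict(rows) returns one log1p-scale value
    per row, such as the regressor loaded in the quickstart.
    """
    years = list(years)
    # One row per calendar year, in the documented feature order.
    rows = [[area, item, crop_group, flag, year - BASE_YEAR] for year in years]
    log_yields = model.predict(rows)
    # Undo the log1p transform applied to the training target.
    return {year: math.expm1(float(v)) for year, v in zip(years, log_yields)}
```

With the regressor from the quickstart, `yield_trend(reg_model, 0, 36, 0, 0, range(2015, 2021))` would return a dictionary mapping each year to a predicted wheat yield for India.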
Desktop GUI App
A Tkinter-based desktop app is available that provides a point-and-click interface for running predictions.
Run the app
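pip install huggingface_hub scikit-learn xgboost numpy
# Tkinter is part of the Python standard library and cannot be installed with pip.
# If it is missing on Debian/Ubuntu, install the python3-tk system package.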
python crop_yield_app.py
The app will automatically download both model files from this repository on first launch.
Features:
- Dropdown selectors for Area, Item, Crop Group, and Flag
- Text entry for Year
- Predict Yield button: returns estimated kg/ha
- Predict Quality button: returns Low / Medium / High
Encoding Reference
Area Encoding
| Area | Code |
|---|---|
| India | 0 |
| Nepal | 1 |
| Netherlands (Kingdom of the) | 2 |
Crop Group Encoding
| Crop Group | Code |
|---|---|
| Cereal | 0 |
| Fruit | 1 |
| Oilseed | 2 |
| Pulse | 3 |
| Root | 4 |
| Vegetable | 5 |
Flag Encoding
| Flag | Code | Meaning |
|---|---|---|
| A | 0 | Official figure |
| E | 1 | Estimated value |
Year Encoding
Year values are offset from 1961:
encoded_year = actual_year - 1961
# e.g. 2020 → 59, 1990 → 29, 1961 → 0
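The encoding tables above can be combined into a single helper that builds one input row in the order used by the quickstart (Area, Item, Crop Group, Flag, Year). This is a sketch only: `encode_row` and the dictionary names are hypothetical, and the item code is passed in as an integer because this README lists only two item codes explicitly.

```python
# Codes copied from the encoding tables in this README.
AREA_CODES = {"India": 0, "Nepal": 1, "Netherlands (Kingdom of the)": 2}
CROP_GROUP_CODES = {
    "Cereal": 0, "Fruit": 1, "Oilseed": 2,
    "Pulse": 3, "Root": 4, "Vegetable": 5,
}
FLAG_CODES = {"A": 0, "E": 1}
BASE_YEAR = 1961


def encode_row(area: str, item_code: int, crop_group: str, flag: str, year: int) -> list:
    """Return [area, item, crop_group, flag, year_offset] as integers."""
    if not 0 <= item_code <= 37:
        # 38 crop categories are documented, so valid codes are assumed to be 0..37.
        raise ValueError(f"Item code out of range: {item_code}")
    return [
        AREA_CODES[area],
        item_code,
        CROP_GROUP_CODES[crop_group],
        FLAG_CODES[flag],
        year - BASE_YEAR,
    ]


# Wheat (36) in India, official figure, 2020
print(encode_row("India", 36, "Cereal", "A", 2020))  # [0, 36, 0, 0, 59]
```

Unknown area, crop group, or flag names raise `KeyError`, which keeps unsupported inputs from reaching the models silently.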
Model Details
Regression Model (regression_model.pkl)
- Task: Tabular regression
- Target: Log-transformed crop yield (log1p(kg/ha)), back-transformed with expm1 at inference
- Output: Yield in kg/ha
Classification Model (xgboostClassification_model.pkl)
- Task: Multi-class tabular classification
- Framework: XGBoost
- Output classes: Low (0), Medium (1), High (2)
Repository Structure
BirendraSharma/cropyq/
├── regression_model.pkl               # Sklearn regression model
├── xgboostClassification_model.pkl    # XGBoost classification model
└── README.md                          # This file
License
This project is licensed under the MIT License.
Acknowledgements
Data sourced from the FAO (Food and Agriculture Organization of the United Nations) crop production statistics.