language: en
tags:
  - crop-yield
  - agriculture
  - regression
  - classification
  - xgboost
  - tabular
license: mit
datasets:
  - fao
pipeline_tag: tabular-regression

🌾 CropYQ — Crop Yield & Quality Prediction Models

Repository: BirendraSharma/cropyq

This repository hosts two trained machine learning models for predicting agricultural crop yield and quality across India, Nepal, and the Netherlands, based on FAO crop production data.

📦 Models Included

File	Type	Description
`regression_model.pkl`	Scikit-learn Regressor	Predicts log1p of crop yield; apply `np.expm1` to the raw prediction to get kg/ha
`xgboostClassification_model.pkl`	XGBoost Classifier	Predicts crop quality as Low / Medium / High

🗂️ Input Features

Both models take the same 5-feature input vector, with the features in the order listed here:

Feature	Type	Description
`Area`	Encoded int	Country (India=0, Nepal=1, Netherlands=2)
`Item`	Encoded int	Crop type (38 categories, e.g. Wheat=36, Rice=27)
`Crop Group`	Encoded int	Cereal=0, Fruit=1, Oilseed=2, Pulse=3, Root=4, Vegetable=5
`Flag`	Encoded int	FAO data flag — A=0, E=1
`Year`	int	Year offset from 1961 (e.g. 2020 → 59)

🌍 Supported Areas

India
Nepal
Netherlands (Kingdom of the)

🌱 Supported Crops (38 total)

Apples, Bananas, Barley, Beans (dry), Broad beans, Cabbages, Carrots & turnips, Cassava, Cauliflowers & broccoli, Chick peas, Chillies & peppers, Eggplants, Grapes, Groundnuts, Lentils, Linseed, Maize (corn), Mangoes, Millet, Mustard seed, Oats, Onions & shallots, Oranges, Peas (dry), Pigeon peas, Potatoes, Rape/colza seed, Rice, Rye, Sesame seed, Sorghum, Soya beans, Sunflower seed, Sweet potatoes, Tomatoes, Triticale, Wheat, Yams
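
The two item codes documented in this README (Rice=27, Wheat=36) match the zero-based position of each crop in the alphabetical list above. The sketch below builds a lookup table on that assumption. The README does not state the rule for the other 36 crops, so the encoder used at training time remains the authority, and `SUPPORTED_CROPS` and `ITEM_CODES` are illustrative names rather than part of this repository.

```python
# Crop names exactly as listed in this README, in the same order.
SUPPORTED_CROPS = [
    "Apples", "Bananas", "Barley", "Beans (dry)", "Broad beans", "Cabbages",
    "Carrots & turnips", "Cassava", "Cauliflowers & broccoli", "Chick peas",
    "Chillies & peppers", "Eggplants", "Grapes", "Groundnuts", "Lentils",
    "Linseed", "Maize (corn)", "Mangoes", "Millet", "Mustard seed", "Oats",
    "Onions & shallots", "Oranges", "Peas (dry)", "Pigeon peas", "Potatoes",
    "Rape/colza seed", "Rice", "Rye", "Sesame seed", "Sorghum", "Soya beans",
    "Sunflower seed", "Sweet potatoes", "Tomatoes", "Triticale", "Wheat", "Yams",
]

# ASSUMPTION: each code is the crop's zero-based position in the list above.
# Only Rice=27 and Wheat=36 are confirmed by the README.
ITEM_CODES = {name: code for code, name in enumerate(SUPPORTED_CROPS)}

print(ITEM_CODES["Rice"], ITEM_CODES["Wheat"])
```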

🚀 Quickstart

Install dependencies

pip install huggingface_hub scikit-learn xgboost numpy

Load and use the models

import pickle
import numpy as np
from huggingface_hub import hf_hub_download

REPO_ID = "BirendraSharma/cropyq"

# Download and load regression model
reg_path = hf_hub_download(repo_id=REPO_ID, filename="regression_model.pkl")
with open(reg_path, "rb") as f:
    reg_model = pickle.load(f)

# Download and load classification model
clf_path = hf_hub_download(repo_id=REPO_ID, filename="xgboostClassification_model.pkl")
with open(clf_path, "rb") as f:
    clf_model = pickle.load(f)

# Example: Wheat in India, Cereal group, Flag A, Year 2020
# area=0 (India), item=36 (Wheat), cropgroup=0 (Cereal), flag=0 (A), year=2020-1961=59
inputs = np.array([[0, 36, 0, 0, 59]], dtype=np.float32)

# Predict yield (kg/ha) — model was trained on log1p target
log_yield = reg_model.predict(inputs)[0]
yield_kgha = np.expm1(log_yield)
print(f"Predicted Yield: {yield_kgha:.2f} kg/ha")

# Predict quality
quality_map = {0: "Low", 1: "Medium", 2: "High"}
quality_pred = clf_model.predict(inputs)[0]
print(f"Predicted Quality: {quality_map[int(quality_pred)]}")
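
To score several rows at once, the same five-column layout can be stacked into one matrix. The helper below is a hypothetical sketch (`yield_by_year` is not part of this repository) that works with any object exposing a scikit-learn style `predict`. The stand-in model exists only so the snippet runs without downloading anything.

```python
import math


def yield_by_year(model, area, item, crop_group, flag, years):
    """Return {year: predicted kg/ha} for one crop over several calendar years."""
    # One row per year, columns in the documented order:
    # Area, Item, Crop Group, Flag, Year (offset from 1961).
    rows = [[area, item, crop_group, flag, year - 1961] for year in years]
    log_preds = model.predict(rows)
    # The regressor predicts log1p(kg/ha), so expm1 undoes the transform.
    return {year: math.expm1(float(p)) for year, p in zip(years, log_preds)}


class _StandInModel:
    """Placeholder for reg_model; returns log1p(3000) for every row."""

    def predict(self, rows):
        return [math.log1p(3000.0) for _ in rows]


# Wheat (36) in India (0), Cereal (0), Flag A (0), for three years.
result = yield_by_year(_StandInModel(), 0, 36, 0, 0, [2018, 2019, 2020])
print(result)
```

With the real model, pass `reg_model` in place of the stand-in. Wrapping `rows` in `np.array(rows, dtype=np.float32)` mirrors the input type used in the quickstart.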

🖥️ Desktop GUI App

A Tkinter-based desktop app is available that provides a point-and-click interface for running predictions.

Run the app

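pip install huggingface_hub scikit-learn xgboost numpy
# tkinter is part of the Python standard library and cannot be installed with pip.
# If it is missing on Debian/Ubuntu, install it with: sudo apt install python3-tk
python crop_yield_app.py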

The app will automatically download both model files from this repository on first launch.

Features:

Dropdown selectors for Area, Item, Crop Group, and Flag
Text entry for Year
Predict Yield button → returns estimated kg/ha
Predict Quality button → returns Low / Medium / High

🔢 Encoding Reference

Area Encoding

Area	Code
India	0
Nepal	1
Netherlands (Kingdom of the)	2

Crop Group Encoding

Crop Group	Code
Cereal	0
Fruit	1
Oilseed	2
Pulse	3
Root	4
Vegetable	5

Flag Encoding

Flag	Code	Meaning
A	0	Official figure
E	1	Estimated value

Year Encoding

Year values are offset from 1961:

encoded_year = actual_year - 1961
# e.g. 2020 → 59,  1990 → 29,  1961 → 0
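
The encoding tables in this section can be combined into one small helper that turns human-readable values into a model input row. This is a minimal sketch: `encode_features` and the dictionary names are hypothetical, and the item code is passed in directly because the README documents only two item codes.

```python
# Lookup tables copied from the encoding reference in this README.
AREA_CODES = {"India": 0, "Nepal": 1, "Netherlands (Kingdom of the)": 2}
CROP_GROUP_CODES = {
    "Cereal": 0, "Fruit": 1, "Oilseed": 2,
    "Pulse": 3, "Root": 4, "Vegetable": 5,
}
FLAG_CODES = {"A": 0, "E": 1}
BASE_YEAR = 1961


def encode_features(area, item_code, crop_group, flag, year):
    """Build one input row in the order Area, Item, Crop Group, Flag, Year."""
    return [
        AREA_CODES[area],              # raises KeyError for unsupported areas
        item_code,                     # already-encoded crop, e.g. 36 for Wheat
        CROP_GROUP_CODES[crop_group],
        FLAG_CODES[flag],
        year - BASE_YEAR,              # calendar year to offset from 1961
    ]


row = encode_features("India", 36, "Cereal", "A", 2020)
print(row)  # [0, 36, 0, 0, 59]
```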

📊 Model Details

Regression Model (`regression_model.pkl`)

Task: Tabular regression
Target: Log-transformed crop yield (log1p(kg/ha)), back-transformed with expm1 at inference
Output: Yield in kg/ha
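
The target transform and its inverse can be checked with the standard library alone. The yield value below is illustrative, not a model output.

```python
import math

yield_kg_per_ha = 3000.0  # illustrative value only

# Training side: the regressor learns log1p(y) = ln(1 + y).
log_target = math.log1p(yield_kg_per_ha)

# Inference side: expm1(x) = exp(x) - 1 maps a prediction back to kg/ha.
recovered = math.expm1(log_target)

print(f"log1p target: {log_target:.4f}, recovered yield: {recovered:.1f} kg/ha")
```

Using `log1p` rather than a plain logarithm keeps the transform defined when a recorded yield is zero.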

Classification Model (`xgboostClassification_model.pkl`)

Task: Multi-class tabular classification
Framework: XGBoost
Output classes: Low (0), Medium (1), High (2)

📁 Repository Structure

BirendraSharma/cropyq/
├── regression_model.pkl             # Sklearn regression model
├── xgboostClassification_model.pkl  # XGBoost classification model
└── README.md                        # This file

📜 License

This project is licensed under the MIT License.

🙏 Acknowledgements

Data sourced from the FAO (Food and Agriculture Organization of the United Nations) crop production statistics.