Commit 9992228 (verified) by BirendraSharma
Parent: 98d6cc8

Update README.md

  1. README.md +210 -3
README.md CHANGED
@@ -1,3 +1,210 @@
1
- ---
2
- license: mit
3
- ---
1
+ ---
2
+ language: en
3
+ tags:
4
+ - crop-yield
5
+ - agriculture
6
+ - regression
7
+ - classification
8
+ - xgboost
9
+ - tabular
10
+ license: mit
11
+ datasets:
12
+ - fao
13
+ pipeline_tag: tabular-regression
14
+ ---
15
+
16
+ # 🌾 CropYQ: Crop Yield & Quality Prediction Models
17
+
18
+ **Repository:** `BirendraSharma/cropyq`
19
+
20
+ This repository hosts two trained machine learning models for predicting agricultural crop yield and quality across India, Nepal, and the Netherlands, based on FAO crop production data.
21
+
22
+ ---
23
+
24
+ ## 📦 Models Included
25
+
26
+ | File | Type | Description |
27
+ |------|------|-------------|
28
+ | `regression_model.pkl` | Scikit-learn Regressor | Predicts `log1p` of crop yield; apply `np.expm1` to the raw output to obtain **kg/ha** |
29
+ | `xgboostClassification_model.pkl` | XGBoost Classifier | Predicts crop quality as **Low / Medium / High** |
30
+
31
+ ---
32
+
33
+ ## 🗂️ Input Features
34
+
35
+ Both models take the same 5-feature input vector, with the features in the order listed here:
36
+
37
+ | Feature | Type | Description |
38
+ |---------|------|-------------|
39
+ | `Area` | Encoded int | Country (India=0, Nepal=1, Netherlands=2) |
40
+ | `Item` | Encoded int | Crop type (38 categories, e.g. Wheat=36, Rice=27) |
41
+ | `Crop Group` | Encoded int | Cereal=0, Fruit=1, Oilseed=2, Pulse=3, Root=4, Vegetable=5 |
42
+ | `Flag` | Encoded int | FAO data flag: A=0, E=1 |
43
+ | `Year` | int | Year offset from 1961 (e.g. 2020 → 59) |
44
+
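The encoded vector can be built from human-readable values with a small helper. This is a minimal sketch and not part of the repository: `encode_features` and the dictionary names are hypothetical, the code tables are copied from this README, and the `Item` code is passed in directly as an integer. Rejecting years before 1961 is an assumption based on the stated offset.

```python
import numpy as np

# Code tables copied from this README; the names are illustrative only.
AREA_CODES = {"India": 0, "Nepal": 1, "Netherlands (Kingdom of the)": 2}
CROP_GROUP_CODES = {
    "Cereal": 0, "Fruit": 1, "Oilseed": 2,
    "Pulse": 3, "Root": 4, "Vegetable": 5,
}
FLAG_CODES = {"A": 0, "E": 1}
BASE_YEAR = 1961


def encode_features(area, item_code, crop_group, flag, year):
    """Return a (1, 5) float32 array ordered Area, Item, Crop Group, Flag, Year."""
    # Assumption: years before the 1961 base are outside the supported range.
    if year < BASE_YEAR:
        raise ValueError(f"year must be >= {BASE_YEAR}, got {year}")
    row = [
        AREA_CODES[area],
        int(item_code),
        CROP_GROUP_CODES[crop_group],
        FLAG_CODES[flag],
        year - BASE_YEAR,
    ]
    return np.array([row], dtype=np.float32)


# Wheat (Item code 36) in India, Cereal group, Flag A, year 2020.
features = encode_features("India", 36, "Cereal", "A", 2020)
print(features)
```

An unknown area, crop group, or flag raises a `KeyError`, which surfaces typos before they reach the model.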
45
+ ---
46
+
47
+ ## 🌍 Supported Areas
48
+
49
+ - India
50
+ - Nepal
51
+ - Netherlands (Kingdom of the)
52
+
53
+ ---
54
+
55
+ ## 🌱 Supported Crops (38 total)
56
+
57
+ Apples, Bananas, Barley, Beans (dry), Broad beans, Cabbages, Carrots & turnips, Cassava, Cauliflowers & broccoli, Chick peas, Chillies & peppers, Eggplants, Grapes, Groundnuts, Lentils, Linseed, Maize (corn), Mangoes, Millet, Mustard seed, Oats, Onions & shallots, Oranges, Peas (dry), Pigeon peas, Potatoes, Rape/colza seed, Rice, Rye, Sesame seed, Sorghum, Soya beans, Sunflower seed, Sweet potatoes, Tomatoes, Triticale, Wheat, Yams
58
+
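The README states only two `Item` codes explicitly (Wheat=36, Rice=27). Both match the 0-based position of the crop in the alphabetical list above, so the sketch below assumes every code follows list position. That is an inference, not something the repository confirms, so check it against the label encoder used in training. `CROPS`, `ITEM_CODES`, and `item_code` are hypothetical names.

```python
# Crop names in the order they appear in this README.
CROPS = [
    "Apples", "Bananas", "Barley", "Beans (dry)", "Broad beans", "Cabbages",
    "Carrots & turnips", "Cassava", "Cauliflowers & broccoli", "Chick peas",
    "Chillies & peppers", "Eggplants", "Grapes", "Groundnuts", "Lentils",
    "Linseed", "Maize (corn)", "Mangoes", "Millet", "Mustard seed", "Oats",
    "Onions & shallots", "Oranges", "Peas (dry)", "Pigeon peas", "Potatoes",
    "Rape/colza seed", "Rice", "Rye", "Sesame seed", "Sorghum", "Soya beans",
    "Sunflower seed", "Sweet potatoes", "Tomatoes", "Triticale", "Wheat", "Yams",
]

# Assumption: the Item code is the crop's 0-based position in the list.
ITEM_CODES = {name: code for code, name in enumerate(CROPS)}


def item_code(name):
    """Look up the assumed Item code for a crop name from the list above."""
    try:
        return ITEM_CODES[name]
    except KeyError:
        raise ValueError(f"Unsupported crop: {name!r}") from None


print(item_code("Wheat"), item_code("Rice"))  # 36 27
```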
59
+ ---
60
+
61
+ ## 🚀 Quickstart
62
+
63
+ ### Install dependencies
64
+
65
+ ```bash
66
+ pip install huggingface_hub scikit-learn xgboost numpy
67
+ ```
68
+
69
+ ### Load and use the models
70
+
71
+ ```python
72
+ import pickle
73
+ import numpy as np
74
+ from huggingface_hub import hf_hub_download
75
+
76
+ REPO_ID = "BirendraSharma/cropyq"
77
+
78
+ # Download and load regression model
79
+ reg_path = hf_hub_download(repo_id=REPO_ID, filename="regression_model.pkl")
80
+ with open(reg_path, "rb") as f:
81
+ reg_model = pickle.load(f)
82
+
83
+ # Download and load classification model
84
+ clf_path = hf_hub_download(repo_id=REPO_ID, filename="xgboostClassification_model.pkl")
85
+ with open(clf_path, "rb") as f:
86
+ clf_model = pickle.load(f)
87
+
88
+ # Example: Wheat in India, Cereal group, Flag A, Year 2020
89
+ # area=0 (India), item=36 (Wheat), cropgroup=0 (Cereal), flag=0 (A), year=2020-1961=59
90
+ inputs = np.array([[0, 36, 0, 0, 59]], dtype=np.float32)
91
+
92
+ # Predict yield (kg/ha); the model was trained on a log1p target
93
+ log_yield = reg_model.predict(inputs)[0]
94
+ yield_kgha = np.expm1(log_yield)
95
+ print(f"Predicted Yield: {yield_kgha:.2f} kg/ha")
96
+
97
+ # Predict quality
98
+ quality_map = {0: "Low", 1: "Medium", 2: "High"}
99
+ quality_pred = clf_model.predict(inputs)[0]
100
+ print(f"Predicted Quality: {quality_map[int(quality_pred)]}")
101
+ ```
102
+
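The Quickstart scores a single row. Both `predict` calls also accept several rows at once, so a batch helper is a natural extension. The sketch below is illustrative only: `predict_batch` and `QUALITY_LABELS` are hypothetical names, and the models are passed in as arguments, so the function works with any objects that expose a scikit-learn style `predict` method.

```python
import numpy as np

# Class ids as documented in this README.
QUALITY_LABELS = {0: "Low", 1: "Medium", 2: "High"}


def predict_batch(reg_model, clf_model, rows):
    """Return (yields in kg/ha, quality labels) for a list of encoded rows."""
    X = np.asarray(rows, dtype=np.float32)
    if X.ndim != 2 or X.shape[1] != 5:
        raise ValueError(f"expected shape (n, 5), got {X.shape}")
    # The regressor predicts log1p(yield), so undo the transform with expm1.
    yields = np.expm1(reg_model.predict(X))
    # Map integer class ids to their labels.
    labels = [QUALITY_LABELS[int(c)] for c in clf_model.predict(X)]
    return yields, labels
```

With the models loaded as in the Quickstart, `predict_batch(reg_model, clf_model, [[0, 36, 0, 0, y - 1961] for y in (2000, 2010, 2020)])` would score wheat in India for three years in one call.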
103
+ ---
104
+
105
+ ## 🖥️ Desktop GUI App
106
+
107
+ A Tkinter-based desktop app (`crop_yield_app.py`) provides a point-and-click interface for running predictions.
108
+
109
+ ### Run the app
110
+
111
+ ```bash
112
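+ pip install huggingface_hub scikit-learn xgboost numpy  # tkinter is not a pip package; it ships with most Python installers (Debian/Ubuntu: sudo apt install python3-tk)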
113
+ python crop_yield_app.py
114
+ ```
115
+
116
+ The app will automatically download both model files from this repository on first launch.
117
+
118
+ **Features:**
119
+ - Dropdown selectors for Area, Item, Crop Group, and Flag
120
+ - Text entry for Year
121
+ - **Predict Yield** button → returns estimated kg/ha
122
+ - **Predict Quality** button → returns Low / Medium / High
123
+
124
+ ---
125
+
126
+ ## 🔒 Encoding Reference
127
+
128
+ <details>
129
+ <summary>Area Encoding</summary>
130
+
131
+ | Area | Code |
132
+ |------|------|
133
+ | India | 0 |
134
+ | Nepal | 1 |
135
+ | Netherlands (Kingdom of the) | 2 |
136
+
137
+ </details>
138
+
139
+ <details>
140
+ <summary>Crop Group Encoding</summary>
141
+
142
+ | Crop Group | Code |
143
+ |------------|------|
144
+ | Cereal | 0 |
145
+ | Fruit | 1 |
146
+ | Oilseed | 2 |
147
+ | Pulse | 3 |
148
+ | Root | 4 |
149
+ | Vegetable | 5 |
150
+
151
+ </details>
152
+
153
+ <details>
154
+ <summary>Flag Encoding</summary>
155
+
156
+ | Flag | Code | Meaning |
157
+ |------|------|---------|
158
+ | A | 0 | Official figure |
159
+ | E | 1 | Estimated value |
160
+
161
+ </details>
162
+
163
+ <details>
164
+ <summary>Year Encoding</summary>
165
+
166
+ Year values are offset from 1961:
167
+
168
+ ```
169
+ encoded_year = actual_year - 1961
170
+ # e.g. 2020 → 59, 1990 → 29, 1961 → 0
171
+ ```
172
+
173
+ </details>
174
+
175
+ ---
176
+
177
+ ## 📊 Model Details
178
+
179
+ ### Regression Model (`regression_model.pkl`)
180
+ - **Task:** Tabular regression
181
+ - **Target:** Log-transformed crop yield (`log1p(kg/ha)`), back-transformed with `expm1` at inference
182
+ - **Output:** Yield in kg/ha
183
+
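The `log1p` / `expm1` pair is an exact inverse, which is why the back-transform recovers yields on the original kg/ha scale. The round trip below is a small sketch with made-up yield values (not taken from the FAO data) to show the two steps side by side.

```python
import numpy as np

# Made-up yields in kg/ha, for illustration only.
yields_kgha = np.array([0.0, 850.0, 3200.0, 45000.0])

# Training side: the regressor learns log1p(yield). log1p(0) is 0, so zero
# yields stay finite, and large yields are compressed onto a narrower range.
log_target = np.log1p(yields_kgha)

# Inference side: expm1 maps predictions back to kg/ha.
recovered = np.expm1(log_target)

print(np.allclose(recovered, yields_kgha))  # True
```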
184
+ ### Classification Model (`xgboostClassification_model.pkl`)
185
+ - **Task:** Multi-class tabular classification
186
+ - **Framework:** XGBoost
187
+ - **Output classes:** Low (0), Medium (1), High (2)
188
+
189
+ ---
190
+
191
+ ## 📁 Repository Structure
192
+
193
+ ```
194
+ BirendraSharma/cropyq/
195
+ ├── regression_model.pkl # scikit-learn regression model
196
+ ├── xgboostClassification_model.pkl # XGBoost classification model
197
+ └── README.md # This file
198
+ ```
199
+
200
+ ---
201
+
202
+ ## 📜 License
203
+
204
+ This project is licensed under the [MIT License](https://opensource.org/licenses/MIT).
205
+
206
+ ---
207
+
208
+ ## 🙏 Acknowledgements
209
+
210
+ Data sourced from the [FAO (Food and Agriculture Organization of the United Nations)](https://www.fao.org/faostat/) crop production statistics.