---
language:
- en
pipeline_tag: image-classification
license: mit
metrics:
- accuracy
library_name: keras
tags:
- biology
---
# 🌱 Plant Seedlings Classification - AI for Smarter Agriculture

## 🧩 Overview
Agriculture remains one of the most vital yet labor-intensive industries. Farmers and agronomists often spend many hours manually identifying seedlings and weeds and assessing crop health.

This model applies **Deep Learning and Computer Vision** to recognize plant species from seedling images automatically, which can speed up early-stage crop monitoring and support precision agriculture.

**Objective**

The aim of this project is to build a **Convolutional Neural Network (CNN)** that classifies plant seedlings into their respective species, reducing manual work and enabling scalable, automated monitoring.

---
## 🤖 Model Details
- **Model Type:** CNN-based Image Classifier
- **Framework:** TensorFlow / Keras
- **Dataset:** Aarhus Plant Seedlings Dataset *(12 plant species)*
- **Data Files:** `images.npy` (image array), `Labels.csv` (species labels)
- **Input:** RGB image of a seedling (128 × 128)
- **Output:** Predicted plant species label
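
The input contract listed above can be sketched in a few lines of NumPy. This is a minimal sketch rather than the project's actual preprocessing: `to_model_input` is a hypothetical helper, it assumes the image has already been resized to 128 × 128, and the 1/255 scaling mirrors the `SCALE` constant in the usage example further down.

```python
import numpy as np

IMAGE_SIZE = (128, 128)  # height, width stated in the model details
NUM_CHANNELS = 3         # RGB


def to_model_input(img) -> np.ndarray:
    """Turn one 128 x 128 x 3 image into a (1, 128, 128, 3) float32 batch."""
    arr = np.asarray(img)
    expected = (*IMAGE_SIZE, NUM_CHANNELS)
    if arr.shape != expected:
        raise ValueError(f"expected shape {expected}, got {arr.shape}")
    arr = arr.astype("float32")
    # Assumption: raw pixels are 0-255; skip rescaling if already in [0, 1].
    if arr.max() > 1.0:
        arr = arr / 255.0
    # Add the leading batch dimension that Keras models expect.
    return arr[np.newaxis, ...]


# Synthetic all-white image, used only to demonstrate the shapes.
dummy = np.full((128, 128, 3), 255, dtype=np.uint8)
batch = to_model_input(dummy)
print(batch.shape, batch.dtype)
```
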
---
## Data Dictionary
| Feature / File | Description |
|----------------|-------------|
| `images.npy` | NumPy array containing all seedling images |
| `Labels.csv` | CSV file containing the label for each image |
| `label` | Column in `Labels.csv` specifying the plant species name |
| Image shape | 128 × 128 × 3 (RGB) |
| Classes | 12 species |
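
Before training or inference it can help to check the class balance in `Labels.csv`. The sketch below uses only the standard library and a small in-memory stand-in for the file. The sample rows and the helper name `count_labels` are hypothetical, and it assumes the species column is named `label` as described in the table.

```python
import csv
import io
from collections import Counter

# Hypothetical stand-in for Labels.csv; the real file has one row per image.
SAMPLE_LABELS_CSV = """label
Maize
Charlock
Maize
Fat Hen
"""


def count_labels(csv_text: str, column: str = "label") -> Counter:
    """Count how many rows belong to each species."""
    reader = csv.DictReader(io.StringIO(csv_text))
    if reader.fieldnames is None or column not in reader.fieldnames:
        raise KeyError(f"column {column!r} not found in CSV header")
    return Counter(row[column] for row in reader)


counts = count_labels(SAMPLE_LABELS_CSV)
for species, n in counts.most_common():
    print(f"{species}: {n}")
```

With the real file, the same function could be called on the text of `Labels.csv`, and the total count compared against the first dimension of `images.npy`.
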
---
## 🌿 List of Species
- Black-grass
- Charlock
- Cleavers
- Common Chickweed
- Common Wheat
- Fat Hen
- Loose Silky-bent
- Maize
- Scentless Mayweed
- Shepherds Purse
- Small-flowered Cranesbill
- Sugar beet
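
The species list maps naturally onto class indices. The sketch below assumes the alphabetical ordering that the usage example derives with `sorted(...)`, which may differ from the encoding actually used in training. The probability vector is made up for illustration, and `top_k` is a hypothetical helper.

```python
SPECIES = [
    "Black-grass", "Charlock", "Cleavers", "Common Chickweed",
    "Common Wheat", "Fat Hen", "Loose Silky-bent", "Maize",
    "Scentless Mayweed", "Shepherds Purse", "Small-flowered Cranesbill",
    "Sugar beet",
]

# Assumption: class index i corresponds to the i-th name in sorted order.
CLASS_NAMES = sorted(SPECIES)


def top_k(probs, k=3):
    """Return the k most probable (species, probability) pairs."""
    if len(probs) != len(CLASS_NAMES):
        raise ValueError("probability vector must have one entry per class")
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(CLASS_NAMES[i], probs[i]) for i in ranked[:k]]


# Made-up model output: most of the mass on index 7, some on index 1.
probs = [0.01] * 12
probs[7] = 0.80
probs[1] = 0.10

for name, p in top_k(probs, k=2):
    print(f"{name}: {p:.2f}")
```
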
---
## 🌿 Problem Context
Agriculture is in urgent need of modernization: checking whether plants are growing correctly still demands a great deal of manual work. Despite advances in agricultural technology, workers must still sort and recognize different plants and weeds by hand, which costs considerable time and effort.

This trillion-dollar industry stands to benefit greatly from technology that reduces the need for manual labor, and this is where **Artificial Intelligence** can help. Faster and more accurate seedling identification can lead to better crop yields, free workers to focus on higher-order decision-making, and promote more sustainable agricultural practices.

---
## Why It Matters
- ✅ Reduces manual effort in plant identification
- ✅ Boosts accuracy and consistency in seedling detection
- ✅ Supports sustainable, data-driven agriculture
- ✅ Enables automation in large-scale farming and greenhouse monitoring


**Full Source Notebook:**
The complete training and evaluation notebook is available on GitHub:
[View on GitHub](https://github.com/joyjitroy/Machine_Learning/blob/main/Plant_Seeding_Classification_using_CNN.ipynb)

## Citation
If you use this model or reference it in your research, please cite:

Joyjit, R. (2024). Plant Seedlings Classification - AI for Smarter Agriculture. Zenodo. https://doi.org/10.5281/zenodo.17586747

---
## Example Usage
```python
# Inference example aligned to the dataset schema described above
# Files: Data/images.npy (array of shape (N, H, W, 3)), Data/Labels.csv (columns: filename,label)
import numpy as np
import pandas as pd
from tensorflow.keras.models import load_model
from tensorflow.keras.preprocessing import image
from pathlib import Path
# ---- Paths & constants (match your training setup) ----
IMAGES_NPY = "Data/images.npy" # preprocessed array of images
LABELS_CSV = "Data/Labels.csv" # has a 'label' column with species names
MODEL_PATH = "plant_seedlings_cnn.keras" # or "plant_seedlings_model.h5"
IMAGE_SIZE = (128, 128) # use the SAME size you trained with
SCALE = 1.0 / 255.0 # same normalization as training
# ---- Load model and class names (from Labels.csv) ----
model = load_model(MODEL_PATH)
labels_df = pd.read_csv(LABELS_CSV)
# Get a stable, reproducible class order (sorted unique labels).
# This order must match the label encoding used during training.
class_names = sorted(labels_df["label"].unique())
class_to_idx = {c: i for i, c in enumerate(class_names)}
idx_to_class = {i: c for c, i in class_to_idx.items()}
# ---------- Option A: Predict on an external image file ----------
def predict_image(img_path: str) -> tuple[str, float]:
    """Predict species from a raw image file."""
    img = image.load_img(img_path, target_size=IMAGE_SIZE)
    x = image.img_to_array(img) * SCALE
    x = np.expand_dims(x, axis=0)          # (1, H, W, 3)
    prob = model.predict(x, verbose=0)[0]  # shape: (num_classes,)
    pred_idx = int(np.argmax(prob))
    return idx_to_class[pred_idx], float(prob[pred_idx])

species, conf = predict_image("seedling.jpg")
print(f"Predicted species: {species} | confidence: {conf:.4f}")
# ---------- Option B: Predict from your prepacked dataset ----------
# Assumes images.npy already contains images at IMAGE_SIZE and same scaling is required.
# If images.npy is raw uint8, multiply by SCALE; if already scaled, set SCALE=1.0 above.
images = np.load(IMAGES_NPY) # expected shape: (N, H, W, 3)
if images.max() > 1.0:
    images = images * SCALE
# Example: predict the first sample in the npy file
sample = np.expand_dims(images[0], axis=0) # (1, H, W, 3)
prob = model.predict(sample, verbose=0)[0]
pred_idx = int(np.argmax(prob))
print(f"[Dataset sample 0] Predicted: {idx_to_class[pred_idx]} | confidence: {prob[pred_idx]:.4f}")
```