---
language:
- en
pipeline_tag: image-classification
license: mit
metrics:
- accuracy
library_name: keras
tags:
- biology
---
# 🌱 Plant Seedlings Classification — AI for Smarter Agriculture  

![Model1_Thumbnail](https://cdn-uploads.huggingface.co/production/uploads/68faf2c58b9b8d06b47b769c/BF9Vrz9q__J0tNIotUuhQ.png)

## 🧩 Overview  
Agriculture remains one of the most vital yet labor-intensive industries. Farmers and agronomists often spend countless hours manually identifying seedlings and weeds and assessing crop health.  
This model leverages **Deep Learning and Computer Vision** to automatically recognize plant species from seedling images, helping accelerate early-stage crop monitoring and enabling precision agriculture.  

**Objective**  
The aim of this project is to build a **Convolutional Neural Network (CNN)** to classify plant seedlings into their respective categories,  
reducing manual work and enabling scalable, automated monitoring.

---

## 🤖 Model Details  
- **Model Type:** CNN-based Image Classifier  
- **Framework:** TensorFlow / Keras  
- **Dataset:** Aarhus Plant Seedlings Dataset *(12 plant species)*  
- **Data Files:** `images.npy` (image array), `Labels.csv` (species labels)  
- **Input:** RGB image of a seedling (128 × 128)  
- **Output:** Predicted plant species label  

---

## 📊 Data Dictionary  
| Feature / File        | Description                                                                 |
|-----------------------|-----------------------------------------------------------------------------|
| `images.npy`          | NumPy array containing all seedling images                                  |
| `Labels.csv`          | CSV file containing labels for each image                                   |
| `label`               | Column specifying plant species name                                        |
| Image Shape           | 128 × 128 × 3 RGB                                                           |
| Classes               | 12 species                                                                  |
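
As a quick sanity check before training or inference, the shapes in the table above can be verified against the loaded files. The sketch below is a minimal illustration and is not taken from the original notebook: `check_dataset` and `class_counts` are hypothetical helper names, and the expected values are simply the ones listed in the table.

```python
from collections import Counter

EXPECTED_IMAGE_SHAPE = (128, 128, 3)  # height, width, RGB channels (from the table above)
EXPECTED_NUM_CLASSES = 12             # species count (from the table above)


def check_dataset(images_shape, labels):
    """Return a list of problems; an empty list means the data looks consistent."""
    problems = []
    if len(images_shape) != 4 or tuple(images_shape[1:]) != EXPECTED_IMAGE_SHAPE:
        problems.append(f"expected images of shape (N, 128, 128, 3), got {tuple(images_shape)}")
    elif images_shape[0] != len(labels):
        problems.append(f"{images_shape[0]} images but {len(labels)} labels")
    num_classes = len(set(labels))
    if num_classes != EXPECTED_NUM_CLASSES:
        problems.append(f"expected {EXPECTED_NUM_CLASSES} classes, found {num_classes}")
    return problems


def class_counts(labels):
    """Count images per species, most frequent first, to spot class imbalance."""
    return Counter(labels).most_common()
```

With the real files, this could be called as `check_dataset(images.shape, labels_df["label"].tolist())` after loading `images.npy` with NumPy and `Labels.csv` with pandas.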

---

## 🌿 List of Species  
- Black-grass  
- Charlock  
- Cleavers  
- Common Chickweed  
- Common Wheat  
- Fat Hen  
- Loose Silky-bent  
- Maize  
- Scentless Mayweed  
- Shepherds Purse  
- Small-flowered Cranesbill  
- Sugar beet
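
The example usage further down derives class indices from the sorted unique labels, and the list above is already in that sorted order. A minimal sketch of the resulting index-to-species mapping follows. It assumes the model was trained with labels encoded in sorted order, which this card does not confirm, and `decode_prediction` is a hypothetical helper name.

```python
SPECIES = [
    "Black-grass", "Charlock", "Cleavers", "Common Chickweed",
    "Common Wheat", "Fat Hen", "Loose Silky-bent", "Maize",
    "Scentless Mayweed", "Shepherds Purse", "Small-flowered Cranesbill",
    "Sugar beet",
]

# Sorting reproduces the class order used in the example usage (sorted unique labels).
CLASS_NAMES = sorted(SPECIES)
IDX_TO_CLASS = dict(enumerate(CLASS_NAMES))


def decode_prediction(probabilities):
    """Map a 12-element probability vector to (species, confidence)."""
    if len(probabilities) != len(CLASS_NAMES):
        raise ValueError(f"expected {len(CLASS_NAMES)} probabilities, got {len(probabilities)}")
    # Index of the largest probability (a plain-Python argmax).
    best = max(range(len(probabilities)), key=lambda i: probabilities[i])
    return IDX_TO_CLASS[best], float(probabilities[best])


# Illustrative vector only (not real model output): most mass on index 7.
probs = [0.0] * 12
probs[7] = 0.9
probs[0] = 0.1
print(decode_prediction(probs))  # ('Maize', 0.9)
```

If training used a different label order, the mapping should be rebuilt from the encoder saved with the model rather than from sorted names.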

---

## 🌿 Problem Context  
Agriculture urgently needs modernizing: checking whether plants are growing correctly still demands extensive manual work.  
Despite advances in agricultural technology, workers must still sort and recognize different plants and weeds by hand, which costs considerable time and effort.  

This trillion-dollar industry is ripe for technological innovations that reduce the need for manual labor. This is where **Artificial Intelligence** can benefit the sector.  
Faster and more accurate seedling identification can lead to better crop yields, allow workers to focus on higher-order decision-making, and promote more sustainable agricultural practices.

---

## 🌟 Why It Matters  
✅ Reduces manual effort in plant identification  
✅ Boosts accuracy and consistency in seedling identification  
✅ Supports sustainable, data-driven agriculture  
✅ Enables automation in large-scale farming and greenhouse monitoring  

📘 **Full Source Notebook:**  
The complete training and evaluation notebook is available on GitHub:  
👉 [View on GitHub](https://github.com/joyjitroy/Machine_Learning/blob/main/Plant_Seeding_Classification_using_CNN.ipynb)

📘 **Citation**  
If you use this model or reference it in your research, please cite:

Joyjit, R. (2024). Plant Seedlings Classification — AI for Smarter Agriculture. Zenodo. https://doi.org/10.5281/zenodo.17586747

---

## 🚀 Example Usage  
```python
# Inference example aligned to the dataset schema described above
# Files: Data/images.npy with shape (N, H, W, 3), Data/Labels.csv with a 'label' column

import numpy as np
import pandas as pd
from tensorflow.keras.models import load_model
from tensorflow.keras.preprocessing import image

# ---- Paths & constants (match your training setup) ----
IMAGES_NPY = "Data/images.npy"     # preprocessed array of images
LABELS_CSV = "Data/Labels.csv"     # has a 'label' column with species names
MODEL_PATH = "plant_seedlings_cnn.keras"  # or "plant_seedlings_model.h5"
IMAGE_SIZE = (128, 128)            # use the SAME size you trained with
SCALE = 1.0 / 255.0                # same normalization as training

# ---- Load model and class names (from Labels.csv) ----
model = load_model(MODEL_PATH)

labels_df = pd.read_csv(LABELS_CSV)
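# Get a stable, reproducible class order (sorted unique labels).
# This must match the label order used during training. Sorted order is what
# scikit-learn's LabelEncoder/LabelBinarizer produce, but verify against the notebook.
class_names = sorted(labels_df["label"].unique())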
class_to_idx = {c: i for i, c in enumerate(class_names)}
idx_to_class = {i: c for c, i in class_to_idx.items()}

# ---------- Option A: Predict on an external image file ----------
def predict_image(img_path: str) -> tuple[str, float]:
    """Predict species from a raw image file."""
    img = image.load_img(img_path, target_size=IMAGE_SIZE)
    x = image.img_to_array(img) * SCALE
    x = np.expand_dims(x, axis=0)  # (1, H, W, 3)
    prob = model.predict(x, verbose=0)[0]         # shape: (num_classes,)
    pred_idx = int(np.argmax(prob))
    return idx_to_class[pred_idx], float(prob[pred_idx])

species, conf = predict_image("seedling.jpg")
print(f"Predicted species: {species}  |  confidence: {conf:.4f}")

# ---------- Option B: Predict from your prepacked dataset ----------
# Assumes images.npy already holds images at IMAGE_SIZE.
# Raw uint8 data (values up to 255) is rescaled below; data already in [0, 1] is left as-is.
images = np.load(IMAGES_NPY)       # expected shape: (N, H, W, 3)
if images.max() > 1.0:
    images = images.astype("float32") * SCALE

# Example: predict the first sample in the npy file
sample = np.expand_dims(images[0], axis=0)        # (1, H, W, 3)
prob = model.predict(sample, verbose=0)[0]
pred_idx = int(np.argmax(prob))
print(f"[Dataset sample 0] Predicted: {idx_to_class[pred_idx]}  |  confidence: {prob[pred_idx]:.4f}")
```