title: Crop Disease Detection AI
emoji: 🌱
colorFrom: green
colorTo: yellow
sdk: docker
app_port: 7860
python_version: "3.10"
suggested_hardware: cpu-basic
suggested_storage: small
license: apache-2.0
tags:
  - computer-vision
  - agriculture
  - disease-detection
  - fastapi
  - pytorch
  - gradcam
  - ai
  - deep-learning
  - crop-monitoring

Crop Disease Detection AI 🌱🔍

Advanced Computer Vision System for Agricultural Disease Detection

This folder contains a PyTorch-based deep learning system that detects diseases in crop images. It uses a ResNet50 architecture and provides visual explanations (Grad-CAM and LIME) together with a real-time risk assessment for each prediction.

🚀 Key Features

  • Multi-Crop Disease Detection: Supports Pepper (Bell), Potato, and Tomato crops
  • 15 Disease Classes: Comprehensive coverage of common agricultural diseases
  • Visual AI Explanations: Grad-CAM and LIME explanations for prediction transparency
  • FastAPI Backend: High-performance RESTful API with real-time predictions
  • High Accuracy: 90.09% accuracy on the held-out test set (v3.0 model)
  • Risk Assessment: Automated severity scoring and treatment recommendations
  • Memory Optimized: Multiple model variants for different deployment scenarios
  • Production Ready: Docker support, comprehensive testing, and monitoring

🧠 AI Model Architecture

Core Model: Enhanced ResNet50

  • Base Architecture: Pre-trained ResNet50 on ImageNet with custom classifier head
  • Fine-tuning: Specialized transfer learning for agricultural disease detection
  • Input Specifications: 224x224 RGB images, normalized with ImageNet statistics
  • Output: 15-class disease classification with confidence scores
  • Model Depth: 50 layers with residual connections for stable training
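
The input specification above can be illustrated with a minimal NumPy sketch. It assumes the image has already been resized to 224x224 and loaded as an `H x W x 3` `uint8` RGB array. The function name `normalize_imagenet` is hypothetical and not taken from this repository; the mean and standard deviation are the standard ImageNet channel statistics used with torchvision's pretrained models.

```python
import numpy as np

# Standard ImageNet channel statistics (RGB order).
IMAGENET_MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
IMAGENET_STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)


def normalize_imagenet(image: np.ndarray) -> np.ndarray:
    """Convert an HxWx3 uint8 RGB image to a normalized 3xHxW float32 array."""
    if image.ndim != 3 or image.shape[2] != 3:
        raise ValueError("expected an HxWx3 RGB image")
    scaled = image.astype(np.float32) / 255.0          # [0, 255] -> [0, 1]
    normalized = (scaled - IMAGENET_MEAN) / IMAGENET_STD
    return normalized.transpose(2, 0, 1)               # HWC -> CHW (PyTorch layout)


# A mid-gray 224x224 image stands in for a real, already resized leaf photo.
dummy = np.full((224, 224, 3), 128, dtype=np.uint8)
model_input = normalize_imagenet(dummy)
print(model_input.shape, model_input.dtype)
```

Resizing is left out on purpose; in practice it would usually be done with Pillow or torchvision transforms before this step.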

Advanced Architecture Details

ResNet50 Feature Extractor (frozen/unfrozen)
├── Custom Classifier Head:
│   ├── Dropout(0.5)
│   ├── Linear(2048 → 1024) + BatchNorm + ReLU
│   ├── Dropout(0.3)
│   ├── Linear(1024 → 512) + BatchNorm + ReLU
│   ├── Dropout(0.2)
│   └── Linear(512 → 15) [Output Layer]
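
The classifier head listed above could be written in PyTorch roughly as follows. This is a sketch reconstructed from the layer list, not the code in `src/model.py`; the function name `build_classifier_head` is hypothetical, and `BatchNorm1d` is assumed because the head operates on flat feature vectors.

```python
import torch
from torch import nn


def build_classifier_head(in_features: int = 2048, num_classes: int = 15) -> nn.Sequential:
    """Classifier head mirroring the layer list above (assumed layout)."""
    return nn.Sequential(
        nn.Dropout(0.5),
        nn.Linear(in_features, 1024),
        nn.BatchNorm1d(1024),
        nn.ReLU(inplace=True),
        nn.Dropout(0.3),
        nn.Linear(1024, 512),
        nn.BatchNorm1d(512),
        nn.ReLU(inplace=True),
        nn.Dropout(0.2),
        nn.Linear(512, num_classes),
    )


head = build_classifier_head()
head.eval()  # disable dropout and use BatchNorm running statistics
with torch.no_grad():
    logits = head(torch.zeros(4, 2048))  # 4 pooled ResNet50 feature vectors
print(logits.shape)
```

With torchvision, a head like this would typically replace the backbone's final layer, for example `model.fc = build_classifier_head(model.fc.in_features)` on a `torchvision.models.resnet50` instance.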

Model Versions & Performance

  • v3.0 (Current): Retrained ResNet50 - 90.09% test accuracy
  • v2.0: Enhanced feature extraction - 87.5% accuracy
  • v1.0: Initial baseline model - 85.2% accuracy
  • Lite Variants: Memory-optimized models for edge deployment

📊 Supported Disease Classes

Pepper (Bell) - 2 Classes

  1. Bacterial Spot - Xanthomonas infection
  2. Healthy - No disease detected

Potato - 3 Classes

  1. Early Blight - Alternaria solani
  2. Late Blight - Phytophthora infestans
  3. Healthy - No disease detected

Tomato - 10 Classes

  1. Bacterial Spot - Xanthomonas perforans
  2. Early Blight - Alternaria solani
  3. Late Blight - Phytophthora infestans
  4. Leaf Mold - Passalora fulva
  5. Septoria Leaf Spot - Septoria lycopersici
  6. Spider Mites (Two-spotted) - Tetranychus urticae
  7. Target Spot - Corynespora cassiicola
  8. Yellow Leaf Curl Virus - Begomovirus
  9. Mosaic Virus - Tobacco mosaic virus
  10. Healthy - No disease detected

🔧 Tech Stack

Core AI/ML

  • Deep Learning: PyTorch 2.1.0, TorchVision 0.16.0
  • Computer Vision: OpenCV 4.8.1, PIL (Pillow) 10.0.1
  • Model Architecture: ResNet50 with custom classification head

API & Backend

  • Web Framework: FastAPI 0.104.1 with async support
  • API Documentation: Automatic OpenAPI/Swagger generation
  • CORS Support: Configurable cross-origin resource sharing

AI Explainability

  • Grad-CAM: Gradient-weighted Class Activation Mapping
  • LIME: Local Interpretable Model-agnostic Explanations
  • Custom Visualization: matplotlib, seaborn for result plotting

Data Processing

  • Numerical: NumPy 1.24.3, Pandas 2.0.3
  • Image Processing: Albumentations for augmentation
  • Serialization: JSON, Pickle for model and data handling

📁 Project Structure

diseases_detection_ai/
├── main.py                 # FastAPI application entry point (477 lines)
├── requirements.txt        # Python dependencies and versions
├── README.md               # Comprehensive documentation (405 lines)
├── api/                    # API implementations
│   ├── main.py             # Main API server with full features
│   ├── main_optimized.py   # Memory-optimized API variant
│   ├── Dockerfile          # Container configuration for deployment
│   ├── requirements.txt    # API-specific dependencies
│   └── __init__.py         # Package initialization
├── src/                    # Core AI modules (10 files)
│   ├── model.py            # ResNet50 model architecture (193 lines)
│   ├── model_lite.py       # Lightweight model variants for edge deployment
│   ├── explain.py          # Grad-CAM visual explanation system
│   ├── explain_lite.py     # Optimized explanation for mobile
│   ├── explain_new.py      # Latest explanation implementations
│   ├── dataset.py          # Data loading, preprocessing, and augmentation
│   ├── train.py            # Complete model training pipeline
│   ├── evaluate.py         # Model evaluation and metrics calculation
│   ├── risk_level.py       # Disease severity assessment algorithms
│   └── __init__.py         # Package initialization
├── models/                 # Trained model checkpoints
│   ├── crop_disease_v3_model.pth  # Latest model (v3.0) - Primary
│   ├── crop_disease_v2_model.pth  # Previous stable version
│   ├── crop_disese_v0.pth         # Initial baseline model
│   ├── README.txt          # Model information and usage notes
│   └── .gitattributes      # Git LFS configuration for large files
├── knowledge_base/         # Disease information database
│   └── disease_info.json   # Comprehensive disease database (552 lines)
├── data/                   # Training and test datasets
│   ├── raw/                # Original dataset images
│   └── processed/          # Preprocessed and augmented data
├── notebooks/              # Jupyter analysis and research notebooks
├── outputs/                # Generated visualizations and results
├── tests/                  # Comprehensive testing suite
│   ├── test_model.py       # Model functionality tests
│   ├── test_api.py         # API endpoint testing
│   └── test_explain.py     # Explanation system tests
└── uselessfiles/           # Development artifacts and experimental code

🛠️ Setup Instructions

System Requirements

  • Python: 3.8+ (tested with 3.9, 3.10, 3.11)
  • GPU: CUDA-compatible GPU recommended (NVIDIA RTX series optimal)
  • Memory: 8GB+ RAM (16GB recommended for training)
  • Storage: 2GB+ free space for models and datasets
  • OS: Windows 10/11, Linux (Ubuntu 18.04+), macOS 10.15+

Installation Steps

  1. Environment Setup:

    # Navigate to project directory
    cd diseases_detection_ai
    
    # Create isolated virtual environment
    python -m venv disease_detection_env
    disease_detection_env\Scripts\activate  # Windows
    # source disease_detection_env/bin/activate  # Linux/Mac
    
  2. Install Dependencies:

    # Install all required packages
    pip install -r requirements.txt
    
    # Verify PyTorch installation with CUDA support
    python -c "import torch; print(f'PyTorch: {torch.__version__}, CUDA: {torch.cuda.is_available()}')"
    
  3. Model Preparation:

    # Models are included in the repository
    # Verify model files exist
    dir models\*.pth
    
  4. Test Installation:

    # Quick functionality test
    python -c "from src.model import CropDiseaseResNet50; print('Installation successful!')"
    

Quick Start Guide

  1. Launch API Server:

    # Start FastAPI development server
    python main.py
    
    # Server will start on http://localhost:8000
    # API documentation available at http://localhost:8000/docs
    
  2. Test Disease Detection:

    # Using PowerShell 6.1 or later (-Form builds the multipart/form-data body)
    $response = Invoke-RestMethod -Uri "http://localhost:8000/predict" -Method Post -Form @{ file = Get-Item "test_image.jpg" }
    $response | ConvertTo-Json
    
  3. Alternative API Testing:

    # Using curl (if available)
    curl -X POST "http://localhost:8000/predict" -H "accept: application/json" -H "Content-Type: multipart/form-data" -F "file=@test_crop_image.jpg"
    

🔬 Model Training & Evaluation

Training Dataset Statistics

  • Total Training Samples: 14,440 high-quality crop images
  • Validation Samples: 3,089 images for model validation
  • Test Samples: 3,109 images for final evaluation
  • Image Resolution: Variable (224x224 after preprocessing)
  • Data Augmentation: Rotation, flip, brightness, contrast adjustments
  • Last Training Date: September 9, 2025
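
The sample counts above work out to roughly a 70/15/15 split. The short check below only reuses the numbers listed in this section.

```python
# Sample counts as listed in the dataset statistics above.
splits = {"train": 14_440, "validation": 3_089, "test": 3_109}

total = sum(splits.values())
print(f"total images: {total}")
for name, count in splits.items():
    # Show each split's share of the full dataset as a percentage.
    print(f"{name:<10} {count:>6}  {count / total:6.1%}")
```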

Training Configuration

# Training hyperparameters for v3.0 model
{
    "epochs": 50,
    "batch_size": 32,
    "learning_rate": 0.001,
    "optimizer": "Adam",
    "scheduler": "ReduceLROnPlateau",
    "early_stopping": "patience=7",
    "data_augmentation": True
}
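
To make the `early_stopping` setting concrete, here is a minimal, framework-free sketch of patience-based early stopping. It assumes the monitored quantity is the validation loss; the class name `EarlyStopping` and the `min_delta` parameter are hypothetical, and the logic in `src/train.py` may differ. The `ReduceLROnPlateau` scheduler named in the configuration is available in PyTorch as `torch.optim.lr_scheduler.ReduceLROnPlateau`.

```python
class EarlyStopping:
    """Stop training when the monitored loss has not improved for `patience` epochs."""

    def __init__(self, patience: int = 7, min_delta: float = 0.0) -> None:
        self.patience = patience
        self.min_delta = min_delta
        self.best_loss = float("inf")
        self.epochs_without_improvement = 0

    def step(self, val_loss: float) -> bool:
        """Record one epoch's validation loss; return True if training should stop."""
        if val_loss < self.best_loss - self.min_delta:
            self.best_loss = val_loss
            self.epochs_without_improvement = 0
        else:
            self.epochs_without_improvement += 1
        return self.epochs_without_improvement >= self.patience


# Simulated validation losses: improvement for 3 epochs, then a plateau.
losses = [0.90, 0.60, 0.45] + [0.46] * 10
stopper = EarlyStopping(patience=7)
stopped_at = None
for epoch, loss in enumerate(losses, start=1):
    if stopper.step(loss):
        stopped_at = epoch
        break
print(f"stopped after epoch {stopped_at}, best loss {stopper.best_loss}")
```

In a real training loop the best checkpoint would normally be saved whenever the loss improves, so that stopping late does not cost accuracy.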

Model Performance Metrics

  • Test Accuracy: 90.09% (v3.0)
  • Validation Accuracy: 90.06% (v3.0)
  • Model Size: ~100MB (full model), ~25MB (lite variant)
  • Average Inference Time: <200ms per image on GPU, <800ms on CPU
  • Memory Usage: ~2GB GPU memory (full model), ~500MB (lite model)

Training Commands

# Train new model from scratch
python src\train.py --epochs 50 --batch_size 32 --lr 0.001 --save_best

# Resume training from checkpoint
python src\train.py --resume models\crop_disease_v2_model.pth --epochs 20

# Evaluate existing model
python src\evaluate.py --model_path models\crop_disease_v3_model.pth --test_data data\test

# Generate visual explanations
python src\explain.py --image_path test_images\tomato_blight.jpg --output_dir outputs\

🌐 API Documentation

Core Endpoints

Disease Prediction

POST /predict
Content-Type: multipart/form-data
Parameters:
  - file: image file (JPG, PNG, JPEG)
  - explain: boolean (optional, default: true)
  - confidence_threshold: float (optional, default: 0.7)

Response Example:
{
  "disease": "Tomato___Early_blight",
  "disease_display": "Early Blight",
  "crop": "Tomato",
  "confidence": 0.9456,
  "severity": "High",
  "risk_level": 8.5,
  "symptoms": ["Brown spots with concentric rings", "Yellowing leaves"],
  "treatment": {
    "immediate": ["Remove affected leaves", "Apply fungicide"],
    "preventive": ["Improve air circulation", "Avoid overhead watering"]
  },
  "explanation": {
    "gradcam_regions": "base64_image_data",
    "attention_map": "visualization_data"
  },
  "processing_time": 0.184
}
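
A client might consume this response along the following lines. The sketch uses a trimmed copy of the example above; the helper `summarize` is hypothetical, and treating `confidence_threshold` as a cutoff below which a prediction is flagged is an assumption about how the value is meant to be used.

```python
import json

# Trimmed copy of the response example above.
raw = """
{
  "disease": "Tomato___Early_blight",
  "disease_display": "Early Blight",
  "crop": "Tomato",
  "confidence": 0.9456,
  "severity": "High",
  "risk_level": 8.5
}
"""


def summarize(prediction: dict, confidence_threshold: float = 0.7) -> str:
    """Build a one-line summary, flagging predictions below the threshold."""
    # The class label in the example follows the pattern <crop>___<condition>.
    crop, _, _ = prediction["disease"].partition("___")
    line = (
        f"{crop}: {prediction['disease_display']} "
        f"({prediction['confidence']:.1%}, severity {prediction['severity']})"
    )
    if prediction["confidence"] < confidence_threshold:
        line += " [low confidence]"
    return line


print(summarize(json.loads(raw)))
```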

Batch Prediction

POST /predict/batch
Content-Type: multipart/form-data
Parameters:
  - files: multiple image files
  
Response: Array of prediction objects

Health Check

GET /health
Response: {
  "status": "healthy",
  "model_loaded": true,
  "version": "3.0",
  "gpu_available": true,
  "memory_usage": "1.2GB"
}

Model Information

GET /model/info
Response: {
  "version": "3.0",
  "classes": 15,
  "accuracy": 0.9009,
  "training_date": "2025-09-09",
  "supported_crops": ["Pepper (Bell)", "Potato", "Tomato"]
}

🔍 Visual Explanation System

Grad-CAM Implementation

Gradient-weighted Class Activation Mapping highlights the most important regions:

import torch

from src.explain import CropDiseaseExplainer

# Initialize explainer with trained model
explainer = CropDiseaseExplainer(
    model_path="models/crop_disease_v3_model.pth",
    device="cuda" if torch.cuda.is_available() else "cpu"
)

# Generate explanation for image
explanation = explainer.explain_prediction(
    image_path="test_image.jpg",
    save_path="outputs/explanation.jpg",
    alpha=0.4  # Overlay transparency
)
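
For readers unfamiliar with the technique, the core Grad-CAM computation can be sketched in a few lines of NumPy. This illustrates the general method only and is not the internals of `CropDiseaseExplainer`; the function name `grad_cam` is hypothetical, and obtaining the activations and gradients from the network (for example via hooks on the last convolutional block) is assumed to happen elsewhere.

```python
import numpy as np


def grad_cam(activations: np.ndarray, gradients: np.ndarray) -> np.ndarray:
    """Compute a Grad-CAM heatmap from one layer's activations and gradients.

    Both inputs have shape (channels, height, width): the feature maps of a
    convolutional layer and the gradient of the target class score with
    respect to those feature maps.
    """
    weights = gradients.mean(axis=(1, 2))                  # one weight per channel
    cam = np.tensordot(weights, activations, axes=(0, 0))  # weighted sum -> (H, W)
    cam = np.maximum(cam, 0.0)                             # ReLU keeps positive evidence
    peak = cam.max()
    return cam / peak if peak > 0 else cam                 # scale to [0, 1]


# Toy example: 2 channels on a 2x2 grid.
acts = np.array([[[1.0, 0.0], [0.0, 0.0]],
                 [[0.0, 0.0], [0.0, 2.0]]])
grads = np.array([[[1.0, 1.0], [1.0, 1.0]],        # channel 0 supports the class
                  [[-1.0, -1.0], [-1.0, -1.0]]])   # channel 1 counts against it
heatmap = grad_cam(acts, grads)
print(heatmap)
```

The resulting map is then usually upsampled to the input resolution and blended over the image, which is where an overlay transparency such as `alpha` comes in.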

LIME Integration

Local Interpretable Model-agnostic Explanations for segment-based analysis:

# Generate LIME explanation
lime_explanation = explainer.lime_explanation(
    image_path="test_image.jpg",
    num_samples=1000,
    num_features=100
)

🧪 Testing & Quality Assurance

Automated Testing Suite

# Run complete test suite
python -m pytest tests\ -v --cov=src --cov-report=html

# Run specific test categories
python -m pytest tests\test_model.py -v      # Model functionality
python -m pytest tests\test_api.py -v       # API endpoints
python -m pytest tests\test_explain.py -v   # Explanation system

Manual Testing

# Test model loading and inference
python tests\manual_test_model.py

# Test API with sample images
python tests\manual_test_api.py

# Performance benchmarking
python tests\benchmark_inference.py

Integration Testing

# End-to-end API testing
python tests\integration_test.py --host localhost --port 8000

🚀 Production Deployment

Docker Deployment

# Build optimized container
docker build -t crop-disease-detection-api .\api

# Run with GPU support
docker run --gpus all -p 8000:8000 crop-disease-detection-api

# Run CPU-only version
docker run -p 8000:8000 -e USE_GPU=false crop-disease-detection-api

Environment Configuration

# Production environment variables
$env:ENVIRONMENT = "production"
$env:MODEL_PATH = "models/crop_disease_v3_model.pth"
$env:CONFIDENCE_THRESHOLD = "0.8"
$env:ENABLE_EXPLANATIONS = "true"
$env:MAX_IMAGE_SIZE = "10MB"
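
On the Python side, variables like these could be read into a typed settings object as sketched below. The names `Settings`, `load_settings`, and `parse_size` are hypothetical, the fallback defaults are illustrative, and interpreting `MB` as 1024 * 1024 bytes is an assumption; the actual handling in `main.py` may differ.

```python
import os
from dataclasses import dataclass


def parse_size(text: str) -> int:
    """Convert a size such as '10MB' or '512KB' to a number of bytes."""
    units = {"KB": 1024, "MB": 1024 ** 2, "GB": 1024 ** 3}
    text = text.strip().upper()
    for suffix, factor in units.items():
        if text.endswith(suffix):
            return int(float(text[: -len(suffix)]) * factor)
    return int(text)  # no recognized suffix: treat as a plain byte count


@dataclass(frozen=True)
class Settings:
    environment: str
    model_path: str
    confidence_threshold: float
    enable_explanations: bool
    max_image_bytes: int


def load_settings(env=os.environ) -> Settings:
    """Read the variables listed above, falling back to illustrative defaults."""
    return Settings(
        environment=env.get("ENVIRONMENT", "development"),
        model_path=env.get("MODEL_PATH", "models/crop_disease_v3_model.pth"),
        confidence_threshold=float(env.get("CONFIDENCE_THRESHOLD", "0.7")),
        enable_explanations=env.get("ENABLE_EXPLANATIONS", "true").lower() == "true",
        max_image_bytes=parse_size(env.get("MAX_IMAGE_SIZE", "10MB")),
    )


# A plain dict stands in for the process environment here.
settings = load_settings({"CONFIDENCE_THRESHOLD": "0.8", "MAX_IMAGE_SIZE": "10MB"})
print(settings.confidence_threshold, settings.max_image_bytes)
```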

Production Considerations

  • Load Balancing: Use multiple API instances behind load balancer
  • Monitoring: Implement comprehensive logging and metrics
  • Security: Configure proper CORS, rate limiting, and authentication
  • Performance: Use GPU acceleration and model quantization
  • Scalability: Consider serverless deployment for variable workloads

📈 Performance Optimization

Memory Optimization Strategies

# Use lightweight model for resource-constrained environments
from src.model_lite import TinyDiseaseClassifier

model = TinyDiseaseClassifier(num_classes=15)  # ~5MB model size

Speed Optimization

  • Model Quantization: INT8 quantization for up to 4x smaller weights and faster inference (actual speedup depends on hardware)
  • Batch Processing: Process multiple images simultaneously
  • Async API: Non-blocking request handling
  • Caching: Cache frequent predictions and explanations
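
As an illustration of the caching idea, the sketch below keys cached predictions on a SHA-256 digest of the uploaded image bytes, so identical uploads skip the model call. The class `PredictionCache` and the stand-in `fake_predict` are hypothetical and not part of this repository.

```python
import hashlib
from typing import Callable, Dict


class PredictionCache:
    """Cache predictions keyed by the SHA-256 digest of the image bytes."""

    def __init__(self, predict_fn: Callable[[bytes], dict]) -> None:
        self._predict_fn = predict_fn
        self._store: Dict[str, dict] = {}
        self.hits = 0

    def predict(self, image_bytes: bytes) -> dict:
        key = hashlib.sha256(image_bytes).hexdigest()
        if key in self._store:
            self.hits += 1
            return self._store[key]
        result = self._predict_fn(image_bytes)  # only runs on a cache miss
        self._store[key] = result
        return result


calls = []


def fake_predict(image_bytes: bytes) -> dict:
    """Stand-in for the real model call; records each invocation."""
    calls.append(len(image_bytes))
    return {"disease": "Tomato___Early_blight", "confidence": 0.9456}


cache = PredictionCache(fake_predict)
cache.predict(b"same image")
cache.predict(b"same image")   # served from the cache
cache.predict(b"other image")
print(f"model calls: {len(calls)}, cache hits: {cache.hits}")
```

This cache grows without bound; a long-running service would likely want an eviction policy such as LRU with a size limit.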

Edge Deployment

  • Model Pruning: Remove unnecessary parameters
  • Knowledge Distillation: Train smaller student models
  • ONNX Export: Cross-platform deployment support

🤝 Development Workflow

Contributing Guidelines

  1. Fork Repository: Create personal fork for development
  2. Feature Branch: Create descriptive branch name
  3. Code Standards: Follow PEP 8 and add type hints
  4. Testing: Add comprehensive tests for new features
  5. Documentation: Update README and inline documentation
  6. Pull Request: Submit with detailed description and test results

Code Quality Standards

  • Type Hints: All functions must include type annotations
  • Docstrings: Google-style docstrings for all public methods
  • Testing: Minimum 80% code coverage required
  • Linting: Code must pass flake8 and black formatting

📄 License & Legal

This project is part of the HackBhoomi2025 agricultural intelligence platform. All rights reserved.

Model Attribution

  • Base ResNet50 architecture from torchvision (BSD License)
  • Training dataset: Publicly available agricultural disease datasets
  • Custom modifications and enhancements: HackBhoomi2025 team

🆘 Troubleshooting Guide

Common Issues & Solutions

  1. CUDA Out of Memory Error:

    # Solution: Use lighter model or reduce batch size
    $env:USE_LITE_MODEL = "true"
    $env:BATCH_SIZE = "8"
    
  2. Model Loading Errors:

    # Verify model file integrity
    python -c "import torch; torch.load('models/crop_disease_v3_model.pth', map_location='cpu')"
    
  3. Low Prediction Accuracy:

    • Ensure image quality (minimum 224x224 resolution)
    • Verify crop type is supported (Pepper, Potato, Tomato only)
    • Check image format (JPG, PNG supported)
    • Review confidence threshold settings
  4. API Connection Issues:

    # Check if server is running
    Invoke-RestMethod -Uri "http://localhost:8000/health" -Method Get
    
  5. Dependencies Installation Problems:

    # Clean installation
    pip cache purge
    pip install --no-cache-dir -r requirements.txt
    

Performance Troubleshooting

  • Slow Inference: Enable GPU acceleration, use lite model variant
  • High Memory Usage: Reduce batch size, use memory-optimized model
  • API Timeout: Increase request timeout, optimize image preprocessing

Support & Resources

  • Issue Tracking: GitHub Issues for bug reports and feature requests
  • Documentation: Comprehensive API documentation at /docs
  • Community: HackBhoomi2025 development team for technical support

📊 Project Statistics:

  • Lines of Code: 2,000+ (main application)
  • Model Parameters: 25.6M (ResNet50), 1.2M (Lite variant)
  • Supported Image Formats: JPG, JPEG, PNG
  • API Response Time: <200ms average
  • Model Accuracy: 90.09% test accuracy (v3.0)

Last Updated: September 2025
Model Version: 3.0
API Version: 2.0.0
Documentation Version: 1.5