crop_ai_diseases / README.md
vivek12coder's picture
Upload 20960 files
c8df794 verified
---
title: Crop Disease Detection AI
emoji: 🌱
colorFrom: green
colorTo: yellow
sdk: docker
app_port: 7860
python_version: 3.10
suggested_hardware: cpu-basic
suggested_storage: small
license: apache-2.0
tags:
- computer-vision
- agriculture
- disease-detection
- fastapi
- pytorch
- gradcam
- ai
- deep-learning
- crop-monitoring
---
# Crop Disease Detection AI πŸŒ±πŸ”
> **Advanced Computer Vision System for Agricultural Disease Detection**
This folder contains a state-of-the-art PyTorch-based deep learning system for detecting diseases in crop images using ResNet50 architecture with comprehensive visual explanations and real-time risk assessment.
## πŸš€ Key Features
- **Multi-Crop Disease Detection**: Supports Pepper (Bell), Potato, and Tomato crops
- **15 Disease Classes**: Comprehensive coverage of common agricultural diseases
- **Visual AI Explanations**: Grad-CAM and LIME explanations for prediction transparency
- **FastAPI Backend**: High-performance RESTful API with real-time predictions
- **High Accuracy**: 90.09% test accuracy on validation dataset (v3.0 model)
- **Risk Assessment**: Automated severity scoring and treatment recommendations
- **Memory Optimized**: Multiple model variants for different deployment scenarios
- **Production Ready**: Docker support, comprehensive testing, and monitoring
## 🧠 AI Model Architecture
### Core Model: Enhanced ResNet50
- **Base Architecture**: Pre-trained ResNet50 on ImageNet with custom classifier head
- **Fine-tuning**: Specialized transfer learning for agricultural disease detection
- **Input Specifications**: 224x224 RGB images, normalized with ImageNet statistics
- **Output**: 15-class disease classification with confidence scores
- **Model Depth**: 50 layers with residual connections for stable training
### Advanced Architecture Details
```
ResNet50 Feature Extractor (frozen/unfrozen)
β”œβ”€β”€ Custom Classifier Head:
β”‚ β”œβ”€β”€ Dropout(0.5)
β”‚ β”œβ”€β”€ Linear(2048 β†’ 1024) + BatchNorm + ReLU
β”‚ β”œβ”€β”€ Dropout(0.3)
β”‚ β”œβ”€β”€ Linear(1024 β†’ 512) + BatchNorm + ReLU
β”‚ β”œβ”€β”€ Dropout(0.2)
β”‚ └── Linear(512 β†’ 15) [Output Layer]
```
### Model Versions & Performance
- **v3.0** (Current): Retrained ResNet50 - 90.09% test accuracy
- **v2.0**: Enhanced feature extraction - 87.5% accuracy
- **v1.0**: Initial baseline model - 85.2% accuracy
- **Lite Variants**: Memory-optimized models for edge deployment
## πŸ“Š Supported Disease Classes
### Pepper (Bell) - 2 Classes
1. **Bacterial Spot** - Xanthomonas infection
2. **Healthy** - No disease detected
### Potato - 3 Classes
1. **Early Blight** - Alternaria solani
2. **Late Blight** - Phytophthora infestans
3. **Healthy** - No disease detected
### Tomato - 10 Classes
1. **Bacterial Spot** - Xanthomonas perforans
2. **Early Blight** - Alternaria solani
3. **Late Blight** - Phytophthora infestans
4. **Leaf Mold** - Passalora fulva
5. **Septoria Leaf Spot** - Septoria lycopersici
6. **Spider Mites (Two-spotted)** - Tetranychus urticae
7. **Target Spot** - Corynespora cassiicola
8. **Yellow Leaf Curl Virus** - Begomovirus
9. **Mosaic Virus** - Tobacco mosaic virus
10. **Healthy** - No disease detected
## πŸ”§ Tech Stack
### Core AI/ML
- **Deep Learning**: PyTorch 2.1.0, TorchVision 0.16.0
- **Computer Vision**: OpenCV 4.8.1, PIL (Pillow) 10.0.1
- **Model Architecture**: ResNet50 with custom classification head
### API & Backend
- **Web Framework**: FastAPI 0.104.1 with async support
- **API Documentation**: Automatic OpenAPI/Swagger generation
- **CORS Support**: Configurable cross-origin resource sharing
### AI Explainability
- **Grad-CAM**: Gradient-weighted Class Activation Mapping
- **LIME**: Local Interpretable Model-agnostic Explanations
- **Custom Visualization**: matplotlib, seaborn for result plotting
### Data Processing
- **Numerical**: NumPy 1.24.3, Pandas 2.0.3
- **Image Processing**: Albumentations for augmentation
- **Serialization**: JSON, Pickle for model and data handling
## πŸ“ Project Structure
```
diseases_detection_ai/
β”œβ”€β”€ main.py # FastAPI application entry point (477 lines)
β”œβ”€β”€ requirements.txt # Python dependencies and versions
β”œβ”€β”€ README.md # Comprehensive documentation (405 lines)
β”œβ”€β”€ api/ # API implementations
β”‚ β”œβ”€β”€ main.py # Main API server with full features
β”‚ β”œβ”€β”€ main_optimized.py # Memory-optimized API variant
β”‚ β”œβ”€β”€ Dockerfile # Container configuration for deployment
β”‚ β”œβ”€β”€ requirements.txt # API-specific dependencies
β”‚ └── __init__.py # Package initialization
β”œβ”€β”€ src/ # Core AI modules (10 files)
β”‚ β”œβ”€β”€ model.py # ResNet50 model architecture (193 lines)
β”‚ β”œβ”€β”€ model_lite.py # Lightweight model variants for edge deployment
β”‚ β”œβ”€β”€ explain.py # Grad-CAM visual explanation system
β”‚ β”œβ”€β”€ explain_lite.py # Optimized explanation for mobile
β”‚ β”œβ”€β”€ explain_new.py # Latest explanation implementations
β”‚ β”œβ”€β”€ dataset.py # Data loading, preprocessing, and augmentation
β”‚ β”œβ”€β”€ train.py # Complete model training pipeline
β”‚ β”œβ”€β”€ evaluate.py # Model evaluation and metrics calculation
β”‚ β”œβ”€β”€ risk_level.py # Disease severity assessment algorithms
β”‚ └── __init__.py # Package initialization
β”œβ”€β”€ models/ # Trained model checkpoints
β”‚ β”œβ”€β”€ crop_disease_v3_model.pth # Latest model (v3.0) - Primary
β”‚ β”œβ”€β”€ crop_disease_v2_model.pth # Previous stable version
β”‚ β”œβ”€β”€ crop_disese_v0.pth # Initial baseline model
β”‚ β”œβ”€β”€ README.txt # Model information and usage notes
β”‚ └── .gitattributes # Git LFS configuration for large files
β”œβ”€β”€ knowledge_base/ # Disease information database
β”‚ └── disease_info.json # Comprehensive disease database (552 lines)
β”œβ”€β”€ data/ # Training and test datasets
β”‚ β”œβ”€β”€ raw/ # Original dataset images
β”‚ └── processed/ # Preprocessed and augmented data
β”œβ”€β”€ notebooks/ # Jupyter analysis and research notebooks
β”œβ”€β”€ outputs/ # Generated visualizations and results
β”œβ”€β”€ tests/ # Comprehensive testing suite
β”‚ β”œβ”€β”€ test_model.py # Model functionality tests
β”‚ β”œβ”€β”€ test_api.py # API endpoint testing
β”‚ └── test_explain.py # Explanation system tests
└── uselessfiles/ # Development artifacts and experimental code
```
## πŸ› οΈ Setup Instructions
### System Requirements
- **Python**: 3.8+ (tested with 3.9, 3.10, 3.11)
- **GPU**: CUDA-compatible GPU recommended (NVIDIA RTX series optimal)
- **Memory**: 8GB+ RAM (16GB recommended for training)
- **Storage**: 2GB+ free space for models and datasets
- **OS**: Windows 10/11, Linux (Ubuntu 18.04+), macOS 10.15+
### Installation Steps
1. **Environment Setup**:
```powershell
# Navigate to project directory
cd diseases_detection_ai
# Create isolated virtual environment
python -m venv disease_detection_env
disease_detection_env\Scripts\activate # Windows
# source disease_detection_env/bin/activate # Linux/Mac
```
2. **Install Dependencies**:
```powershell
# Install all required packages
pip install -r requirements.txt
# Verify PyTorch installation with CUDA support
python -c "import torch; print(f'PyTorch: {torch.__version__}, CUDA: {torch.cuda.is_available()}')"
```
3. **Model Preparation**:
```powershell
# Models are included in the repository
# Verify model files exist
dir models\*.pth
```
4. **Test Installation**:
```powershell
# Quick functionality test
python -c "from src.model import CropDiseaseResNet50; print('Installation successful!')"
```
### Quick Start Guide
1. **Launch API Server**:
```powershell
# Start FastAPI development server
python main.py
# Server will start on http://localhost:8000
# API documentation available at http://localhost:8000/docs
```
2. **Test Disease Detection**:
```powershell
# Using PowerShell with Invoke-RestMethod
$response = Invoke-RestMethod -Uri "http://localhost:8000/predict" -Method Post -InFile "test_image.jpg" -ContentType "multipart/form-data"
$response | ConvertTo-Json
```
3. **Alternative API Testing**:
```powershell
# Using curl (if available)
curl -X POST "http://localhost:8000/predict" -H "accept: application/json" -H "Content-Type: multipart/form-data" -F "file=@test_crop_image.jpg"
```
## πŸ”¬ Model Training & Evaluation
### Training Dataset Statistics
- **Total Training Samples**: 14,440 high-quality crop images
- **Validation Samples**: 3,089 images for model validation
- **Test Samples**: 3,109 images for final evaluation
- **Image Resolution**: Variable (224x224 after preprocessing)
- **Data Augmentation**: Rotation, flip, brightness, contrast adjustments
- **Last Training Date**: September 9, 2025
### Training Configuration
```python
# Training hyperparameters for v3.0 model
{
"epochs": 50,
"batch_size": 32,
"learning_rate": 0.001,
"optimizer": "Adam",
"scheduler": "ReduceLROnPlateau",
"early_stopping": "patience=7",
"data_augmentation": True
}
```
### Model Performance Metrics
- **Test Accuracy**: 90.09% (v3.0)
- **Validation Accuracy**: 90.06% (v3.0)
- **Model Size**: ~100MB (full model), ~25MB (lite variant)
- **Average Inference Time**: <200ms per image on GPU, <800ms on CPU
- **Memory Usage**: ~2GB GPU memory (full model), ~500MB (lite model)
### Training Commands
```powershell
# Train new model from scratch
python src\train.py --epochs 50 --batch_size 32 --lr 0.001 --save_best
# Resume training from checkpoint
python src\train.py --resume models\crop_disease_v2_model.pth --epochs 20
# Evaluate existing model
python src\evaluate.py --model_path models\crop_disease_v3_model.pth --test_data data\test
# Generate visual explanations
python src\explain.py --image_path test_images\tomato_blight.jpg --output_dir outputs\
```
## 🌐 API Documentation
### Core Endpoints
#### Disease Prediction
```http
POST /predict
Content-Type: multipart/form-data
Parameters:
- file: image file (JPG, PNG, JPEG)
- explain: boolean (optional, default: true)
- confidence_threshold: float (optional, default: 0.7)
Response Example:
{
"disease": "Tomato___Early_blight",
"disease_display": "Early Blight",
"crop": "Tomato",
"confidence": 0.9456,
"severity": "High",
"risk_level": 8.5,
"symptoms": ["Brown spots with concentric rings", "Yellowing leaves"],
"treatment": {
"immediate": ["Remove affected leaves", "Apply fungicide"],
"preventive": ["Improve air circulation", "Avoid overhead watering"]
},
"explanation": {
"gradcam_regions": "base64_image_data",
"attention_map": "visualization_data"
},
"processing_time": 0.184
}
```
#### Batch Prediction
```http
POST /predict/batch
Content-Type: multipart/form-data
Parameters:
- files: multiple image files
Response: Array of prediction objects
```
#### Health Check
```http
GET /health
Response: {
"status": "healthy",
"model_loaded": true,
"version": "3.0",
"gpu_available": true,
"memory_usage": "1.2GB"
}
```
#### Model Information
```http
GET /model/info
Response: {
"version": "3.0",
"classes": 15,
"accuracy": 0.9009,
"training_date": "2025-09-09",
"supported_crops": ["Pepper (Bell)", "Potato", "Tomato"]
}
```
## πŸ” Visual Explanation System
### Grad-CAM Implementation
Gradient-weighted Class Activation Mapping highlights the most important regions:
```python
from src.explain import CropDiseaseExplainer
# Initialize explainer with trained model
explainer = CropDiseaseExplainer(
model_path="models/crop_disease_v3_model.pth",
device="cuda" if torch.cuda.is_available() else "cpu"
)
# Generate explanation for image
explanation = explainer.explain_prediction(
image_path="test_image.jpg",
save_path="outputs/explanation.jpg",
alpha=0.4 # Overlay transparency
)
```
### LIME Integration
Local Interpretable Model-agnostic Explanations for segment-based analysis:
```python
# Generate LIME explanation
lime_explanation = explainer.lime_explanation(
image_path="test_image.jpg",
num_samples=1000,
num_features=100
)
```
## πŸ§ͺ Testing & Quality Assurance
### Automated Testing Suite
```powershell
# Run complete test suite
python -m pytest tests\ -v --cov=src --cov-report=html
# Run specific test categories
python -m pytest tests\test_model.py -v # Model functionality
python -m pytest tests\test_api.py -v # API endpoints
python -m pytest tests\test_explain.py -v # Explanation system
```
### Manual Testing
```powershell
# Test model loading and inference
python tests\manual_test_model.py
# Test API with sample images
python tests\manual_test_api.py
# Performance benchmarking
python tests\benchmark_inference.py
```
### Integration Testing
```powershell
# End-to-end API testing
python tests\integration_test.py --host localhost --port 8000
```
## πŸš€ Production Deployment
### Docker Deployment
```powershell
# Build optimized container
docker build -t crop-disease-detection-api .\api
# Run with GPU support
docker run --gpus all -p 8000:8000 crop-disease-detection-api
# Run CPU-only version
docker run -p 8000:8000 -e USE_GPU=false crop-disease-detection-api
```
### Environment Configuration
```powershell
# Production environment variables
$env:ENVIRONMENT = "production"
$env:MODEL_PATH = "models/crop_disease_v3_model.pth"
$env:CONFIDENCE_THRESHOLD = "0.8"
$env:ENABLE_EXPLANATIONS = "true"
$env:MAX_IMAGE_SIZE = "10MB"
```
### Production Considerations
- **Load Balancing**: Use multiple API instances behind load balancer
- **Monitoring**: Implement comprehensive logging and metrics
- **Security**: Configure proper CORS, rate limiting, and authentication
- **Performance**: Use GPU acceleration and model quantization
- **Scalability**: Consider serverless deployment for variable workloads
## πŸ“ˆ Performance Optimization
### Memory Optimization Strategies
```python
# Use lightweight model for resource-constrained environments
from src.model_lite import TinyDiseaseClassifier
model = TinyDiseaseClassifier(num_classes=15) # ~5MB model size
```
### Speed Optimization
- **Model Quantization**: INT8 quantization for 4x speed improvement
- **Batch Processing**: Process multiple images simultaneously
- **Async API**: Non-blocking request handling
- **Caching**: Cache frequent predictions and explanations
### Edge Deployment
- **Model Pruning**: Remove unnecessary parameters
- **Knowledge Distillation**: Train smaller student models
- **ONNX Export**: Cross-platform deployment support
## 🀝 Development Workflow
### Contributing Guidelines
1. **Fork Repository**: Create personal fork for development
2. **Feature Branch**: Create descriptive branch name
3. **Code Standards**: Follow PEP 8 and add type hints
4. **Testing**: Add comprehensive tests for new features
5. **Documentation**: Update README and inline documentation
6. **Pull Request**: Submit with detailed description and test results
### Code Quality Standards
- **Type Hints**: All functions must include type annotations
- **Docstrings**: Google-style docstrings for all public methods
- **Testing**: Minimum 80% code coverage required
- **Linting**: Code must pass flake8 and black formatting
## πŸ“„ License & Legal
This project is part of the HackBhoomi2025 agricultural intelligence platform. All rights reserved.
### Model Attribution
- Base ResNet50 architecture from torchvision (BSD License)
- Training dataset: Publicly available agricultural disease datasets
- Custom modifications and enhancements: HackBhoomi2025 team
## πŸ†˜ Troubleshooting Guide
### Common Issues & Solutions
1. **CUDA Out of Memory Error**:
```powershell
# Solution: Use lighter model or reduce batch size
$env:USE_LITE_MODEL = "true"
$env:BATCH_SIZE = "8"
```
2. **Model Loading Errors**:
```powershell
# Verify model file integrity
python -c "import torch; torch.load('models/crop_disease_v3_model.pth', map_location='cpu')"
```
3. **Low Prediction Accuracy**:
- Ensure image quality (minimum 224x224 resolution)
- Verify crop type is supported (Pepper, Potato, Tomato only)
- Check image format (JPG, PNG supported)
- Review confidence threshold settings
4. **API Connection Issues**:
```powershell
# Check if server is running
Invoke-RestMethod -Uri "http://localhost:8000/health" -Method Get
```
5. **Dependencies Installation Problems**:
```powershell
# Clean installation
pip cache purge
pip install --no-cache-dir -r requirements.txt
```
### Performance Troubleshooting
- **Slow Inference**: Enable GPU acceleration, use lite model variant
- **High Memory Usage**: Reduce batch size, use memory-optimized model
- **API Timeout**: Increase request timeout, optimize image preprocessing
### Support & Resources
- **Issue Tracking**: GitHub Issues for bug reports and feature requests
- **Documentation**: Comprehensive API documentation at `/docs`
- **Community**: HackBhoomi2025 development team for technical support
---
**πŸ“Š Project Statistics:**
- **Lines of Code**: 2,000+ (main application)
- **Model Parameters**: 25.6M (ResNet50), 1.2M (Lite variant)
- **Supported Image Formats**: JPG, JPEG, PNG
- **API Response Time**: <200ms average
- **Model Accuracy**: 90.09% (state-of-the-art for agricultural disease detection)
*Last Updated: September 2025*
*Model Version: 3.0*
*API Version: 2.0.0*
*Documentation Version: 1.5*