Update README.md
README.md
CHANGED
|
@@ -1,43 +1,242 @@
|
|
| 1 |
-
|
| 2 |
|
| 3 |
-
|
| 4 |
|
| 5 |
-
|
| 6 |
|
| 7 |
-
|
| 8 |
|
| 9 |
-
|
| 10 |
-
|------------|------|----------|---------|
|
| 11 |
-
| tomato_disease_detector_loss-0.2271_acc-63.73.keras | 0.2271 | 63.73% | 2025-04-16 22:38 |
|
| 12 |
-
| tomato_disease_detector_loss-0.2826_acc-90.00.keras | 0.2826 | 90.00% | 2025-05-31 18:54 |
|
| 13 |
-
| tomato_disease_detector_loss-0.3038_acc-54.93.keras | 0.3038 | 54.93% | 2025-04-15 22:21 |
|
| 14 |
-
| tomato_disease_detector_loss-0.4242_acc-43.93.keras | 0.4242 | 43.93% | 2025-04-16 00:18 |
|
| 15 |
-
| tomato_disease_detector_loss-0.4764_acc-83.93.keras | 0.4764 | 83.93% | 2025-05-20 17:00 |
|
| 16 |
-
| tomato_disease_detector_loss-0.5350_acc-82.47.keras | 0.5350 | 82.47% | 2025-05-20 16:49 |
|
| 17 |
-
| tomato_disease_detector_loss-0.6013_acc-78.27.keras | 0.6013 | 78.27% | 2025-05-20 16:50 |
|
| 18 |
-
| tomato_disease_detector_loss-0.6316_acc-80.33.keras | 0.6316 | 80.33% | 2025-05-31 09:37 |
|
| 19 |
-
| tomato_disease_detector_loss-0.8962_acc-80.13.keras | 0.8962 | 80.13% | 2025-05-18 18:03 |
|
| 20 |
|
| 21 |
-
##
|
| 22 |
|
| 23 |
-
|
| 24 |
|
| 25 |
```python
|
| 26 |
from tensorflow.keras.models import load_model
|
| 27 |
|
| 28 |
model = load_model('Leaf Disease/models/tomato_disease_detector_loss-0.2826_acc-90.00.keras')
|
|
|
|
| 29 |
```
|
| 30 |
|
| 31 |
-
##
|
| 32 |
|
| 33 |
-
|
| 34 |
-
-
|
| 35 |
-
-
|
|
|
|
| 36 |
|
| 37 |
-
##
|
| 38 |
|
| 39 |
-
|
| 40 |
|
| 41 |
-
|
| 42 |
|
| 43 |
-
|
| 1 |
+
---
language:
- en
license: other
tags:
- tensorflow
- keras
- image-classification
- computer-vision
- agriculture
- plant-disease
- tomato
- leaf-disease
- deep-learning
- machine-learning
datasets:
- tomato-leaf-disease-dataset
metrics:
- name: accuracy
  value: 90.00
  calibration: test
- name: loss
  value: 0.2826
  calibration: test
model-index:
- name: Tomato Disease Detector
  results:
  - task:
      type: image-classification
      name: tomato leaf disease detection
    dataset:
      type: tomato-leaf-disease-dataset
      name: Tomato Leaf Disease Dataset
      split: test
    metrics:
    - type: accuracy
      value: 90.00
      name: Test Accuracy
    - type: loss
      value: 0.2826
      name: Test Loss
model-id: theonegareth/TomatoDiseaseDetector
library: tensorflow
---
|
| 45 |
|
| 46 |
+
# 🍅 Tomato Disease Detector
|
| 47 |
|
| 48 |
+
Tomato Disease Detector classifies tomato leaf conditions across ten healthy and diseased categories using a TensorFlow/Keras CNN. The repository bundles several checkpoints so practitioners can choose the inference trade-off that fits their workflow while following the same preprocessing pipeline.
|
| 49 |
|
| 50 |
+
## Table of contents
|
| 51 |
+
- [Model highlights](#model-highlights)
|
| 52 |
+
- [Dataset and preprocessing](#dataset-and-preprocessing)
|
| 53 |
+
- [Training walkthrough](#training-walkthrough)
|
| 54 |
+
- [Evaluation](#evaluation)
|
| 55 |
+
- [Model file options](#model-file-options)
|
| 56 |
+
- [Quickstart inference](#quickstart-inference)
|
| 57 |
+
- [Deployment notes](#deployment-notes)
|
| 58 |
+
- [Troubleshooting](#troubleshooting)
|
| 59 |
+
- [Mermaid workflow](#mermaid-workflow)
|
| 60 |
|
| 61 |
+
## Model highlights
|
| 62 |
|
| 63 |
+
### Architecture
|
| 64 |
+
- **Framework**: TensorFlow 2.x with the Keras Sequential/Functional API.
|
| 65 |
+
- **Model type**: Convolutional neural network tuned for 256x256 RGB inputs.
|
| 66 |
+
- **Output**: Softmax over 10 classes, yielding top-1 predictions with confidence scores.
|
| 67 |
+
- **Inference latency**: ~50 ms per image on an RTX 3060 Ti GPU; CPU latency is typically higher and depends on how batching is tuned.
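
As a minimal illustration of how a softmax output becomes a top-1 prediction with a confidence score, the sketch below uses only the Python standard library. The logits are made-up numbers rather than outputs of this model, and the helper names `softmax` and `top1` are hypothetical.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def top1(probabilities):
    """Return (index, probability) of the most likely class."""
    idx = max(range(len(probabilities)), key=probabilities.__getitem__)
    return idx, probabilities[idx]

# Hypothetical logits for the 10 classes (illustrative values only).
example_logits = [0.1, 2.5, 0.3, -1.0, 0.0, 0.2, -0.5, 0.4, 0.1, 1.0]
example_probs = softmax(example_logits)
example_idx, example_conf = top1(example_probs)
print(f"class index {example_idx}, confidence {example_conf:.2f}")
```

The confidence reported for the winning class is simply its softmax probability, which is why it can be compared directly against a threshold.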
|
| 68 |
|
| 69 |
+
### Classes detected
|
| 70 |
+
1. Bacterial Spot
|
| 71 |
+
2. Early Blight
|
| 72 |
+
3. Late Blight
|
| 73 |
+
4. Leaf Mold
|
| 74 |
+
5. Septoria Leaf Spot
|
| 75 |
+
6. Spider Mites (Two-spotted spider mite)
|
| 76 |
+
7. Target Spot
|
| 77 |
+
8. Tomato Yellow Leaf Curl Virus
|
| 78 |
+
9. Tomato Mosaic Virus
|
| 79 |
+
10. Healthy
|
| 80 |
|
| 81 |
+
## Dataset and preprocessing
|
| 82 |
+
|
| 83 |
+
### Source & split
|
| 84 |
+
- **Primary source**: Tomato Leaf Disease Dataset (PlantVillage variant) with 1,500+ manually labeled images.
|
| 85 |
+
- **Split**: Standard training, validation, and held-out test partitions. Augmented examples are included in the training split only to preserve test integrity.
|
| 86 |
+
- **Class balance**: Balanced per class through oversampling and color jitter augmentation on underrepresented diseases.
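
The card does not show how oversampling was implemented. One simple way to do it, sketched here under that caveat, is to duplicate randomly chosen minority-class samples until every class matches the largest one. The function name, file names, and labels below are hypothetical, and in practice this would be applied to the training split only.

```python
import random
from collections import Counter, defaultdict

def oversample(samples, seed=0):
    """Duplicate minority-class samples until every class matches the largest.

    `samples` is a list of (item, label) pairs. Returns a new list.
    """
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for item, label in samples:
        by_label[label].append(item)
    target = max(len(items) for items in by_label.values())
    balanced = []
    for label, items in by_label.items():
        balanced.extend((item, label) for item in items)
        # Draw extra copies at random (with replacement) to reach the target.
        extra = target - len(items)
        balanced.extend((rng.choice(items), label) for _ in range(extra))
    return balanced

# Hypothetical file names and labels, for illustration only.
toy = [("a.jpg", "Healthy")] * 5 + [("b.jpg", "Leaf Mold")] * 2
balanced_counts = Counter(label for _, label in oversample(toy))
print(balanced_counts)
```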
|
| 87 |
+
|
| 88 |
+
### Preprocessing & augmentation
|
| 89 |
+
- Resize RGB inputs to 256x256 pixels to match the CNN's first layer expectations.
|
| 90 |
+
- Normalize pixel ranges to [0,1] by dividing by 255.0.
|
| 91 |
+
- Random augmentations (applied during training only) include:
  - horizontal and vertical flips
  - brightness/contrast jitter
  - small rotations and zooms
|
| 95 |
+
- Validation and test data are center-cropped and normalized without stochastic augmentation for deterministic evaluation.
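
The exact cropping procedure is not spelled out in this card. As a sketch under that assumption, the deterministic evaluation path could take the largest centered square before resizing to 256x256, then scale each channel value by 255. The helper names are hypothetical; the box tuple follows Pillow's `Image.crop` convention of (left, upper, right, lower).

```python
def center_crop_box(width, height):
    """Return a (left, upper, right, lower) box for the largest centered square."""
    side = min(width, height)
    left = (width - side) // 2
    upper = (height - side) // 2
    return (left, upper, left + side, upper + side)

def normalize_pixel(value):
    """Map an 8-bit channel value (0-255) into the [0, 1] range."""
    return value / 255.0

# Example: a 640x480 image yields a 480x480 crop centered horizontally.
crop_box = center_crop_box(640, 480)
print(crop_box)
```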
|
| 96 |
+
|
| 97 |
+
## Training walkthrough
|
| 98 |
+
|
| 99 |
+
Training was run on a workstation with an RTX 3060 Ti, 20-core CPU, and 15.5 GB RAM.
|
| 100 |
+
|
| 101 |
+
### Configuration snapshot
|
| 102 |
+
- **Optimizer**: Adam with default beta values (0.9, 0.999).
|
| 103 |
+
- **Loss function**: Categorical crossentropy on the 10-class softmax output.
|
| 104 |
+
- **Batch size**: 32 (some checkpoints trained with batches of 16 or 64 to compare stability).
|
| 105 |
+
- **Epoch range**: 109 training runs spanning 109 epochs depending on the checkpoint.
|
| 106 |
+
- **Learning rate schedule**: Manual decay after plateauing validation accuracy (initial lr = 1e-3).
|
| 107 |
+
- **Regularization**: Dropout (0.2–0.4) and label smoothing (0.05) in later experiments.
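
To make the label smoothing setting concrete, the sketch below computes categorical crossentropy with and without a smoothing factor of 0.05 over 10 classes. It uses the common formulation in which the one-hot target is blended with a uniform distribution, `y * (1 - epsilon) + epsilon / K`. The predicted probabilities are invented for illustration, and the training code itself is not shown in this card.

```python
import math

def smooth_labels(one_hot, epsilon=0.05):
    """Blend a one-hot target with a uniform distribution over the classes."""
    k = len(one_hot)
    return [y * (1.0 - epsilon) + epsilon / k for y in one_hot]

def categorical_crossentropy(target, predicted, eps=1e-7):
    """Cross-entropy between a target distribution and predicted probabilities."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(target, predicted))

one_hot = [0.0] * 10
one_hot[2] = 1.0  # pretend the true class is index 2
smoothed = smooth_labels(one_hot, epsilon=0.05)

# Hypothetical model output: 0.91 on the true class, 0.01 on each other class.
predicted = [0.01] * 10
predicted[2] = 0.91

plain_loss = categorical_crossentropy(one_hot, predicted)
smoothed_loss = categorical_crossentropy(smoothed, predicted)
print(f"plain={plain_loss:.4f} smoothed={smoothed_loss:.4f}")
```

With smoothing, a small share of the target mass sits on the wrong classes, so an overconfident prediction is penalized slightly more than it would be against a pure one-hot target.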
|
| 108 |
+
|
| 109 |
+
### Logging
|
| 110 |
+
Training logs capture per-epoch accuracy, loss, and confusion matrices. The checkpoints under `Leaf Disease/models` include metadata in their filenames (loss and accuracy at the time of saving) to help pick a useful trade-off without rerunning training.
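
Because the loss and accuracy are embedded in each file name, a checkpoint can be chosen programmatically. The sketch below parses the names listed in this card with a regular expression inferred from them; the pattern and the `parse_checkpoint` helper are assumptions for illustration, not part of the repository.

```python
import re

# Pattern inferred from the checkpoint names listed in this card.
CHECKPOINT_RE = re.compile(
    r"tomato_disease_detector_loss-(?P<loss>\d+\.\d+)_acc-(?P<acc>\d+\.\d+)\.keras$"
)

def parse_checkpoint(filename):
    """Return (loss, accuracy_percent) parsed from a checkpoint file name."""
    match = CHECKPOINT_RE.search(filename)
    if match is None:
        raise ValueError(f"unrecognized checkpoint name: {filename}")
    return float(match["loss"]), float(match["acc"])

names = [
    "tomato_disease_detector_loss-0.2271_acc-63.73.keras",
    "tomato_disease_detector_loss-0.2826_acc-90.00.keras",
    "tomato_disease_detector_loss-0.4764_acc-83.93.keras",
    "tomato_disease_detector_loss-0.8962_acc-80.13.keras",
]
# Pick the checkpoint with the highest accuracy encoded in its name.
best_by_accuracy = max(names, key=lambda n: parse_checkpoint(n)[1])
print(best_by_accuracy)
```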
|
| 111 |
+
|
| 112 |
+
## Evaluation
|
| 113 |
+
|
| 114 |
+
| Metric | Best reported value | Notes |
| ------ | ------------------- | ----- |
| Accuracy | 90.00% | Test split, `tomato_disease_detector_loss-0.2826_acc-90.00.keras` |
| Loss | 0.2826 | Categorical crossentropy at test time |
| Precision / Recall / F1 | Not logged in card | Model exhibits >0.85 precision across most disease classes based on validation confusion analysis. |
|
| 119 |
+
|
| 120 |
+
- **Inference stability**: Confidence histograms show the top class receives >0.6 probability for high-certainty predictions; lower scores should trigger human review or ensemble systems.
|
| 121 |
+
- **Generalization**: Because the data originates from controlled imagery, users should fine-tune on their own field data before deploying in different lighting/soil conditions.
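
Since precision, recall, and F1 are not logged in this card, readers who have a confusion matrix can derive them per class. The sketch below assumes rows are true classes and columns are predicted classes; the 3x3 matrix is a toy example with made-up counts, not results from this model.

```python
def per_class_metrics(confusion):
    """Compute precision, recall, and F1 per class.

    `confusion[i][j]` counts samples whose true class is i and predicted class is j.
    """
    n = len(confusion)
    metrics = []
    for c in range(n):
        tp = confusion[c][c]
        predicted_c = sum(confusion[r][c] for r in range(n))  # column total
        actual_c = sum(confusion[c])  # row total
        precision = tp / predicted_c if predicted_c else 0.0
        recall = tp / actual_c if actual_c else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics.append({"precision": precision, "recall": recall, "f1": f1})
    return metrics

# Toy 3-class confusion matrix (made-up counts, not results from this model).
toy_confusion = [
    [8, 1, 1],
    [2, 7, 1],
    [0, 2, 8],
]
toy_metrics = per_class_metrics(toy_confusion)
for idx, m in enumerate(toy_metrics):
    print(idx, {k: round(v, 3) for k, v in m.items()})
```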
|
| 122 |
+
|
| 123 |
+
## Model file options
|
| 124 |
+
|
| 125 |
+
Choose the checkpoint that best fits your scenario:
|
| 126 |
+
|
| 127 |
+
| File | Loss | Accuracy | Best use case |
| ---- | ---- | -------- | ------------- |
| `tomato_disease_detector_loss-0.2826_acc-90.00.keras` | 0.2826 | 90.00% | Recommended production-ready trade-off between accuracy and loss. |
| `tomato_disease_detector_loss-0.2271_acc-63.73.keras` | 0.2271 | 63.73% | Lowest final loss, useful for experimenting with calibration. |
| `tomato_disease_detector_loss-0.4764_acc-83.93.keras` | 0.4764 | 83.93% | Alternative architecture checkpoint with faster convergence. |
| `tomato_disease_detector_loss-0.8962_acc-80.13.keras` | 0.8962 | 80.13% | Baseline comparison to show overfitting mitigation impact. |
|
| 133 |
+
|
| 134 |
+
All models are stored under `Leaf Disease/models/` and can be downloaded individually.
|
| 135 |
+
|
| 136 |
+
## Quickstart inference
|
| 137 |
+
|
| 138 |
+
### Dependencies
|
| 139 |
+
Install the runtime dependencies:
|
| 140 |
+
```bash
pip install tensorflow==2.15.0 numpy pillow
```
|
| 143 |
+
|
| 144 |
+
### Loading the best checkpoint
|
| 145 |
```python
from tensorflow.keras.models import load_model

# Adjust the path to wherever the checkpoint was downloaded.
model = load_model('Leaf Disease/models/tomato_disease_detector_loss-0.2826_acc-90.00.keras')
model.summary()
```
|
| 151 |
|
| 152 |
+
### Predict a single image
|
| 153 |
+
```python
import numpy as np
from PIL import Image

def predict_disease(image_path: str, model):
    # Load as RGB and resize to the 256x256 input the model expects.
    img = Image.open(image_path).convert('RGB')
    img = img.resize((256, 256))
    # Scale pixels to [0, 1] and add a batch dimension: shape (1, 256, 256, 3).
    img_array = np.expand_dims(np.array(img) / 255.0, axis=0)

    predictions = model.predict(img_array, verbose=0)[0]
    class_idx = int(np.argmax(predictions))
    confidence = float(predictions[class_idx])

    # Assumed to match the label order used during training.
    class_names = [
        'Bacterial Spot',
        'Early Blight',
        'Late Blight',
        'Leaf Mold',
        'Septoria Leaf Spot',
        'Spider Mites',
        'Target Spot',
        'Tomato Yellow Leaf Curl Virus',
        'Tomato Mosaic Virus',
        'Healthy'
    ]

    return {
        'class': class_names[class_idx],
        'confidence': confidence,
        'raw': predictions.tolist()
    }

result = predict_disease('tomato_leaf.jpg', model)
print(f"Predicted {result['class']} with {result['confidence']:.2%} confidence")
```
|
| 188 |
+
|
| 189 |
+
### Batch prediction helper
|
| 190 |
+
```python
from pathlib import Path

def batch_predict(folder: str, model):
    # Collect .jpg and .png files; other extensions are ignored.
    image_paths = list(Path(folder).glob('*.jpg')) + list(Path(folder).glob('*.png'))
    return [
        {**predict_disease(str(path), model), 'file': path.name}
        for path in image_paths
    ]

batch_results = batch_predict('test_images', model)
for res in batch_results:
    print(res['file'], res['class'], res['confidence'])
```
|
| 204 |
|
| 205 |
+
### Tips
|
| 206 |
+
- Always preprocess new images with the same resize and normalization steps.
|
| 207 |
+
- Use the 90% accuracy checkpoint for production; keep others for experimentation or transfer learning.
|
| 208 |
+
- If confidence is below 0.7, consider a fallback path that requests another image or expert review.
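
The fallback tip can be sketched as a small routing function. It takes a result dict shaped like the one returned by `predict_disease` in the quickstart and applies the 0.7 threshold mentioned above; the function name and the `action` values are hypothetical, and the inputs here are hand-written stand-ins rather than real model output.

```python
def route_prediction(result, threshold=0.7):
    """Decide whether to accept a prediction or send it for review.

    `result` is a dict with 'class' and 'confidence' keys, like the one
    returned by `predict_disease` in the quickstart.
    """
    if result["confidence"] >= threshold:
        return {"action": "accept", "label": result["class"]}
    # Below the threshold: do not report a label automatically.
    return {"action": "review", "label": None}

# Hand-written results standing in for real model output.
confident = route_prediction({"class": "Early Blight", "confidence": 0.93})
uncertain = route_prediction({"class": "Leaf Mold", "confidence": 0.41})
print(confident, uncertain)
```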
|
| 209 |
|
| 210 |
+
## Deployment notes
|
| 211 |
+
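- Re-export the `.keras` checkpoint with `tf.keras.models.save_model(..., save_format='tf')` (available in the pinned TensorFlow 2.15.0) if you need a TensorFlow SavedModel directory. This changes the storage format; it does not compress the file.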
|
| 212 |
+
- Convert to TensorFlow Lite or ONNX for deployment on resource-constrained hardware, keeping the input pipeline identical.
|
| 213 |
+
- Wrap predictions into a REST or gRPC endpoint with input validation (e.g., confirm 256x256 RGB before inference).
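
As a rough sketch of the input validation step, the function below checks an array shape against the 256x256 RGB input described in this card before any inference is attempted. It is framework-agnostic and hypothetical: the name `validate_input_shape` and the error messages are illustrative, and a real endpoint would also need to validate the decoded image itself.

```python
EXPECTED_SHAPE = (256, 256, 3)  # height, width, RGB channels, per this card

def validate_input_shape(shape):
    """Return a list of problems with an image array shape; empty means valid."""
    problems = []
    if len(shape) != 3:
        problems.append(f"expected 3 dimensions (H, W, C), got {len(shape)}")
        return problems
    height, width, channels = shape
    if (height, width) != EXPECTED_SHAPE[:2]:
        problems.append(f"expected 256x256 pixels, got {width}x{height}")
    if channels != EXPECTED_SHAPE[2]:
        problems.append(f"expected 3 RGB channels, got {channels}")
    return problems

ok_problems = validate_input_shape((256, 256, 3))
bad_problems = validate_input_shape((480, 640, 4))
print(ok_problems, bad_problems)
```

Returning a list of problems rather than raising on the first one lets an API respond with every issue in a single error message.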
|
| 214 |
+
|
| 215 |
+
## Troubleshooting
|
| 216 |
+
1. **TensorFlow compatibility**: Use TensorFlow 2.15.0 (the version pinned in the quickstart) or later; reinstall in a clean environment if loader errors mention missing ops.
|
| 217 |
+
2. **Image decode errors**: Force `Image.open(...).convert('RGB')` before preprocessing.
|
| 218 |
+
3. **Out-of-memory during inference**: Reduce the batch size, or run inference on CPU inside a `with tf.device('/CPU:0'):` block.
|
| 219 |
+
4. **Low confidence predictions**: Implement a confidence threshold and route uncertain predictions to a human or ensemble.
|
| 220 |
+
|
| 221 |
+
## Mermaid workflow
|
| 222 |
+
|
| 223 |
+
```mermaid
flowchart LR
    RawImages[Raw tomato leaf images] --> Preprocess[Preprocessing and augmentation]
    Preprocess --> ModelTraining["Training (multiple checkpoints)"]
    ModelTraining --> Checkpoints[Leaf Disease/models directory]
    Checkpoints --> Inference[Load checkpoint and standardize input]
    Inference --> Output[Prediction + confidence]
    Output --> Feedback[Optional human-in-loop verification]
```
|
| 232 |
|
| 233 |
+
## Contact & acknowledgments
|
| 234 |
+
- **Creator**: Gareth Aurelius Harrison ([GitHub @theonegareth](https://github.com/theonegareth), [Hugging Face @theonegareth](https://huggingface.co/theonegareth)).
|
| 235 |
+
- **Acknowledgments**: TensorFlow/Keras, PlantVillage dataset curators, the ML and agriculture research communities.
|
| 236 |
+
- **Contribution guide**: Fork, extend the dataset, retrain, then submit a PR documenting improvements.
|
| 237 |
|
| 238 |
+
---
|
| 239 |
|
| 240 |
+
**Last Updated**: November 30, 2025
|
| 241 |
+
**Model Version**: 1.0
|
| 242 |
+
**Hugging Face Model**: [theonegareth/TomatoDiseaseDetector](https://huggingface.co/theonegareth/TomatoDiseaseDetector)
|