title: Computer Vision | Image Classification
emoji: 🤖
colorFrom: blue
colorTo: purple
sdk: docker
app_port: 7860
pinned: false

Intel Scene Classifier — Parfait TOLEFO

CNN-based image classification · 6 scene categories · PyTorch & TensorFlow

1. Project Overview

This project implements a complete image classification pipeline for the Intel Image Classification dataset. It includes:

Two independent CNN models: one in PyTorch, one in TensorFlow/Keras
A unified CLI entry point (main.py) with --mode train and --mode eval
A Flask web application with file upload and URL-based image loading
A professional green/black UI with real-time probability bars

Classes (6 categories): buildings · forest · glacier · mountain · sea · street

2. Dataset

Property	Value
Source	Kaggle — Intel Image Classification
Images	~25,000 RGB images (150×150 px)
Train split	~14,000 images (seg_train)
Test split	~3,000 images (seg_test)
Prediction	~7,000 images (seg_pred — unlabeled)
Format	JPEG, organized in class-named subdirectories

Expected folder structure after download:

data/
├── seg_train/
│   └── seg_train/
│       ├── buildings/
│       ├── forest/
│       ├── glacier/
│       ├── mountain/
│       ├── sea/
│       └── street/
├── seg_test/
│   └── seg_test/
│       └── (same 6 subdirectories)
└── seg_pred/
    └── seg_pred/
        └── (unlabeled images)
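
A quick sanity check after download is to count the images in each class folder. The sketch below assumes the layout shown above, including the doubled folder name (for example data/seg_train/seg_train). The helper name count_images and the accepted file suffixes are illustrative, not part of the project.

```python
from pathlib import Path

CLASSES = ["buildings", "forest", "glacier", "mountain", "sea", "street"]
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}  # assumption: the dataset ships JPEG files


def count_images(split_dir):
    """Return {class_name: image_count} for one labeled split directory."""
    split_dir = Path(split_dir)
    counts = {}
    for name in CLASSES:
        class_dir = split_dir / name
        if not class_dir.is_dir():
            raise FileNotFoundError(f"missing class folder: {class_dir}")
        counts[name] = sum(
            1
            for p in class_dir.iterdir()
            if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
        )
    return counts


if __name__ == "__main__":
    # The Kaggle archive nests each split inside a folder of the same name.
    for split in ("seg_train", "seg_test"):
        root = Path("data") / split / split
        if root.is_dir():
            print(split, count_images(root))
```

A missing class folder raises immediately, which is usually easier to debug than a silent class-index mismatch during training.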

3. Project Architecture

project/
├── app.py                  ← Flask web server (inference via file or URL)
├── main.py                 ← Unified CLI: train + eval
├── models/
│   ├── __init__.py         ← Exports CNN_Torch, build_cnn_tf, Trainer
│   ├── cnn.py              ← CNN architectures (PyTorch + TensorFlow)
│   └── train.py            ← Trainer class (PyTorch only)
├── utils/
│   ├── __init__.py         ← Exports all preprocessing functions
│   └── prep.py             ← Transforms, DataLoaders, inference preprocessing
├── templates/
│   └── index.html          ← Web UI (green/black terminal aesthetic)
├── parfait_model.pth       ← Trained PyTorch weights (after training)
├── parfait_model.keras     ← Trained TensorFlow weights (after training)
├── requirements.txt
└── README.md

4. Model Architecture

4.1 TensorFlow / Keras model

Input: (228, 228, 3)

Block 1: Conv2D(32, 5×5, ReLU) → MaxPool(2×2)         → 224×224×32 → 112×112×32
Block 2: Conv2D(32, 5×5, ReLU) → MaxPool(2×2)         → 108×108×32 → 54×54×32
Block 3: Conv2D(32, 3×3, ReLU) → MaxPool(2×2)         → 52×52×32  → 26×26×32
Block 4: Conv2D(64, 3×3, ReLU) → MaxPool(2×2)         → 24×24×64  → 12×12×64
Block 5: Conv2D(64, 3×3, ReLU) → MaxPool(2×2)         → 10×10×64  → 5×5×64

Flatten                                                  → 1600
Dense(1024, ReLU)
Dropout(0.20)
Dense(124, ReLU)
Dropout(0.20)
Dense(6, Softmax)

Trainable parameters : 1.86M
Input size           : 228 × 228 × 3 (RGB)
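
The shapes and the parameter count listed above can be re-derived with plain arithmetic. This is a sketch under assumptions the listing implies but does not state: stride-1 convolutions without padding, 2×2 pooling with stride 2, and a bias term on every layer. The function names are illustrative.

```python
def conv_out(size, kernel):
    """Output side length of a stride-1 convolution without padding."""
    return size - kernel + 1


def trace_tf_model(input_size=228, in_channels=3):
    """Return (per-block shapes, flattened size, total parameter count)."""
    blocks = [(32, 5), (32, 5), (32, 3), (64, 3), (64, 3)]  # (filters, kernel)
    size, channels, params = input_size, in_channels, 0
    shapes = []
    for filters, kernel in blocks:
        params += kernel * kernel * channels * filters + filters  # weights + biases
        conv_size = conv_out(size, kernel)
        size = conv_size // 2  # 2x2 max pooling halves each side
        channels = filters
        shapes.append((conv_size, size, channels))
    flat = size * size * channels
    features = flat
    for units in (1024, 124, 6):  # dense layers as listed
        params += features * units + units
        features = units
    return shapes, flat, params


if __name__ == "__main__":
    shapes, flat, params = trace_tf_model()
    for i, (conv_size, pooled, ch) in enumerate(shapes, start=1):
        print(f"Block {i}: conv {conv_size}x{conv_size}x{ch} -> pool {pooled}x{pooled}x{ch}")
    print("Flatten:", flat)
    print(f"Parameters: {params:,}")
```

Under these assumptions the trace reproduces the 1600-element flatten and about 1.86M parameters, most of them in the first dense layer.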

4.2 PyTorch model

Input: (B, 3, 150, 150)

Block 1:
  Conv2d(3 → 32, 3×3, padding=1) 
  BatchNorm2d(32)
  ReLU
  Conv2d(32 → 32, 3×3, padding=1)
  BatchNorm2d(32)
  ReLU
  MaxPool2d(2)

Block 2:
  Conv2d(32 → 64, 3×3, padding=1)
  BatchNorm2d(64)
  ReLU
  Conv2d(64 → 64, 3×3, padding=1)
  BatchNorm2d(64)
  ReLU
  MaxPool2d(2)
  Dropout2d(0.10)

Block 3:
  Conv2d(64 → 128, 3×3, padding=1)
  BatchNorm2d(128)
  ReLU
  Conv2d(128 → 128, 3×3, padding=1)
  BatchNorm2d(128)
  ReLU
  MaxPool2d(2)
  Dropout2d(0.15)

Block 4:
  Conv2d(128 → 256, 3×3, padding=1)
  BatchNorm2d(256)
  ReLU
  Conv2d(256 → 256, 3×3, padding=1)
  BatchNorm2d(256)
  ReLU
  MaxPool2d(2)
  Dropout2d(0.20)

AdaptiveAvgPool2d(1)     → (B, 256, 1, 1)
Flatten                  → (B, 256)
Linear(256 → 256)
ReLU
Dropout(0.30)
Linear(256 → 6)


Trainable parameters : 1.24M
Input size           : 150 × 150 × 3 (RGB)
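
The parameter count can be checked the same way for the PyTorch model. The sketch assumes each Conv2d keeps its default bias term and each BatchNorm2d contributes a learnable scale and shift per channel. The helper names are illustrative.

```python
def conv_bn_params(c_in, c_out, kernel=3):
    """Parameters of Conv2d(c_in -> c_out, kernel) followed by BatchNorm2d(c_out)."""
    conv = kernel * kernel * c_in * c_out + c_out  # weights + biases
    bn = 2 * c_out  # learnable scale (gamma) and shift (beta)
    return conv + bn


def linear_params(n_in, n_out):
    return n_in * n_out + n_out


def trace_torch_model(input_size=150, in_channels=3):
    """Return (spatial size after each block, total trainable parameters)."""
    widths = [32, 64, 128, 256]
    c_in, size, params = in_channels, input_size, 0
    sizes = []
    for c_out in widths:
        # Two conv+BN pairs per block; padding=1 keeps the spatial size.
        params += conv_bn_params(c_in, c_out) + conv_bn_params(c_out, c_out)
        size //= 2  # MaxPool2d(2) halves each side, rounding down
        sizes.append(size)
        c_in = c_out
    # AdaptiveAvgPool2d(1) reduces any remaining spatial size to 1x1.
    params += linear_params(256, 256) + linear_params(256, 6)
    return sizes, params


if __name__ == "__main__":
    sizes, params = trace_torch_model()
    print("Spatial size after each block:", sizes)
    print(f"Parameters: {params:,}")
```

Because of the adaptive average pooling, the parameter count does not depend on the input resolution. Only the intermediate spatial sizes do.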

Training configuration:

Parameter	Value
Optimizer	Adam
Learning rate	1e-4
LR scheduler	ReduceLROnPlateau (factor=0.5, patience=3)
Early stopping	patience=5
Batch size	32
Max epochs	50
Loss function	CrossEntropyLoss / SparseCategoricalCrossentropy
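
To see how the scheduler and early stopping interact, the following simplified simulation replays a list of validation losses. It is only a sketch: it mimics the "reduce after more than patience bad epochs" rule of ReduceLROnPlateau but ignores its improvement threshold and cooldown, and the stopping convention (stop after stop_patience epochs without improvement) is an assumption about the Trainer, not taken from its code.

```python
def simulate_schedule(val_losses, lr=1e-4, factor=0.5, lr_patience=3, stop_patience=5):
    """Return [(epoch, learning_rate)] for the epochs that would actually run."""
    best = float("inf")
    lr_bad = stop_bad = 0  # epochs without improvement, tracked per mechanism
    history = []
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss
            lr_bad = stop_bad = 0
        else:
            lr_bad += 1
            stop_bad += 1
        if lr_bad > lr_patience:  # plateau: shrink the learning rate
            lr *= factor
            lr_bad = 0
        history.append((epoch, lr))
        if stop_bad >= stop_patience:  # early stopping
            break
    return history


if __name__ == "__main__":
    losses = [1.0, 0.8, 0.9, 0.9, 0.9, 0.9, 0.9, 0.9]
    for epoch, lr in simulate_schedule(losses):
        print(f"epoch {epoch}: lr={lr:.1e}")
```

With a scheduler patience of 3 and a stopping patience of 5, a run gets roughly one learning-rate reduction before it stops. A larger stopping patience leaves room for several reductions.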

5. Dependencies & Installation

Python 3.9+ is required.

# Install dependencies
pip install -r requirements.txt

requirements.txt:

torch>=2.0.0
torchvision>=0.15.0
tensorflow>=2.13.0
flask>=3.0.0
pillow>=10.0.0
numpy>=1.24.0
matplotlib>=3.7.0
tqdm>=4.65.0
scikit-learn>=1.3.0
gunicorn>=21.0.0

6. Usage

6.1 Training

# Train with PyTorch (saves → parfait_model.pth)
python main.py --model pytorch --mode train

# Train with TensorFlow (saves → parfait_model.keras)
python main.py --model tensorflow --mode train

# Full example with all options
python main.py \
    --model      pytorch \
    --mode       train \
    --data_dir   ./data \
    --output_dir ./outputs \
    --epochs     50 \
    --batch_size 32 \
    --lr         1e-4 \
    --patience   15

All CLI arguments:

Argument	Default	Description
`--model`	(required)	`pytorch` or `tensorflow`
`--mode`	(required)	`train` or `eval`
`--data_dir`	`/kaggle/input/.../intel-image-...`	Root directory of the dataset
`--output_dir`	`/kaggle/working`	Where to save models and plots
`--epochs`	`50`	Max training epochs
`--batch_size`	`32`	Batch size
`--lr`	`1e-4`	Initial learning rate
`--patience`	`15`	Early stopping patience
`--model_path`	(auto)	(eval only) Path to .pth or .keras
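
For reference, the table above maps onto an argparse definition along these lines. This is a sketch reconstructed from the table, not the contents of main.py. The --data_dir default is kept exactly as elided in the table, and build_parser is a hypothetical name.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented CLI arguments."""
    parser = argparse.ArgumentParser(description="Train or evaluate the scene classifier")
    parser.add_argument("--model", required=True, choices=["pytorch", "tensorflow"])
    parser.add_argument("--mode", required=True, choices=["train", "eval"])
    # Default path is elided in the README table; it is copied here as-is.
    parser.add_argument("--data_dir", default="/kaggle/input/.../intel-image-...",
                        help="Root directory of the dataset")
    parser.add_argument("--output_dir", default="/kaggle/working",
                        help="Where to save models and plots")
    parser.add_argument("--epochs", type=int, default=50, help="Max training epochs")
    parser.add_argument("--batch_size", type=int, default=32, help="Batch size")
    parser.add_argument("--lr", type=float, default=1e-4, help="Initial learning rate")
    parser.add_argument("--patience", type=int, default=15, help="Early stopping patience")
    parser.add_argument("--model_path", default=None,
                        help="(eval only) Path to .pth or .keras")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(["--model", "pytorch", "--mode", "train"])
    print(vars(args))
```

Using choices for --model and --mode makes argparse reject unknown values before any data is loaded.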

Training outputs:

outputs/
├── parfait_model.pth          ← Best PyTorch weights
├── parfait_model.keras        ← Best TensorFlow weights
├── history_pytorch.png        ← Train/Val Loss & Accuracy curves
└── history_tf.png

6.2 Evaluation

The eval mode loads a saved model and produces a full diagnostic report:

Global accuracy & loss
Per-class accuracy
Precision / Recall / F1-score (classification report)
Confusion matrix (saved as PNG)
4×4 grid of sample predictions (color-coded: green=correct, red=wrong)

# Evaluate PyTorch model
python main.py \
    --model      pytorch \
    --mode       eval \
    --model_path parfait_model.pth \
    --data_dir   ../data \
    --output_dir ./outputs

# Evaluate TensorFlow model
python main.py \
    --model      tensorflow \
    --mode       eval \
    --model_path parfait_model.keras \
    --data_dir   ../data \
    --output_dir ./outputs

Evaluation outputs:

outputs/
├── confusion_matrix_pytorch.png      ← Confusion matrix heatmap
├── confusion_matrix_tf.png
├── sample_predictions_pytorch.png    ← 16-image prediction grid
└── sample_predictions_tf.png
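
All the metrics in the report can be derived from the confusion matrix alone. The sketch below shows the arithmetic in plain Python. The project itself lists scikit-learn as a dependency and may well use its classification report instead. The function name and the row/column convention (rows are true classes, columns are predictions) are assumptions.

```python
def per_class_metrics(confusion, class_names):
    """confusion[i][j] = number of samples of true class i predicted as class j."""
    n = len(class_names)
    report = {}
    for k, name in enumerate(class_names):
        tp = confusion[k][k]
        predicted = sum(confusion[i][k] for i in range(n))  # column sum
        actual = sum(confusion[k])  # row sum
        precision = tp / predicted if predicted else 0.0
        recall = tp / actual if actual else 0.0  # equals per-class accuracy
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report[name] = {"precision": precision, "recall": recall, "f1": f1}
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[k][k] for k in range(n))
    accuracy = correct / total if total else 0.0
    return accuracy, report


if __name__ == "__main__":
    # Toy two-class matrix, for illustration only (not project results).
    accuracy, report = per_class_metrics([[8, 2], [1, 9]], ["a", "b"])
    print(f"accuracy={accuracy:.2f}")
    for name, m in report.items():
        print(name, {key: round(value, 3) for key, value in m.items()})
```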

6.3 Web Application

# Start Flask server
gunicorn app:app --bind 0.0.0.0:8000 --workers 1 --timeout 120

Live link

A hosted instance of the app is available at: https://huggingface.co/spaces/CyberAl/Image_Classification_Parfait_TOLEFO

Features:

Model selector: PyTorch or TensorFlow
Input: file upload (drag & drop) or image URL
Output: predicted class + confidence score + probability bars for all 6 classes
Animated plexus background with terminal green/black aesthetic
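
The predicted class, confidence score and probability bars all come from one probability vector. The PyTorch model ends in a plain Linear layer, so its raw outputs need a softmax first, while the Keras model already applies one. Below is a minimal text-mode sketch of that step. The function names and the bar rendering are illustrative and unrelated to the actual app.py or index.html.

```python
import math

CLASSES = ["buildings", "forest", "glacier", "mountain", "sea", "street"]


def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def format_prediction(logits, class_names=CLASSES, bar_width=20):
    """Return (label, confidence, text lines with one probability bar per class)."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    lines = [f"prediction: {class_names[best]} ({probs[best]:.1%})"]
    ranked = sorted(zip(class_names, probs), key=lambda pair: pair[1], reverse=True)
    for name, p in ranked:
        bar = "#" * round(p * bar_width)
        lines.append(f"{name:<10} {bar:<{bar_width}} {p:6.1%}")
    return class_names[best], probs[best], lines


if __name__ == "__main__":
    # Made-up logits for illustration only.
    _, _, lines = format_prediction([0.2, 0.1, 3.5, 2.0, 0.4, -1.0])
    print("\n".join(lines))
```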

7. Performance

Results on the Intel Image Classification test set (~3,000 images), obtained after training with the default hyperparameters on a Kaggle T4 GPU.

Model	Test Accuracy	Test Loss
PyTorch CNN	~89–91%	~0.30
TF/Keras CNN	~88–90%	~0.32

Per-class performance (approximate):

Class	Precision	Recall	F1-score
buildings	0.87	0.85	0.86
forest	0.97	0.97	0.97
glacier	0.88	0.86	0.87
mountain	0.84	0.87	0.85
sea	0.92	0.93	0.92
street	0.90	0.91	0.90

Note: buildings vs street is the hardest pair due to visual overlap.
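
The F1-score column follows from the other two, since F1 is the harmonic mean of precision and recall: F1 = 2PR / (P + R). The short check below recomputes it from the rounded values in the table. It uses only the numbers reported above.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) copied from the table above
REPORTED = {
    "buildings": (0.87, 0.85, 0.86),
    "forest": (0.97, 0.97, 0.97),
    "glacier": (0.88, 0.86, 0.87),
    "mountain": (0.84, 0.87, 0.85),
    "sea": (0.92, 0.93, 0.92),
    "street": (0.90, 0.91, 0.90),
}

if __name__ == "__main__":
    for name, (p, r, reported) in REPORTED.items():
        print(f"{name:<10} computed={f1_score(p, r):.2f} reported={reported:.2f}")
```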

8. Preprocessing & Augmentation

All preprocessing is centralized in utils/prep.py.

Training augmentation pipeline (PyTorch)

Resize(150×150)
RandomHorizontalFlip(p=0.5)
RandomVerticalFlip(p=0.1)
RandomRotation(±40°)
ColorJitter(brightness=0.3, contrast=0.2, saturation=0.1, hue=0.05)
RandomGrayscale(p=0.05)         ← encourages texture learning over color
ToTensor()
Normalize(mean=[0.485,0.456,0.406], std=[0.229,0.224,0.225])   ← ImageNet stats
RandomErasing(p=0.15, scale=[0.02,0.15])    ← occlusion simulation

Validation / inference (no augmentation)

Resize(150×150)
ToTensor()
Normalize(mean=[0.485,0.456,0.406], std=[0.229,0.224,0.225])

Why ImageNet normalization?

The dataset consists of natural outdoor RGB scenes, similar in content to ImageNet. Normalizing with the ImageNet mean/std brings each channel to roughly zero mean and unit variance, which tends to stabilize gradients and speed up convergence, even for a CNN trained from scratch.
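
Concretely, ToTensor() scales each 8-bit channel to [0, 1] and Normalize then subtracts the channel mean and divides by the channel std. The sketch below applies the same arithmetic to a single pixel in plain Python. The function names are illustrative and this is not the code in utils/prep.py.

```python
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)


def normalize_pixel(rgb):
    """Map one 8-bit RGB pixel to normalized values: scale to [0, 1], then standardize."""
    return tuple(
        (value / 255.0 - mean) / std
        for value, mean, std in zip(rgb, IMAGENET_MEAN, IMAGENET_STD)
    )


def denormalize_pixel(values):
    """Invert normalize_pixel, e.g. to display a preprocessed image."""
    return tuple(
        round((value * std + mean) * 255)
        for value, mean, std in zip(values, IMAGENET_MEAN, IMAGENET_STD)
    )


if __name__ == "__main__":
    for pixel in [(0, 0, 0), (124, 116, 104), (255, 255, 255)]:
        normalized = normalize_pixel(pixel)
        print(pixel, "->", tuple(round(v, 3) for v in normalized))
```

The inverse mapping is handy when plotting sample predictions, because normalized tensors look washed out if displayed directly.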

9. Reproducibility (Seed)

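The project uses a global seed (SEED=42) so that runs are reproducible and training and production inference behave consistently. Some GPU operations can remain nondeterministic even with a fixed seed, so small run-to-run differences are still possible.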

The seed fixes:

Python random module
NumPy RNG
PyTorch CPU and GPU (torch.manual_seed, torch.cuda.manual_seed_all)
cudnn.deterministic=True, cudnn.benchmark=False
TensorFlow RNG (tf.random.set_seed)
PYTHONHASHSEED environment variable
DataLoader worker seeds (via worker_init_fn)
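
The items above can be gathered into one helper. This is a minimal sketch, not the project's own function. The name seed_everything and the frameworks parameter are assumptions, and frameworks that are not installed are skipped. Note that setting PYTHONHASHSEED at runtime only affects child processes. To fix hashing in the current interpreter, it has to be set before Python starts.

```python
import importlib
import os
import random


def seed_everything(seed=42, frameworks=("numpy", "torch", "tensorflow")):
    """Seed Python's RNG plus any listed frameworks that are installed.

    Returns the names of the components that were seeded.
    """
    os.environ["PYTHONHASHSEED"] = str(seed)  # inherited by subprocesses
    random.seed(seed)
    seeded = ["random"]
    for name in frameworks:
        try:
            module = importlib.import_module(name)
        except ImportError:
            continue  # framework not installed: nothing to seed
        if name == "numpy":
            module.random.seed(seed)
        elif name == "torch":
            module.manual_seed(seed)
            module.cuda.manual_seed_all(seed)  # silently ignored without CUDA
            module.backends.cudnn.deterministic = True
            module.backends.cudnn.benchmark = False
        elif name == "tensorflow":
            module.random.set_seed(seed)
        seeded.append(name)
    return seeded


if __name__ == "__main__":
    print("seeded:", seed_everything(42, frameworks=("numpy",)))
```

Calling the helper once at the top of both the training script and the web server keeps the two entry points on the same seed.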