---
license: openrail++
language:
- en
base_model:
- RomDev2/yolox-tiny
tags:
- shapes
- trashbins
pipeline_tag: image-classification
---
# TSAP: Trash/Sustainability Assessment Platform
A deep learning computer vision system for automatic detection and classification of waste bins in images. The system localizes bins using YOLOv4-tiny and classifies their properties using a multi-task ResNet34 model.
## Overview
TSAP runs two ML pipelines:
| Pipeline | Architecture | Input Size | Output |
|---|---|---|---|
| Detection | YOLOv4-tiny | 608×608 | Bounding boxes |
| Classification | ResNet34 (multi-task) | 200×200 | Fullness + Shape |
**Classification labels:**
- **Fullness** (6 classes): `Closed`, `Empty`, `Half`, `Full`, `Overflow`, `Open`
- **Shape** (8 classes): `Star`, `Circle`, `Arrow`, `Centric`, `Triangle`, `Square`, `Chevron`, `Lightning Bolt`
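For reference, a minimal sketch of how these labels can be encoded as one-hot targets (the actual index ordering lives in `utils/annot_classification_map.py`; the ordering below is an assumption):

```python
# Hypothetical label ordering -- check utils/annot_classification_map.py for
# the real mapping; this sketch only illustrates the encoding shape.
FULLNESS = ["Closed", "Empty", "Half", "Full", "Overflow", "Open"]
SHAPE = ["Star", "Circle", "Arrow", "Centric", "Triangle", "Square",
         "Chevron", "Lightning Bolt"]

def one_hot(label, classes):
    """Encode a label as a one-hot vector, as BCEWithLogitsLoss expects."""
    vec = [0.0] * len(classes)
    vec[classes.index(label)] = 1.0
    return vec
```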
## Directory Structure
```
tsap/
├── classification/
│   ├── model.py                        # TSAPMultiClassification (ResNet34 dual-head)
│   ├── dataset.py                      # Dataset loader with Albumentations augmentation
│   └── train.py                        # Training engine (mixed precision, wandb)
├── detection/
│   ├── config.py                       # YOLOv4-tiny hyperparameters and anchors
│   ├── dataset.py                      # YOLO dataset loader with mosaic augmentation
│   └── train_yolo.py                   # YOLO training engine with COCO evaluation
├── common/
│   ├── loss.py                         # FocalLoss, Yolo_loss, Yolo_loss_general
│   └── meters.py                       # AverageMeter, ProgressMeter
├── utils/
│   ├── utils.py                        # Train/test splits, confusion matrices, visualization
│   ├── detection_utils.py              # IoU / GIoU / DIoU / CIoU calculations
│   ├── annot_classification_map.py     # Image-to-label mapping
│   ├── annot_darknet2torch.py          # Darknet annotation format converter
│   └── cvat_annotation_converter.py    # CVAT XML → YOLO format converter
├── models/
│   ├── classification_model/           # Trained classification weights + ONNX
│   ├── detection_model/                # YOLOv4-tiny weights + ONNX
│   └── training/                       # Standalone training scripts for individual heads
├── main.py                             # Usage examples
├── test.py                             # Inference script
└── environment.yml                     # Conda environment
```
## Setup
```bash
conda env create -f environment.yml
conda activate <env-name>  # use the environment name defined in environment.yml
```
Key dependencies: PyTorch 1.7.1 (CUDA 10.2), TorchVision 0.8.2, OpenCV 3.4.2, Albumentations 0.5.2, ONNX 1.8.0 + ONNXRuntime 1.5.2, Weights & Biases.
## Data Preparation
Annotations are created in CVAT and exported as XML. The converter handles bounding box extraction and label generation:
```bash
python utils/cvat_annotation_converter.py
```
This converts CVAT XML annotations to YOLO format and crops classification samples with their labels.
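At its core, the conversion maps CVAT's absolute corner coordinates (`xtl`, `ytl`, `xbr`, `ybr`) to YOLO's normalized center format. A minimal sketch of that step (the function name is hypothetical; the actual logic is in `utils/cvat_annotation_converter.py`):

```python
# Sketch of the coordinate transform a CVAT XML -> YOLO converter performs:
# CVAT boxes are absolute pixel corners; YOLO expects (cx, cy, w, h)
# normalized by the image dimensions.
def cvat_box_to_yolo(xtl, ytl, xbr, ybr, img_w, img_h):
    """Convert absolute corner coordinates to normalized center format."""
    cx = (xtl + xbr) / 2.0 / img_w
    cy = (ytl + ybr) / 2.0 / img_h
    w = (xbr - xtl) / img_w
    h = (ybr - ytl) / img_h
    return cx, cy, w, h
```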
## Training
### Classification
```python
from classification.train import Engine
from classification.model import TSAPMultiClassification
model = TSAPMultiClassification(pretrained=True)
engine = Engine(model, train_loader, val_loader, device_ids=[0,1])
engine.train(epochs=200)
```
Hyperparameters: batch size 128, SGD with momentum 0.9, OneCycleLR (lr=3e-3), image size 200×200.
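For intuition, the one-cycle policy warms the learning rate up to its peak and then anneals it back down. Below is a simplified, self-contained sketch of that schedule (PyTorch's `OneCycleLR` also applies a `final_div_factor`; the `pct_start` and `div_factor` values here are assumed defaults, not confirmed by the repo):

```python
import math

# Simplified one-cycle LR schedule with cosine annealing: warm up from
# max_lr / div_factor to max_lr over the first pct_start of training,
# then anneal toward zero (ignoring final_div_factor for brevity).
def one_cycle_lr(step, total_steps, max_lr=3e-3, pct_start=0.3, div_factor=25.0):
    initial_lr = max_lr / div_factor
    warmup_steps = int(total_steps * pct_start)
    if step < warmup_steps:
        t = step / max(warmup_steps, 1)
        return initial_lr + (max_lr - initial_lr) * (1 - math.cos(math.pi * t)) / 2
    t = (step - warmup_steps) / max(total_steps - warmup_steps, 1)
    return max_lr * (1 + math.cos(math.pi * t)) / 2
```

In real training this bookkeeping is handled by `torch.optim.lr_scheduler.OneCycleLR` stepped once per batch.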
### Detection
```python
from detection.train_yolo import Engine_YOLO
engine = Engine_YOLO(cfg_path, weights_path, train_loader, val_loader)
engine.train(max_batches=30000)
```
Hyperparameters: batch size 64 (16 subdivisions), learning rate 0.001 with burn-in, image size 608×608.
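Darknet-style subdivisions mean each forward/backward pass sees only `64 // 16 = 4` images, with gradients accumulated so one optimizer step still corresponds to the full batch of 64. A toy illustration of that accumulation (a scalar linear model stands in for the network):

```python
# batch=64 / subdivisions=16: accumulate gradients over 16 mini-batches of 4,
# then take a single optimizer step averaged over the full batch.
batch_size, subdivisions = 64, 16
mini_batch = batch_size // subdivisions  # images per forward pass

# Toy model y = w * x with squared error; d(loss)/dw = (w*x - y) * x.
w, lr = 0.0, 0.001
data = [(float(i % 7 + 1), float(i % 3)) for i in range(batch_size)]

grad = 0.0
for s in range(subdivisions):
    for x, y in data[s * mini_batch:(s + 1) * mini_batch]:
        grad += (w * x - y) * x / batch_size  # average over the full batch
w -= lr * grad  # one optimizer step per full batch
```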
## Inference
```bash
python test.py
```
The inference pipeline:
1. Resize to 200×200 (classification) or 608×608 (detection)
2. Normalize using dataset mean/std
3. Run forward pass
4. Apply sigmoid + argmax; filter predictions below threshold 0.3
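Step 4 can be sketched as follows (pure-Python stand-in for the tensor ops; the function name is hypothetical):

```python
import math

# Post-processing sketch: sigmoid over the logits, argmax for the predicted
# class, and rejection of predictions whose confidence is below 0.3.
def predict(logits, threshold=0.3):
    probs = [1.0 / (1.0 + math.exp(-z)) for z in logits]
    best = max(range(len(probs)), key=probs.__getitem__)
    if probs[best] < threshold:
        return None, probs[best]  # prediction filtered out
    return best, probs[best]
```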
## Models
Pre-trained weights are in `models/`:
| File | Description |
|---|---|
| `classification_model/tsap_bin_classifer.pt` | Multi-task fullness + shape classifier |
| `classification_model/shape_classifier_resnet34.pt` | Shape-only classifier |
| `classification_model/TSAP_classifier_dynamic.onnx` | ONNX (dynamic batch) |
| `detection_model/tsap-detection.weights` | Darknet format detection weights |
| `detection_model/tsap-detection.cfg` | Darknet config |
| `detection_model/TSAP_detection_dynamic.onnx` | ONNX (dynamic batch) |
## Architecture
### Classification β€” `TSAPMultiClassification`
```
Input (200×200)
└── ResNet34 backbone (shared)
    ├── Conv head → Linear(128) → Fullness logits (6)
    └── Conv head → Linear(128) → Shape logits (8)
```
Training uses `BCEWithLogitsLoss` for both heads simultaneously.
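The combined objective is simply one BCE-with-logits term per head, summed. A minimal numerically stable sketch (targets are one-hot vectors, matching `BCEWithLogitsLoss` semantics; in practice the PyTorch loss is used directly):

```python
import math

# Stable BCE-with-logits: max(z, 0) - z*y + log(1 + exp(-|z|)),
# averaged over the logit vector.
def bce_with_logits(logits, targets):
    return sum(max(z, 0) - z * y + math.log1p(math.exp(-abs(z)))
               for z, y in zip(logits, targets)) / len(logits)

# Multi-task objective: fullness and shape heads are optimized jointly.
def multitask_loss(fullness_logits, fullness_y, shape_logits, shape_y):
    return (bce_with_logits(fullness_logits, fullness_y)
            + bce_with_logits(shape_logits, shape_y))
```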
### Detection β€” YOLOv4-tiny
- Input: 608×608, 6 object classes
- 2 detection scales (32× and 16× stride), 3 anchors each
- Losses: XY, WH, objectness, class (IoU-based anchor assignment)
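The plain IoU used for anchor assignment can be sketched as below (boxes in `(x1, y1, x2, y2)` corner format; `utils/detection_utils.py` also implements the GIoU/DIoU/CIoU variants):

```python
# Intersection-over-union of two axis-aligned boxes (x1, y1, x2, y2).
def iou(a, b):
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)
```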
## Experiment Tracking
Training integrates with [Weights & Biases](https://wandb.ai) for metric logging and visualization. Set your API key before training:
```bash
wandb login
```