---
license: openrail++
language:
- en
base_model:
- RomDev2/yolox-tiny
tags:
- shapes
- trashbins
pipeline_tag: image-classification
---
# TSAP – Trash/Sustainability Assessment Platform

A deep learning computer vision system for automatic detection and classification of waste bins in images. The system localizes bins with YOLOv4-tiny and classifies their properties with a multi-task ResNet34 model.

## Overview

TSAP runs two ML pipelines:

| Pipeline | Architecture | Input Size | Output |
|---|---|---|---|
| Detection | YOLOv4-tiny | 608×608 | Bounding boxes |
| Classification | ResNet34 (multi-task) | 200×200 | Fullness + Shape |
**Classification labels:**

- **Fullness** (6 classes): `Closed`, `Empty`, `Half`, `Full`, `Overflow`, `Open`
- **Shape** (8 classes): `Star`, `Circle`, `Arrow`, `Centric`, `Triangle`, `Square`, `Chevron`, `Lightning Bolt`
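For downstream decoding it is convenient to keep the two label sets as ordered tuples, so an argmax index maps straight to a name. A small sketch; the ordering below mirrors the lists above, and the actual training order is an assumption:

```python
# Ordering mirrors the lists above; the true training order may differ.
FULLNESS_LABELS = ("Closed", "Empty", "Half", "Full", "Overflow", "Open")
SHAPE_LABELS = ("Star", "Circle", "Arrow", "Centric", "Triangle",
                "Square", "Chevron", "Lightning Bolt")

def label_for(head, index):
    """Map an argmax index from one head to its human-readable label."""
    table = FULLNESS_LABELS if head == "fullness" else SHAPE_LABELS
    return table[index]
```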

## Directory Structure

```
tsap/
├── classification/
│   ├── model.py                      # TSAPMultiClassification (ResNet34 dual-head)
│   ├── dataset.py                    # Dataset loader with Albumentations augmentation
│   └── train.py                      # Training engine (mixed precision, wandb)
├── detection/
│   ├── config.py                     # YOLOv4-tiny hyperparameters and anchors
│   ├── dataset.py                    # YOLO dataset loader with mosaic augmentation
│   └── train_yolo.py                 # YOLO training engine with COCO evaluation
├── common/
│   ├── loss.py                       # FocalLoss, Yolo_loss, Yolo_loss_general
│   └── meters.py                     # AverageMeter, ProgressMeter
├── utils/
│   ├── utils.py                      # Train/test splits, confusion matrices, visualization
│   ├── detection_utils.py            # IoU / GIoU / DIoU / CIoU calculations
│   ├── annot_classification_map.py   # Image-to-label mapping
│   ├── annot_darknet2torch.py        # Darknet annotation format converter
│   └── cvat_annotation_converter.py  # CVAT XML → YOLO format converter
├── models/
│   ├── classification_model/         # Trained classification weights + ONNX
│   ├── detection_model/              # YOLOv4-tiny weights + ONNX
│   └── training/                     # Standalone training scripts for individual heads
├── main.py                           # Usage examples
├── test.py                           # Inference script
└── environment.yml                   # Conda environment
```

## Setup

```bash
conda env create -f environment.yml
conda activate <env-name>
```

Key dependencies: PyTorch 1.7.1 (CUDA 10.2), TorchVision 0.8.2, OpenCV 3.4.2, Albumentations 0.5.2, ONNX 1.8.0 + ONNXRuntime 1.5.2, Weights & Biases.

## Data Preparation

Annotations are created in CVAT and exported as XML. The converter handles bounding box extraction and label generation:

```bash
python utils/cvat_annotation_converter.py
```

This converts CVAT XML annotations to YOLO format and crops classification samples with their labels.
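YOLO annotations store each box as a class index plus center/size coordinates normalized by the image dimensions. The core of the conversion from CVAT's absolute corner coordinates can be sketched as follows (the `xtl`/`ytl`/`xbr`/`ybr` attribute names follow CVAT's XML box format; the repository's converter additionally handles label extraction and cropping):

```python
def cvat_box_to_yolo(xtl, ytl, xbr, ybr, img_w, img_h):
    """Convert CVAT corner coordinates (pixels) to YOLO's normalized
    (x_center, y_center, width, height) format."""
    x_center = (xtl + xbr) / 2.0 / img_w
    y_center = (ytl + ybr) / 2.0 / img_h
    width = (xbr - xtl) / img_w
    height = (ybr - ytl) / img_h
    return x_center, y_center, width, height
```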

## Training

### Classification

```python
from classification.train import Engine
from classification.model import TSAPMultiClassification

model = TSAPMultiClassification(pretrained=True)
engine = Engine(model, train_loader, val_loader, device_ids=[0, 1])
engine.train(epochs=200)
```

Hyperparameters: batch size 128, SGD with momentum 0.9, OneCycleLR (lr=3e-3), image size 200×200.
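The OneCycleLR policy ramps the learning rate from a small initial value up to the 3e-3 peak, then anneals it far below the starting point. A plain-Python sketch of the cosine-annealed curve, assuming PyTorch's defaults (`pct_start=0.3`, `div_factor=25`, `final_div_factor=1e4`); the repository may use different values:

```python
import math

def one_cycle_lr(step, total_steps, max_lr=3e-3, pct_start=0.3,
                 div_factor=25.0, final_div_factor=1e4):
    """Cosine-annealed one-cycle schedule: warm up to max_lr over the
    first pct_start of training, then anneal down to a tiny final lr."""
    initial_lr = max_lr / div_factor
    min_lr = initial_lr / final_div_factor
    up_steps = int(pct_start * total_steps)
    if step < up_steps:  # warm-up phase
        t = step / max(1, up_steps)
        return initial_lr + (max_lr - initial_lr) * (1 - math.cos(math.pi * t)) / 2
    t = (step - up_steps) / max(1, total_steps - up_steps)  # anneal phase
    return max_lr + (min_lr - max_lr) * (1 - math.cos(math.pi * t)) / 2
```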

### Detection

```python
from detection.train_yolo import Engine_YOLO

engine = Engine_YOLO(cfg_path, weights_path, train_loader, val_loader)
engine.train(max_batches=30000)
```

Hyperparameters: batch size 64 (16 subdivisions), learning rate 0.001 with burn-in, image size 608×608.
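Burn-in is Darknet's warm-up: the learning rate grows polynomially over the first iterations before the base rate takes over. A sketch assuming Darknet's customary power of 4 and a hypothetical burn-in length of 1000 iterations (the actual values live in the `.cfg` file):

```python
def burn_in_lr(iteration, base_lr=0.001, burn_in=1000, power=4):
    """Darknet-style warm-up: lr scales with (iteration / burn_in)^power
    during burn-in, then holds at base_lr."""
    if iteration < burn_in:
        return base_lr * (iteration / burn_in) ** power
    return base_lr
```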

## Inference

```bash
python test.py
```

The inference pipeline:
1. Resize to 200×200 (classification) or 608×608 (detection)
2. Normalize using the dataset mean/std
3. Run the forward pass
4. Apply sigmoid + argmax; filter predictions below the 0.3 confidence threshold
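Step 4 for one classification head can be sketched as follows (pure Python for clarity; the real pipeline operates on tensors, and the label list here is only illustrative):

```python
import math

def decode_head(logits, labels, threshold=0.3):
    """Sigmoid each logit, take the argmax, and drop the prediction
    if its confidence falls below the threshold."""
    probs = [1.0 / (1.0 + math.exp(-z)) for z in logits]
    best = max(range(len(probs)), key=probs.__getitem__)
    if probs[best] < threshold:
        return None, probs[best]
    return labels[best], probs[best]
```

For example, `decode_head([2.0, -1.0, 0.5], ["Full", "Empty", "Half"])` picks `"Full"`, while a head whose best sigmoid score is under 0.3 yields no label.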

## Models

Pre-trained weights are in `models/`:

| File | Description |
|---|---|
| `classification_model/tsap_bin_classifer.pt` | Multi-task fullness + shape classifier |
| `classification_model/shape_classifier_resnet34.pt` | Shape-only classifier |
| `classification_model/TSAP_classifier_dynamic.onnx` | ONNX (dynamic batch) |
| `detection_model/tsap-detection.weights` | Darknet-format detection weights |
| `detection_model/tsap-detection.cfg` | Darknet config |
| `detection_model/TSAP_detection_dynamic.onnx` | ONNX (dynamic batch) |

## Architecture

### Classification – `TSAPMultiClassification`

```
Input (200×200)
├── ResNet34 backbone (shared)
├── Conv head → Linear(128) → Fullness logits (6)
└── Conv head → Linear(128) → Shape logits (8)
```

Training uses `BCEWithLogitsLoss` for both heads simultaneously.
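The combined objective can be pictured as a per-head binary cross-entropy on one-hot targets, summed over the two heads (the unweighted sum is an assumption; the repository may weight the heads). The numerically stable form that `BCEWithLogitsLoss` computes, in plain Python:

```python
import math

def bce_with_logits(logits, targets):
    """Numerically stable binary cross-entropy on raw logits, averaged
    over entries (the quantity BCEWithLogitsLoss computes)."""
    total = 0.0
    for z, y in zip(logits, targets):
        # max(z, 0) - z*y + log(1 + e^-|z|) avoids overflow for large |z|
        total += max(z, 0.0) - z * y + math.log1p(math.exp(-abs(z)))
    return total / len(logits)

def multitask_loss(fullness_logits, fullness_target, shape_logits, shape_target):
    """Unweighted sum over the two heads (the weighting is an assumption)."""
    return (bce_with_logits(fullness_logits, fullness_target)
            + bce_with_logits(shape_logits, shape_target))
```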

### Detection – YOLOv4-tiny

- Input: 608×608, 6 object classes
- 2 detection scales (32× and 16× stride), 3 anchors each
- Losses: XY, WH, objectness, class (IoU-based anchor assignment)
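The anchor assignment, and the GIoU/DIoU/CIoU variants in `utils/detection_utils.py`, all build on plain intersection over union. A minimal sketch for axis-aligned boxes in `(x1, y1, x2, y2)` corner format (the repository's exact box layout is an assumption):

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0
```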

## Experiment Tracking

Training integrates with [Weights & Biases](https://wandb.ai) for metric logging and visualization. Set your API key before training:

```bash
wandb login
```