
TSAP β€” Trash/Sustainability Assessment Platform

A deep learning computer vision system for automatic detection and classification of waste bins in images. The system localizes bins using YOLOv4-tiny and classifies their properties using a multi-task ResNet34 model.

Overview

TSAP runs two ML pipelines:

Pipeline        Architecture           Input Size  Output
Detection       YOLOv4-tiny            608Γ—608     Bounding boxes
Classification  ResNet34 (multi-task)  200Γ—200     Fullness + Shape

Classification labels:

  • Fullness (5 classes): Closed, Empty, Half, Full, Overflow, Open
  • Shape (8 classes): Star, Circle, Arrow, Centric, Triangle, Square, Chevron, Lightning Bolt

Directory Structure

tsap/
β”œβ”€β”€ classification/
β”‚   β”œβ”€β”€ model.py                        # TSAPMultiClassification (ResNet34 dual-head)
β”‚   β”œβ”€β”€ dataset.py                      # Dataset loader with Albumentations augmentation
β”‚   └── train.py                        # Training engine (mixed precision, wandb)
β”œβ”€β”€ detection/
β”‚   β”œβ”€β”€ config.py                       # YOLOv4-tiny hyperparameters and anchors
β”‚   β”œβ”€β”€ dataset.py                      # YOLO dataset loader with mosaic augmentation
β”‚   └── train_yolo.py                   # YOLO training engine with COCO evaluation
β”œβ”€β”€ common/
β”‚   β”œβ”€β”€ loss.py                         # FocalLoss, Yolo_loss, Yolo_loss_general
β”‚   └── meters.py                       # AverageMeter, ProgressMeter
β”œβ”€β”€ utils/
β”‚   β”œβ”€β”€ utils.py                        # Train/test splits, confusion matrices, visualization
β”‚   β”œβ”€β”€ detection_utils.py              # IoU / GIoU / DIoU / CIoU calculations
β”‚   β”œβ”€β”€ annot_classification_map.py     # Image-to-label mapping
β”‚   β”œβ”€β”€ annot_darknet2torch.py          # Darknet annotation format converter
β”‚   └── cvat_annotation_converter.py    # CVAT XML β†’ YOLO format converter
β”œβ”€β”€ models/
β”‚   β”œβ”€β”€ classification_model/           # Trained classification weights + ONNX
β”‚   β”œβ”€β”€ detection_model/                # YOLOv4-tiny weights + ONNX
β”‚   └── training/                       # Standalone training scripts for individual heads
β”œβ”€β”€ main.py                             # Usage examples
β”œβ”€β”€ test.py                             # Inference script
└── environment.yml                     # Conda environment

Setup

conda env create -f environment.yml
conda activate <env-name>

Key dependencies: PyTorch 1.7.1 (CUDA 10.2), TorchVision 0.8.2, OpenCV 3.4.2, Albumentations 0.5.2, ONNX 1.8.0 + ONNXRuntime 1.5.2, Weights & Biases.

Data Preparation

Annotations are created in CVAT and exported as XML. The converter handles bounding box extraction and label generation:

python utils/cvat_annotation_converter.py

This converts CVAT XML annotations to YOLO format and crops classification samples with their labels.
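The conversion step can be sketched with the standard library alone. This assumes the "CVAT for images 1.1" export layout (`<image>` elements with `width`/`height`, containing `<box>` children with `xtl`/`ytl`/`xbr`/`ybr` corner attributes); the label list and output format below are illustrative, not the repository's exact values:

```python
# Minimal sketch of a CVAT XML -> YOLO converter (stdlib only).
# Assumes the "CVAT for images 1.1" export layout; label names here
# are illustrative, not the project's actual 6-class list.
import xml.etree.ElementTree as ET

LABELS = ["bin"]  # hypothetical class list

def cvat_to_yolo(xml_text):
    """Return {image_name: [yolo_line, ...]} with normalized cx cy w h."""
    root = ET.fromstring(xml_text)
    out = {}
    for image in root.iter("image"):
        iw, ih = float(image.get("width")), float(image.get("height"))
        lines = []
        for box in image.iter("box"):
            xtl, ytl = float(box.get("xtl")), float(box.get("ytl"))
            xbr, ybr = float(box.get("xbr")), float(box.get("ybr"))
            cls = LABELS.index(box.get("label"))
            # YOLO format: class_id, then box center and size, all
            # normalized to [0, 1] by the image dimensions
            cx, cy = (xtl + xbr) / 2 / iw, (ytl + ybr) / 2 / ih
            w, h = (xbr - xtl) / iw, (ybr - ytl) / ih
            lines.append(f"{cls} {cx:.6f} {cy:.6f} {w:.6f} {h:.6f}")
        out[image.get("name")] = lines
    return out
```

The actual script additionally crops each box out of the source image to build the classification dataset.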

Training

Classification

from classification.train import Engine
from classification.model import TSAPMultiClassification

model = TSAPMultiClassification(pretrained=True)
engine = Engine(model, train_loader, val_loader, device_ids=[0,1])
engine.train(epochs=200)

Hyperparameters: batch size 128, SGD with momentum 0.9, OneCycleLR (lr=3e-3), image size 200Γ—200.
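The shape of the OneCycleLR schedule (lr=3e-3) can be sketched in pure Python. This uses PyTorch's documented defaults (pct_start=0.3, div_factor=25, final_div_factor=1e4, cosine annealing) and is an illustration of the schedule, not the library code:

```python
# OneCycleLR shape with max_lr = 3e-3 and PyTorch's default parameters:
# ramp from max_lr/25 up to max_lr over the first 30% of steps, then
# anneal down to max_lr/25/1e4, both phases using cosine interpolation.
import math

def one_cycle_lr(step, total_steps, max_lr=3e-3,
                 pct_start=0.3, div_factor=25.0, final_div_factor=1e4):
    initial_lr = max_lr / div_factor
    min_lr = initial_lr / final_div_factor
    warmup = pct_start * total_steps

    def cos_anneal(start, end, pct):  # cosine interpolation start -> end
        return end + (start - end) * (1 + math.cos(math.pi * pct)) / 2

    if step <= warmup:                # ramp up to max_lr
        return cos_anneal(initial_lr, max_lr, step / warmup)
    # anneal down to min_lr for the remainder of training
    return cos_anneal(max_lr, min_lr, (step - warmup) / (total_steps - warmup))
```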

Detection

from detection.train_yolo import Engine_YOLO

engine = Engine_YOLO(cfg_path, weights_path, train_loader, val_loader)
engine.train(max_batches=30000)

Hyperparameters: batch size 64 (16 subdivisions), learning rate 0.001 with burn-in, image size 608Γ—608.
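The burn-in follows the Darknet convention: the learning rate ramps up polynomially (default power 4) over the first `burn_in` batches, then follows a step schedule. A pure-Python sketch, with step milestones that are illustrative rather than the repository's actual .cfg values:

```python
# Darknet-style burn-in: lr scaled by (batch / burn_in)^4 during warm-up,
# then dropped by `scales` at the given `steps` milestones.
# Milestone values below are illustrative, not the project's .cfg.
def yolo_lr(batch_num, base_lr=0.001, burn_in=1000,
            steps=(24000, 27000), scales=(0.1, 0.1)):
    if batch_num < burn_in:
        return base_lr * (batch_num / burn_in) ** 4
    lr = base_lr
    for step, scale in zip(steps, scales):
        if batch_num >= step:
            lr *= scale
    return lr
```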

Inference

python test.py

The inference pipeline:

  1. Resize to 200Γ—200 (classification) or 608Γ—608 (detection)
  2. Normalize using dataset mean/std
  3. Run forward pass
  4. Apply sigmoid + argmax; filter predictions below threshold 0.3
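Steps 3–4 for one classification head can be sketched in pure Python: apply a sigmoid to the raw logits, take the argmax, and reject the prediction if its score falls below the 0.3 threshold:

```python
# Post-processing for one classification head: sigmoid -> argmax ->
# threshold filter at 0.3, as in step 4 of the pipeline above.
import math

FULLNESS = ["Closed", "Empty", "Half", "Full", "Overflow", "Open"]

def postprocess(logits, labels=FULLNESS, threshold=0.3):
    scores = [1 / (1 + math.exp(-z)) for z in logits]
    best = max(range(len(scores)), key=scores.__getitem__)
    if scores[best] < threshold:
        return None, scores[best]  # filtered out as low-confidence
    return labels[best], scores[best]
```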

Models

Pre-trained weights are in models/:

File                                               Description
classification_model/tsap_bin_classifer.pt         Multi-task fullness + shape classifier
classification_model/shape_classifier_resnet34.pt  Shape-only classifier
classification_model/TSAP_classifier_dynamic.onnx  ONNX (dynamic batch)
detection_model/tsap-detection.weights             Darknet-format detection weights
detection_model/tsap-detection.cfg                 Darknet config
detection_model/TSAP_detection_dynamic.onnx        ONNX (dynamic batch)

Architecture

Classification β€” TSAPMultiClassification

Input (200Γ—200)
    └── ResNet34 backbone (shared)
         β”œβ”€β”€ Conv head β†’ Linear(128) β†’ Fullness logits (6)
         └── Conv head β†’ Linear(128) β†’ Shape logits (8)

Training uses BCEWithLogitsLoss for both heads simultaneously.
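The numerically stable form of BCE-with-logits can be written out in pure Python. This sketch assumes one-hot targets and that the two head losses are simply summed; the exact weighting in the repository's training engine may differ:

```python
# Numerically stable binary cross-entropy on raw logits, equivalent in
# spirit to PyTorch's BCEWithLogitsLoss, applied to both heads.
# Summing the two head losses is an assumption for illustration.
import math

def bce_with_logits(logits, targets):
    """Mean over elements of max(z, 0) - z*t + log(1 + exp(-|z|))."""
    total = sum(max(z, 0.0) - z * t + math.log1p(math.exp(-abs(z)))
                for z, t in zip(logits, targets))
    return total / len(logits)

def multitask_loss(fullness_logits, fullness_onehot,
                   shape_logits, shape_onehot):
    return (bce_with_logits(fullness_logits, fullness_onehot)
            + bce_with_logits(shape_logits, shape_onehot))
```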

Detection β€” YOLOv4-tiny

  • Input: 608Γ—608, 6 object classes
  • 2 detection scales (32Γ— and 16Γ— stride), 3 anchors each
  • Losses: XY, WH, objectness, class (IoU-based anchor assignment)

Experiment Tracking

Training integrates with Weights & Biases for metric logging and visualization. Set your API key before training:

wandb login