metadata
title: Image_Segmentation_
app_file: app.py
sdk: gradio
sdk_version: 5.23.1
Image Segmentation Toolkit
Overview
This project implements a comprehensive image segmentation toolkit that combines classical computer vision techniques with deep learning-based approaches. The application provides an interactive interface to compare different segmentation algorithms on user-provided images.
Features
Classical Segmentation Methods:
- Otsu's Thresholding: Optimal global thresholding for binary segmentation
- K-means Clustering: Color-based segmentation with adjustable clusters
- SLIC (Simple Linear Iterative Clustering): Superpixel segmentation
- Watershed Algorithm: Gradient-based segmentation for separating touching objects
- Felzenszwalb Algorithm: Graph-based segmentation with adaptive thresholding
Deep Learning Models:
- SegNet with EfficientNet B0 backbone: Pretrained semantic segmentation model
- SegNet with VGG backbone: Alternative architecture for comparison
Ensemble Methods:
- Otsu + SegNet: Combining boundary information from Otsu with semantic labels from SegNet
- Custom ensemble segmentation with adjustable parameters
Installation
Prerequisites
- Python 3.8+
- PyTorch 1.10+
- CUDA-compatible GPU (recommended)
Setup
- Clone the repository:
git clone https://github.com/yourusername/CSL7360_Project.git
cd CSL7360_Project
- Create and activate a virtual environment (optional but recommended):
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
- Install required packages:
pip install -r requirements.txt
- Download pretrained models:
python download_models.py
This step is optional: if the models are missing, the application downloads them automatically on first launch.
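The download-on-first-launch behaviour can be pictured as a simple "fetch only if missing" check. The sketch below is illustrative only: `ensure_weights` and the `fetch` callback are hypothetical names, not functions from this repository, and the real download logic in `download_models.py` may differ.

```python
from pathlib import Path
from typing import Callable


def ensure_weights(path: Path, fetch: Callable[[Path], None]) -> bool:
    """Make sure a weights file exists, fetching it only when it is missing.

    `fetch` is a hypothetical callback that writes the file to `path`
    (for example, an HTTP download). Returns True if a fetch was triggered.
    """
    if path.exists():
        return False  # already present, nothing to do
    # Create the target directory (e.g. saved_models/) if needed.
    path.parent.mkdir(parents=True, exist_ok=True)
    fetch(path)
    return True
```

Passing the download step in as a callback keeps the check itself free of network code, so it can be exercised without internet access.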
Usage
Running the Application
Start the Gradio web interface:
python app.py
The interface will be available at http://127.0.0.1:7860 in your web browser.
Using the Interface
- Select a segmentation method from the tabs at the top
- Upload an image using the file picker
- Adjust algorithm parameters if available
- Click the "Segment this image" button
- View the results in the display area
Algorithm Parameters
Otsu's Method
- No parameters, fully automatic threshold selection
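To illustrate what "fully automatic" means here, the following is a minimal NumPy sketch of Otsu's method, assuming an 8-bit grayscale input. It is not the code in `otsu_segmenter.py`; the function name `otsu_threshold` is chosen for this example only.

```python
import numpy as np


def otsu_threshold(gray: np.ndarray) -> int:
    """Return the threshold t (0..255) that maximizes between-class variance.

    Assumes `gray` is a uint8 array. Pixels <= t form one class,
    pixels > t the other.
    """
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()
    levels = np.arange(256)

    w0 = np.cumsum(prob)                 # weight of the "dark" class for each t
    w1 = 1.0 - w0                        # weight of the "bright" class
    cum_mean = np.cumsum(prob * levels)  # cumulative first moment up to t
    total_mean = cum_mean[-1]

    # Between-class variance for every candidate threshold.
    denom = w0 * w1
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b = (total_mean * w0 - cum_mean) ** 2 / denom
    sigma_b[denom == 0] = 0.0            # thresholds that leave one class empty
    return int(np.argmax(sigma_b))
```

A binary mask then follows from a single comparison, for example `mask = gray > otsu_threshold(gray)`.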
K-means Segmentation
- Number of Clusters (K): Controls how many color groups to segment into
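As a rough picture of how K shapes the result, here is a small NumPy sketch of color clustering with Lloyd's algorithm. It assumes an H x W x C image array and is not the implementation in `kmeans_segmenter.py`; the name `kmeans_segment` and the `iters` and `seed` parameters are illustrative.

```python
import numpy as np


def kmeans_segment(image: np.ndarray, k: int, iters: int = 10, seed: int = 0):
    """Cluster pixel colors into k groups; return (labels HxW, centers kxC)."""
    h, w, c = image.shape
    pixels = image.reshape(-1, c).astype(np.float64)

    # Initialize centers from distinct colors so no two centers coincide.
    unique = np.unique(pixels, axis=0)
    if len(unique) < k:
        raise ValueError("image has fewer distinct colors than clusters")
    rng = np.random.default_rng(seed)
    centers = unique[rng.choice(len(unique), size=k, replace=False)]

    for _ in range(iters):
        # Squared distance from every pixel to every center, shape (N, k).
        d2 = ((pixels[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # Move each center to the mean color of its assigned pixels.
        for j in range(k):
            members = pixels[labels == j]
            if len(members):
                centers[j] = members.mean(axis=0)

    return labels.reshape(h, w), centers
```

The full distance matrix keeps the sketch short but uses memory proportional to N x k x C, so it is only practical for small images.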
SLIC Segmentation
- Number of superpixels: Controls the granularity of segmentation
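- Compactness factor: Balances color similarity against spatial proximity; higher values give more regular, compact superpixels, while lower values follow image boundaries more closely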
- Number of iterations: Controls refinement of superpixel boundaries
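The role of the first two parameters is easiest to see in the distance measure from the original SLIC formulation, which combines a color distance with a spatial distance normalized by the grid interval S = sqrt(N / K). The sketch below assumes that standard formulation; whether `enhanced_kmeans_segmenter.py` uses exactly this form is not confirmed, and `slic_distance` is a name chosen for this example.

```python
import math


def slic_distance(color_a, color_b, pos_a, pos_b,
                  n_pixels, n_superpixels, compactness):
    """Combined color + spatial distance between a pixel and a cluster center."""
    # Expected spacing between superpixel centers on a regular grid.
    s = math.sqrt(n_pixels / n_superpixels)
    d_color = math.dist(color_a, color_b)   # distance in color space
    d_space = math.dist(pos_a, pos_b)       # distance in image coordinates
    # Larger compactness makes spatial distance count for more.
    return math.sqrt(d_color ** 2 + (d_space / s) ** 2 * compactness ** 2)
```

With a large compactness value the spatial term dominates and superpixels stay close to a regular grid; with a small value the color term dominates and superpixels bend to follow edges.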
Felzenszwalb Algorithm
- Sigma: Standard deviation of the Gaussian smoothing applied to the image before segmentation
- K value: Scale of observation; larger values favor larger segments
- Min Size Factor: Minimum component size
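To show how the K value enters the algorithm, here is a compact sketch of the merge rule from the Felzenszwalb-Huttenlocher graph-based method, using a union-find structure over a generic weighted graph. It assumes the standard rule (merge two components when the connecting edge is no heavier than each component's internal difference plus k / size). Smoothing with sigma and the minimum-size post-processing are omitted, and the function name is illustrative rather than taken from `felzenszwalb_segmentation/`.

```python
def felzenszwalb_merge(num_nodes, edges, k):
    """Group graph nodes into components; return a root label per node.

    `edges` is an iterable of (weight, u, v) tuples.
    """
    parent = list(range(num_nodes))
    size = [1] * num_nodes
    internal = [0.0] * num_nodes  # largest merged edge weight per component

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    # Process edges from lightest to heaviest.
    for w, u, v in sorted(edges):
        a, b = find(u), find(v)
        if a == b:
            continue
        # Merge only if the edge is within both adaptive thresholds.
        if w <= min(internal[a] + k / size[a], internal[b] + k / size[b]):
            parent[b] = a
            size[a] += size[b]
            internal[a] = w  # edges are sorted, so w is the new maximum

    return [find(i) for i in range(num_nodes)]
```

Because the threshold term k / size shrinks as a component grows, small components merge easily while large ones need stronger evidence, which is why a larger k tends to produce larger segments.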
Ensemble Segmentation
- Boundary Refinement Weight: Controls influence of classical methods on deep learning boundaries
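One simple way to picture such a weight is as a linear blend between a deep model's per-pixel foreground probability and a classical binary mask. The sketch below is purely hypothetical: `blend_masks`, its arguments, and the 0.5 cutoff are assumptions made for illustration, and the actual logic in `ensemble_method.py` may work quite differently.

```python
import numpy as np


def blend_masks(deep_prob: np.ndarray, classical_mask: np.ndarray,
                weight: float) -> np.ndarray:
    """Blend a deep foreground probability map with a classical binary mask.

    weight = 0 uses only the deep model; weight = 1 uses only the
    classical mask. Returns a boolean foreground mask.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be between 0 and 1")
    blended = (1.0 - weight) * deep_prob + weight * classical_mask.astype(np.float64)
    # Hypothetical decision rule: foreground where the blend reaches 0.5.
    return blended >= 0.5
```

In this toy formulation, raising the weight lets the classical mask overrule the deep model at pixels where the model is uncertain.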
Project Structure
CSL7360_Project/
├── app.py                            # Main application with pretrained models
├── experiments/                      # Implementation of segmentation algorithms
│   ├── ensemble_method.py            # Ensemble segmentation implementation
│   ├── felzenszwalb_segmentation/    # Felzenszwalb algorithm implementation
│   ├── kmeans_segmenter.py           # K-means segmentation implementation
│   ├── enhanced_kmeans_segmenter.py  # SLIC implementation
│   ├── otsu_segmenter.py             # Otsu thresholding implementation
│   ├── watershed_segmenter.py        # Watershed algorithm implementation
│   └── SegNet/                       # Deep learning models
│       ├── efficient_b0_backbone/    # EfficientNet backbone for SegNet
│       └── vgg_backbone/             # VGG backbone for SegNet
├── saved_models/                     # Directory for pretrained weights
└── requirements.txt                  # Package dependencies
Examples
The application works well on a variety of images:
- Natural scenes
- Urban environments
- Medical images
- Aerial/satellite imagery
- Objects with clear boundaries
Technologies Used
- PyTorch: Deep learning framework
- OpenCV: Classical computer vision algorithms
- NumPy: Numerical computations
- PIL/Pillow: Image loading and manipulation
- Gradio: Interactive web interface
- Matplotlib: Visualization of results
Credits
- Built as part of the CSL7360 course project
- Uses models pretrained on the Pascal VOC and CamVid datasets
- Implements algorithms from classical computer vision literature
License
This project is available under the MIT License.