Biswajeet1 committed on
Commit b80da27 · verified · 1 Parent(s): 0302518

Update README.md

Files changed (1)
  1. README.md +9 -140
README.md CHANGED
@@ -1,140 +1,9 @@
- # VisionExtract: AI-Powered Subject Isolation System
-
- ![VisionExtract Banner](docs/images/banner.png)
-
- [![Python 3.10+](https://img.shields.io/badge/Python-3.10+-3776AB?style=for-the-badge&logo=python&logoColor=white)](https://www.python.org/)
- [![PyTorch](https://img.shields.io/badge/PyTorch-EE4C2C?style=for-the-badge&logo=pytorch&logoColor=white)](https://pytorch.org/)
- [![Streamlit](https://img.shields.io/badge/Streamlit-FF4B4B?style=for-the-badge&logo=Streamlit&logoColor=white)](https://streamlit.io/)
- [![OpenCV](https://img.shields.io/badge/OpenCV-5C3EE8?style=for-the-badge&logo=opencv&logoColor=white)](https://opencv.org/)
-
- ## 🎯 Project Overview
-
- **VisionExtract** is a specialized machine learning solution designed to automatically detect and extract the main subject from any given image. Built for professional automation, the system isolates the foreground subject and renders the background pixels as complete black, creating a high-fidelity "cutout" for use in digital art, photography, and augmented reality.
-
- ### 📝 Project Statement
- > "The goal of this project is to build a machine learning model capable of automatically extracting the main subject from an image. The output is a new image where only the subject is displayed as in the original photo, while the rest of the pixels are set to black."
-
- ---
-
- ## 🚀 Key Features
-
- * **⚡ Automated Subject Isolation**: Intelligent detection and extraction of primary subjects across diverse categories.
- * **🧩 Aspect-Ratio Awareness**: Advanced preprocessing using **LongestMaxSize** to ensure subjects maintain their natural proportions without distortion.
- * **🔄 Virtual Background Integration**: Real-world application of isolation technology allowing real-time subject matting onto Office, Nature, and Studio environments.
- * **🖼️ High-Fidelity Alpha Blending**: Smooth, anti-aliased edge transitions for professional-grade matting.
- * **📊 Production Dashboard**: A premium Streamlit-based interface featuring real-time performance metrics and batch processing capabilities.
-
- ---
-
- ## 🛠️ Technical Stack
-
- * **Architecture**: ResNet34-UNet (Transfer Learning)
- * **Framework**: PyTorch
- * **Preprocessing**: Albumentations (Standardized Evaluation Pipeline)
- * **Frontend**: Streamlit (AI Showcase Dashboard)
- * **Acceleration**: CUDA Support with AMP (Automatic Mixed Precision)
-
- ---
-
- ## 📖 Implementation Workflow
-
- 1. **Architecture**: Utilizes a deep **UNet** structure with a pre-trained **ResNet34 backbone** for high-precision spatial feature extraction.
- 2. **Training**: Optimized over **110 epochs** using **IoU-based checkpointing** to ensure the most accurate weights.
- 3. **Inference**: A standardized 256px resolution pipeline ensures architectural consistency and sub-second processing speeds.
-
- ---
-
- ## 📉 Performance Benchmarks
-
- Following a **110-epoch training cycle** (including a 10-epoch **Refinement Phase** at an optimized learning rate of `0.00005`), the model achieved the following benchmarks:
-
- | Metric | Achievement |
- | :--- | :--- |
- | **Model Architecture** | **ResNet34-UNet** |
- | **Mean IoU** | **0.64+** |
- | **Dice Score** | **0.78+** |
- | **Inference Time** | **~0.15s (GPU accelerated)** |
-
- ### 📊 Model Comparison
- | Model | IoU |
- | :--- | :--- |
- | **UNet** | **0.47** |
- | **ResNetUNet** | **0.62** |
-
- ---
-
- ## 🖼️ Visual Results (Gallery)
-
- The following samples from the `outputs/` folder demonstrate the final refined output:
-
- | Input Image | Isolated Subject (VisionExtract) |
- | :---: | :---: |
- | ![Input 1](outputs/Input1.jpg) | ![Output 1](outputs/Output1.jpg) |
- | ![Input 2](outputs/Input2.jpg) | ![Output 2](outputs/Output2.jpg) |
- | ![Input 3](outputs/Input3.jpg) | ![Output 3](outputs/Output3.jpg) |
-
- ---
-
- ## 📂 Project Structure
-
- ```text
- VisionExtract/
- ├── src/ # Production Logic (Model, Training, Inference, App)
- ├── outputs/ # Sample Results (Inputs & Predicted Cutouts)
- ├── docs/ # Project Assets (Banners, Backgrounds, Documentation)
- ├── checkpoints/ # Trained Model Weights (.pth)
- ├── requirements.txt # Dependency Configuration
- └── README.md # Technical Overview
- ```
-
- ---
-
- ## 🏃 Getting Started
-
- ### 1. Environment Setup
- ```bash
- git clone https://github.com/biswajeet111/VisionExtract.git
- cd VisionExtract
- python -m venv venv
- venv\Scripts\activate
- pip install -r requirements.txt
- ```
-
- ### 2. Launching the Web Showcase
- Experience the real-time extraction engine and background switcher.
- ```bash
- streamlit run src/app.py
- ```
-
- ### 3. Command Line Interface (CLI)
- ```bash
- # Single Image Processing
- python src/inference.py --image path/to/sample.jpg --display
- ```
-
- ### 4. Model Training & Refinement
- ```bash
- # Full Training Cycle
- python src/train.py
- ```
-
- ---
-
- ## ⚠️ Limitations
-
- While VisionExtract is highly effective, it has certain constraints:
- - **Struggles on extremely crowded scenes**: Multiple overlapping subjects can lead to merged or incomplete masks.
- - **High-resolution increases inference time**: Processing images significantly larger than the base 256px resolution requires more VRAM and compute.
- - **Small object segmentation may vary**: Tiny details (like thin strands of hair or distant objects) may be smoothed out during upscale.
-
- ---
-
- ## 👤 Author
-
- **Biswajeet Kumar**
- * **Portfolio**: [GitHub](https://github.com/biswajeet111)
- * **Connect**: [LinkedIn](https://www.linkedin.com/in/biswajeet-kumar-a70043362)
-
- ---
-
- *Developed as a high-performance solution for Automated Subject Isolation and AI Segmentation.*
+ ---
+ title: VisionExtract AI
+ emoji: 🧠
+ colorFrom: blue
+ colorTo: purple
+ sdk: docker
+ app_file: app.py
+ pinned: false
+ ---
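
The nine added lines form a YAML front-matter block of the kind Hugging Face Spaces reads from the top of README.md. As a rough illustration of how such a block can be inspected, the sketch below parses it with a small hand-rolled helper. `parse_front_matter` is a hypothetical name introduced here, not part of the repository, and it only handles flat `key: value` lines, so every value comes back as a string. One point worth checking against the Spaces configuration reference: with `sdk: docker`, the Docker-specific setting appears to be `app_port`, so `app_file` may have no effect in this configuration.

```python
# Minimal sketch: read the flat key/value front matter added in this commit.
# parse_front_matter is a hypothetical helper, not part of the repository.

README_TEXT = """---
title: VisionExtract AI
emoji: 🧠
colorFrom: blue
colorTo: purple
sdk: docker
app_file: app.py
pinned: false
---
"""


def parse_front_matter(text: str) -> dict[str, str]:
    """Return the key/value pairs between the first two '---' lines.

    Handles only flat 'key: value' lines; nested YAML is out of scope.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    config: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return config
        key, sep, value = line.partition(":")
        if sep:
            config[key.strip()] = value.strip()
    # No closing delimiter found: treat the text as having no front matter.
    return {}


config = parse_front_matter(README_TEXT)
print(config["sdk"], config["app_file"])
```

A real YAML parser would be the better choice for anything beyond a quick check, since this helper does not interpret booleans, quoting, or nested keys.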