---
title: PupilSense
emoji: πŸ‘οΈ
colorFrom: red
colorTo: pink
sdk: gradio
sdk_version: 5.38.2
app_file: app.py
pinned: false
---

# πŸ‘οΈ PupilSense πŸ‘οΈπŸ•΅οΈβ€β™‚οΈ

PupilSense is a deep learning-powered application for estimating pupil diameter from images and videos. It uses trained ResNet models with Class Activation Mapping (CAM) for interpretable predictions.

## Features

- **Image Processing**: Upload images to get instant pupil diameter estimates
- **Video Processing**: Analyze videos frame-by-frame for temporal pupil diameter analysis
- **Model Selection**: Choose between ResNet18 and ResNet50 architectures
- **Pupil Selection**: Analyze left pupil, right pupil, or both
- **Blink Detection**: Automatically detect and handle blinks in the analysis
- **CAM Visualization**: See which parts of the eye the model focuses on for predictions
- **API Access**: Full Gradio API support for programmatic access
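
Blink detection in eye-tracking pipelines is commonly based on the Eye Aspect Ratio (EAR) computed from eye landmarks. The app's exact method is not documented here, so the following is only an illustrative, dependency-free sketch of that common heuristic (function names and the threshold are assumptions):

```python
import math

def eye_aspect_ratio(landmarks):
    """EAR from six (x, y) eye landmarks ordered p1..p6 as in the
    classic formulation: p1/p4 are the eye corners, (p2, p6) and
    (p3, p5) are the vertical landmark pairs."""
    p1, p2, p3, p4, p5, p6 = landmarks

    def dist(a, b):
        return math.hypot(a[0] - b[0], a[1] - b[1])

    vertical = dist(p2, p6) + dist(p3, p5)
    horizontal = dist(p1, p4)
    return vertical / (2.0 * horizontal)

def is_blink(landmarks, threshold=0.2):
    # Below roughly 0.2 the eye is usually closed; tune per dataset.
    return eye_aspect_ratio(landmarks) < threshold

# Open eye: large vertical gaps relative to eye width
open_eye = [(0, 0), (1, 1), (2, 1), (3, 0), (2, -1), (1, -1)]
# Closed eye: nearly flat row of landmarks
closed_eye = [(0, 0), (1, 0.05), (2, 0.05), (3, 0), (2, -0.05), (1, -0.05)]
```

Frames flagged as blinks can then be skipped or interpolated over, so they do not distort the diameter time series.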

## Usage

### Web Interface
Simply upload an image or video file and configure your analysis parameters:
- Select pupil(s) to analyze (left, right, or both)
- Choose the model architecture (ResNet18 or ResNet50)
- Enable/disable blink detection
- Click process to get results

### API Access
The Gradio interface exposes its endpoints automatically. Open the "Use via API" link in the app's footer to see the generated API documentation, including each endpoint's name and argument order.

Example API usage with the `gradio_client` package (the Space URL is a placeholder, and the argument order and `api_name` value must match what the app's "Use via API" page shows):
```python
from gradio_client import Client, handle_file

# Point the client at the deployed Space (URL is a placeholder)
client = Client("https://your-space-url")

# Image processing; argument order must match the app's API page
result = client.predict(
    handle_file("your_image.jpg"),  # image_input
    "both",                         # pupil_selection
    "ResNet18",                     # tv_model
    True,                           # blink_detection
    api_name="/predict",
)
print(result)
```

## Model Information

The application uses pre-trained ResNet models specifically trained for pupil diameter estimation:
- **ResNet18**: Faster inference, good accuracy
- **ResNet50**: Higher accuracy, slower inference

Both models support:
- Input resolution: 32x64 pixels (eye region)
- Output: Pupil diameter in millimeters
- CAM visualization for model interpretability
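
To make the input convention concrete, here is a dependency-free sketch that maps a grayscale eye crop (a list of pixel rows) onto the fixed 32x64 grid with nearest-neighbor sampling and scales values to [0, 1]. The real pipeline presumably uses library resizing (e.g. OpenCV or torchvision); this only illustrates the shape contract:

```python
def to_model_input(crop, out_h=32, out_w=64):
    """Nearest-neighbor resize of a grayscale crop (rows of 0-255
    ints) to out_h x out_w, with values scaled to floats in [0, 1]."""
    in_h, in_w = len(crop), len(crop[0])
    resized = []
    for r in range(out_h):
        src_r = min(in_h - 1, r * in_h // out_h)
        row = []
        for c in range(out_w):
            src_c = min(in_w - 1, c * in_w // out_w)
            row.append(crop[src_r][src_c] / 255.0)
        resized.append(row)
    return resized

# Any crop size maps to the fixed 32x64 model input
crop = [[128] * 90 for _ in range(50)]
x = to_model_input(crop)
```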

## Technical Details

- **Face Detection**: MediaPipe for robust face and eye detection
- **Preprocessing**: Automatic eye region extraction and normalization
- **Deep Learning**: PyTorch-based ResNet models
- **Visualization**: Matplotlib for result plotting and CAM overlays
- **Video Support**: Frame-by-frame analysis with temporal plotting

## Installation & Setup

### Local Development

1. **Clone the repository**
```bash
git clone <repository-url>
cd pupilsense
```

2. **Create virtual environment**
```bash
python3 -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
```

3. **Install dependencies**
```bash
pip install -r requirements.txt
```

4. **Run the application**
```bash
python app.py
```

The app will be available at `http://localhost:7860`.

### Hugging Face Spaces Deployment

1. **Create a new Space** on Hugging Face with Gradio SDK
2. **Upload all files** from the pupilsense directory
3. **Ensure the following files are present:**
   - `app.py` (main application file)
   - `gradio_app.py` (Gradio interface)
   - `gradio_utils.py` (utility functions)
   - `requirements.txt` (dependencies)
   - `README.md` (this file with proper YAML header)
   - `pre_trained_models/` (model files)
   - All other supporting files

## Known Issues & Troubleshooting

### MediaPipe Issues
- **Issue**: Segmentation fault or MediaPipe errors in headless environments
- **Solution**: The app includes error handling for MediaPipe failures. In production environments, ensure proper GPU/display drivers are available.

### Model Loading
- **Issue**: Model files not found
- **Solution**: Ensure `pre_trained_models/` directory contains the required `.pt` files for both ResNet18 and ResNet50 models.

### Memory Usage
- **Issue**: High memory usage with large videos
- **Solution**: The app automatically resizes frames to 640x480 to manage memory usage.
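
The resize behavior can be sketched as an aspect-preserving fit into a 640x480 bounding box. The app's exact resize logic may differ; the helper below is illustrative:

```python
def fit_within(width, height, max_w=640, max_h=480):
    """Scale (width, height) down to fit inside max_w x max_h,
    preserving aspect ratio; never upscales."""
    scale = min(max_w / width, max_h / height, 1.0)
    return round(width * scale), round(height * scale)

# A 1920x1080 frame fits as 640x360; smaller frames pass through unchanged
```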

## File Structure

```
pupilsense/
β”œβ”€β”€ app.py                 # Main application entry point
β”œβ”€β”€ gradio_app.py         # Gradio interface definition
β”œβ”€β”€ gradio_utils.py       # Utility functions (MediaPipe-free)
β”œβ”€β”€ app_utils.py          # Original Streamlit utilities (legacy)
β”œβ”€β”€ requirements.txt      # Python dependencies
β”œβ”€β”€ README.md            # This file
β”œβ”€β”€ config.yml           # Configuration file
β”œβ”€β”€ registry.py          # Model registry
β”œβ”€β”€ registry_utils.py    # Registry utilities
β”œβ”€β”€ utils.py             # General utilities
β”œβ”€β”€ pre_trained_models/  # Trained model files
β”‚   β”œβ”€β”€ ResNet18/
β”‚   β”‚   β”œβ”€β”€ left_eye.pt
β”‚   β”‚   └── right_eye.pt
β”‚   └── ResNet50/
β”‚       β”œβ”€β”€ left_eye.pt
β”‚       └── right_eye.pt
β”œβ”€β”€ preprocessing/       # Data preprocessing modules
β”œβ”€β”€ feature_extraction/  # Feature extraction modules
β”œβ”€β”€ registrations/       # Model registration modules
└── sample_videos/       # Sample video files
```

## Contributing

1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Test thoroughly
5. Submit a pull request

## License

See LICENSE file for details.

---

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference