---
title: PupilSense
emoji: πŸ‘οΈ
colorFrom: red
colorTo: pink
sdk: gradio
sdk_version: 5.38.2
app_file: app.py
pinned: false
---

# πŸ‘οΈ PupilSense πŸ‘οΈπŸ•΅οΈβ€β™‚οΈ

PupilSense is a deep learning-powered application for estimating pupil diameter from images and videos. It uses trained ResNet models with Class Activation Mapping (CAM) for interpretable predictions.

## Features

- **Image Processing**: Upload images to get instant pupil diameter estimates
- **Video Processing**: Analyze videos frame-by-frame for temporal pupil diameter analysis
- **Model Selection**: Choose between ResNet18 and ResNet50 architectures
- **Pupil Selection**: Analyze left pupil, right pupil, or both
- **Blink Detection**: Automatically detect and handle blinks in the analysis
- **CAM Visualization**: See which parts of the eye the model focuses on for predictions
- **API Access**: Full Gradio API support for programmatic access
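
Blink detection in eye-tracking pipelines is commonly based on the Eye Aspect Ratio (EAR) computed from eye landmarks. The app's exact method is not documented here, so the following is only an illustrative, dependency-free sketch of that common heuristic (function names and the threshold are assumptions):

```python
import math

def eye_aspect_ratio(landmarks):
    """EAR from six (x, y) eye landmarks ordered p1..p6 as in the
    classic formulation: p1/p4 are the eye corners, (p2, p6) and
    (p3, p5) are the vertical landmark pairs."""
    p1, p2, p3, p4, p5, p6 = landmarks

    def dist(a, b):
        return math.hypot(a[0] - b[0], a[1] - b[1])

    vertical = dist(p2, p6) + dist(p3, p5)
    horizontal = dist(p1, p4)
    return vertical / (2.0 * horizontal)

def is_blink(landmarks, threshold=0.2):
    # Below roughly 0.2 the eye is usually closed; tune per dataset.
    return eye_aspect_ratio(landmarks) < threshold

# Open eye: large vertical gaps relative to eye width
open_eye = [(0, 0), (1, 1), (2, 1), (3, 0), (2, -1), (1, -1)]
# Closed eye: nearly flat row of landmarks
closed_eye = [(0, 0), (1, 0.05), (2, 0.05), (3, 0), (2, -0.05), (1, -0.05)]
```

Frames flagged as blinks can then be skipped or interpolated over, so they do not distort the diameter time series.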

## Usage

### Web Interface
Simply upload an image or video file and configure your analysis parameters:
- Select pupil(s) to analyze (left, right, or both)
- Choose the model architecture (ResNet18 or ResNet50)
- Enable/disable blink detection
- Click process to get results

### API Access
The Gradio interface exposes its endpoints automatically. Open the "Use via API" link in the app's footer to see the generated API documentation, including each endpoint's name and argument order.

Example API usage with the `gradio_client` package (the Space URL is a placeholder, and the argument order and `api_name` value must match what the app's "Use via API" page shows):
```python
from gradio_client import Client, handle_file

# Point the client at the deployed Space (URL is a placeholder)
client = Client("https://your-space-url")

# Image processing; argument order must match the app's API page
result = client.predict(
    handle_file("your_image.jpg"),  # image_input
    "both",                         # pupil_selection
    "ResNet18",                     # tv_model
    True,                           # blink_detection
    api_name="/predict",
)
print(result)
```

## Model Information

The application uses pre-trained ResNet models specifically trained for pupil diameter estimation:
- **ResNet18**: Faster inference, good accuracy
- **ResNet50**: Higher accuracy, slower inference

Both models support:
- Input resolution: 32x64 pixels (eye region)
- Output: Pupil diameter in millimeters
- CAM visualization for model interpretability
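
To make the input convention concrete, here is a dependency-free sketch that maps a grayscale eye crop (a list of pixel rows) onto the fixed 32x64 grid with nearest-neighbor sampling and scales values to [0, 1]. The real pipeline presumably uses library resizing (e.g. OpenCV or torchvision); this only illustrates the shape contract:

```python
def to_model_input(crop, out_h=32, out_w=64):
    """Nearest-neighbor resize of a grayscale crop (rows of 0-255
    ints) to out_h x out_w, with values scaled to floats in [0, 1]."""
    in_h, in_w = len(crop), len(crop[0])
    resized = []
    for r in range(out_h):
        src_r = min(in_h - 1, r * in_h // out_h)
        row = []
        for c in range(out_w):
            src_c = min(in_w - 1, c * in_w // out_w)
            row.append(crop[src_r][src_c] / 255.0)
        resized.append(row)
    return resized

# Any crop size maps to the fixed 32x64 model input
crop = [[128] * 90 for _ in range(50)]
x = to_model_input(crop)
```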

## Technical Details

- **Face Detection**: MediaPipe for robust face and eye detection
- **Preprocessing**: Automatic eye region extraction and normalization
- **Deep Learning**: PyTorch-based ResNet models
- **Visualization**: Matplotlib for result plotting and CAM overlays
- **Video Support**: Frame-by-frame analysis with temporal plotting

## Installation & Setup

### Local Development

1. **Clone the repository**
```bash
git clone <repository-url>
cd pupilsense
```

2. **Create virtual environment**
```bash
python3 -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
```

3. **Install dependencies**
```bash
pip install -r requirements.txt
```

4. **Run the application**
```bash
python app.py
```

The app will be available at `http://localhost:7860`.

### Hugging Face Spaces Deployment

1. **Create a new Space** on Hugging Face with Gradio SDK
2. **Upload all files** from the pupilsense directory
3. **Ensure the following files are present:**
   - `app.py` (main application file)
   - `gradio_app.py` (Gradio interface)
   - `gradio_utils.py` (utility functions)
   - `requirements.txt` (dependencies)
   - `README.md` (this file with proper YAML header)
   - `pre_trained_models/` (model files)
   - All other supporting files

## Known Issues & Troubleshooting

### MediaPipe Issues
- **Issue**: Segmentation fault or MediaPipe errors in headless environments
- **Solution**: The app includes error handling for MediaPipe failures. In production environments, ensure proper GPU/display drivers are available.

### Model Loading
- **Issue**: Model files not found
- **Solution**: Ensure `pre_trained_models/` directory contains the required `.pt` files for both ResNet18 and ResNet50 models.

### Memory Usage
- **Issue**: High memory usage with large videos
- **Solution**: The app automatically resizes frames to 640x480 to manage memory usage.
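
The resize behavior can be sketched as an aspect-preserving fit into a 640x480 bounding box. The app's exact resize logic may differ; the helper below is illustrative:

```python
def fit_within(width, height, max_w=640, max_h=480):
    """Scale (width, height) down to fit inside max_w x max_h,
    preserving aspect ratio; never upscales."""
    scale = min(max_w / width, max_h / height, 1.0)
    return round(width * scale), round(height * scale)

# A 1920x1080 frame fits as 640x360; smaller frames pass through unchanged
```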

## File Structure

```
pupilsense/
β”œβ”€β”€ app.py                 # Main application entry point
β”œβ”€β”€ gradio_app.py         # Gradio interface definition
β”œβ”€β”€ gradio_utils.py       # Utility functions (MediaPipe-free)
β”œβ”€β”€ app_utils.py          # Original Streamlit utilities (legacy)
β”œβ”€β”€ requirements.txt      # Python dependencies
β”œβ”€β”€ README.md            # This file
β”œβ”€β”€ config.yml           # Configuration file
β”œβ”€β”€ registry.py          # Model registry
β”œβ”€β”€ registry_utils.py    # Registry utilities
β”œβ”€β”€ utils.py             # General utilities
β”œβ”€β”€ pre_trained_models/  # Trained model files
β”‚   β”œβ”€β”€ ResNet18/
β”‚   β”‚   β”œβ”€β”€ left_eye.pt
β”‚   β”‚   └── right_eye.pt
β”‚   └── ResNet50/
β”‚       β”œβ”€β”€ left_eye.pt
β”‚       └── right_eye.pt
β”œβ”€β”€ preprocessing/       # Data preprocessing modules
β”œβ”€β”€ feature_extraction/  # Feature extraction modules
β”œβ”€β”€ registrations/       # Model registration modules
└── sample_videos/       # Sample video files
```

## Contributing

1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Test thoroughly
5. Submit a pull request

## License

See LICENSE file for details.

---

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference