---
title: PupilSense
emoji: πŸ‘οΈ
colorFrom: red
colorTo: pink
sdk: gradio
sdk_version: 5.38.2
app_file: app.py
pinned: false
---
# πŸ‘οΈ PupilSense πŸ‘οΈπŸ•΅οΈβ€β™‚οΈ
PupilSense is a deep learning-powered application for estimating pupil diameter from images and videos. It uses trained ResNet models with Class Activation Mapping (CAM) for interpretable predictions.
## Features
- **Image Processing**: Upload images to get instant pupil diameter estimates
- **Video Processing**: Analyze videos frame-by-frame for temporal pupil diameter analysis
- **Model Selection**: Choose between ResNet18 and ResNet50 architectures
- **Pupil Selection**: Analyze left pupil, right pupil, or both
- **Blink Detection**: Automatically detect and handle blinks in the analysis
- **CAM Visualization**: See which parts of the eye the model focuses on for predictions
- **API Access**: Full Gradio API support for programmatic access
## Usage
### Web Interface
Simply upload an image or video file and configure your analysis parameters:
- Select pupil(s) to analyze (left, right, or both)
- Choose the model architecture (ResNet18 or ResNet50)
- Enable/disable blink detection
- Click process to get results
### API Access
The Gradio interface provides automatic API endpoints. With the app running, open the "Use via API" link in the footer of the interface to see the available endpoints and their parameters.
Example API usage:
```python
# Sketch using the `gradio_client` package, the recommended client for
# Gradio 5 Spaces. The endpoint and parameter names below are assumptions;
# confirm them on the "Use via API" page of your running Space.
from gradio_client import Client, handle_file

client = Client("https://your-space-url")  # or "username/space-name"

# Image processing
result = client.predict(
    image_input=handle_file("your_image.jpg"),
    pupil_selection="both",
    tv_model="ResNet18",
    blink_detection=True,
    api_name="/predict",
)
print(result)
```
## Model Information
The application uses pre-trained ResNet models specifically trained for pupil diameter estimation:
- **ResNet18**: Faster inference, good accuracy
- **ResNet50**: Higher accuracy, slower inference
Both models support:
- Input resolution: 32x64 pixels (eye region)
- Output: Pupil diameter in millimeters
- CAM visualization for model interpretability
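The 32x64 input requirement means every eye crop has to be resized and normalized before inference. As a rough illustration of that step (not the app's actual preprocessing, which may use a different interpolation and normalization), a minimal NumPy sketch:

```python
import numpy as np

def prepare_eye_crop(crop: np.ndarray, out_h: int = 32, out_w: int = 64) -> np.ndarray:
    """Nearest-neighbour resize of an eye crop to the 32x64 model input,
    scaled to [0, 1]. Illustrative only: the real pipeline may use
    bilinear interpolation and mean/std normalisation instead."""
    h, w = crop.shape[:2]
    rows = np.arange(out_h) * h // out_h  # source row for each output row
    cols = np.arange(out_w) * w // out_w  # source column for each output column
    resized = crop[rows][:, cols]
    return resized.astype(np.float32) / 255.0

# Example: a fake 48x96 RGB eye crop becomes a (32, 64, 3) float array
eye = (np.random.rand(48, 96, 3) * 255).astype(np.uint8)
x = prepare_eye_crop(eye)
print(x.shape)
```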
## Technical Details
- **Face Detection**: MediaPipe for robust face and eye detection
- **Preprocessing**: Automatic eye region extraction and normalization
- **Deep Learning**: PyTorch-based ResNet models
- **Visualization**: Matplotlib for result plotting and CAM overlays
- **Video Support**: Frame-by-frame analysis with temporal plotting
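The CAM overlays mentioned above boil down to a weighted sum of the last convolutional layer's feature maps, weighted by the head's weights for the predicted output. A minimal NumPy sketch of that idea (shapes are assumed; this is not the app's actual implementation):

```python
import numpy as np

def class_activation_map(features: np.ndarray, head_weights: np.ndarray) -> np.ndarray:
    """features: (C, H, W) activations of the last conv layer;
    head_weights: (C,) weights of the regression head.
    Returns an (H, W) activation map normalised to [0, 1]."""
    # Weighted sum over the channel axis -> one spatial map
    cam = np.tensordot(head_weights, features, axes=([0], [0]))
    cam -= cam.min()
    if cam.max() > 0:
        cam /= cam.max()
    return cam
```

In the app the resulting map would be upsampled to the eye-crop size and blended over the input image with Matplotlib.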
## Installation & Setup
### Local Development
1. **Clone the repository**
```bash
git clone <repository-url>
cd pupilsense
```
2. **Create virtual environment**
```bash
python3 -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
```
3. **Install dependencies**
```bash
pip install -r requirements.txt
```
4. **Run the application**
```bash
python app.py
```
The app will be available at `http://localhost:7860`
### Hugging Face Spaces Deployment
1. **Create a new Space** on Hugging Face with Gradio SDK
2. **Upload all files** from the pupilsense directory
3. **Ensure the following files are present:**
- `app.py` (main application file)
- `gradio_app.py` (Gradio interface)
- `gradio_utils.py` (utility functions)
- `requirements.txt` (dependencies)
- `README.md` (this file with proper YAML header)
- `pre_trained_models/` (model files)
- All other supporting files
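As a rough guide, a `requirements.txt` for the stack described above might list the following (unpinned here for illustration; pin versions that match your environment and the `sdk_version` in the YAML header):

```text
gradio
torch
torchvision
mediapipe
opencv-python
numpy
matplotlib
```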
## Known Issues & Troubleshooting
### MediaPipe Issues
- **Issue**: Segmentation fault or MediaPipe errors in headless environments
- **Solution**: The app includes error handling for MediaPipe failures. In production environments, ensure proper GPU/display drivers are available.
### Model Loading
- **Issue**: Model files not found
- **Solution**: Ensure `pre_trained_models/` directory contains the required `.pt` files for both ResNet18 and ResNet50 models.
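A quick way to verify the expected layout (it mirrors the File Structure section of this README) is a small check script; the paths assume the directory names used in this repo:

```python
from pathlib import Path

# The four model files the app expects, per the File Structure section
REQUIRED = [
    Path("pre_trained_models") / arch / f"{eye}.pt"
    for arch in ("ResNet18", "ResNet50")
    for eye in ("left_eye", "right_eye")
]

missing = [p for p in REQUIRED if not p.exists()]
if missing:
    print("Missing model files:", ", ".join(map(str, missing)))
else:
    print("All model files present.")
```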
### Memory Usage
- **Issue**: High memory usage with large videos
- **Solution**: The app automatically resizes frames to 640x480 to manage memory usage.
## File Structure
```
pupilsense/
β”œβ”€β”€ app.py                 # Main application entry point
β”œβ”€β”€ gradio_app.py          # Gradio interface definition
β”œβ”€β”€ gradio_utils.py        # Utility functions (MediaPipe-free)
β”œβ”€β”€ app_utils.py           # Original Streamlit utilities (legacy)
β”œβ”€β”€ requirements.txt       # Python dependencies
β”œβ”€β”€ README.md              # This file
β”œβ”€β”€ config.yml             # Configuration file
β”œβ”€β”€ registry.py            # Model registry
β”œβ”€β”€ registry_utils.py      # Registry utilities
β”œβ”€β”€ utils.py               # General utilities
β”œβ”€β”€ pre_trained_models/    # Trained model files
β”‚   β”œβ”€β”€ ResNet18/
β”‚   β”‚   β”œβ”€β”€ left_eye.pt
β”‚   β”‚   └── right_eye.pt
β”‚   └── ResNet50/
β”‚       β”œβ”€β”€ left_eye.pt
β”‚       └── right_eye.pt
β”œβ”€β”€ preprocessing/         # Data preprocessing modules
β”œβ”€β”€ feature_extraction/    # Feature extraction modules
β”œβ”€β”€ registrations/         # Model registration modules
└── sample_videos/         # Sample video files
```
## Contributing
1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Test thoroughly
5. Submit a pull request
## License
See LICENSE file for details.
---
Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference