---
title: PupilSense
emoji: 👁️
colorFrom: red
colorTo: pink
sdk: gradio
sdk_version: 5.38.2
app_file: app.py
pinned: false
---

πŸ‘οΈ PupilSense πŸ‘οΈπŸ•΅οΈβ€β™‚οΈ

PupilSense is a deep learning-powered application for estimating pupil diameter from images and videos. It uses trained ResNet models with Class Activation Mapping (CAM) for interpretable predictions.

## Features

- **Image Processing**: Upload images to get instant pupil diameter estimates
- **Video Processing**: Analyze videos frame by frame for temporal pupil diameter analysis
- **Model Selection**: Choose between ResNet18 and ResNet50 architectures
- **Pupil Selection**: Analyze the left pupil, the right pupil, or both
- **Blink Detection**: Automatically detect and handle blinks in the analysis
- **CAM Visualization**: See which parts of the eye the model focuses on for its predictions
- **API Access**: Full Gradio API support for programmatic access

## Usage

### Web Interface

Simply upload an image or video file and configure your analysis parameters:

- Select the pupil(s) to analyze (left, right, or both)
- Choose the model architecture (ResNet18 or ResNet50)
- Enable or disable blink detection
- Click **Process** to get results
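The repository's blink-detection internals aren't documented here; a common technique for flagging blink frames is the eye aspect ratio (EAR) computed over six eye landmarks. The sketch below is an illustration of that general approach, not necessarily what PupilSense implements:

```python
import math

def eye_aspect_ratio(landmarks):
    """EAR from six (x, y) eye landmarks p1..p6, in the classic ordering:
    EAR = (|p2-p6| + |p3-p5|) / (2 * |p1-p4|).
    Values drop toward zero as the eyelid closes."""
    def dist(a, b):
        return math.hypot(a[0] - b[0], a[1] - b[1])
    p1, p2, p3, p4, p5, p6 = landmarks
    return (dist(p2, p6) + dist(p3, p5)) / (2.0 * dist(p1, p4))

def is_blink(landmarks, threshold=0.2):
    # A frame is flagged as a blink when EAR falls below a
    # threshold; ~0.2 is a commonly used starting point.
    return eye_aspect_ratio(landmarks) < threshold
```

Frames flagged this way can then be excluded from the diameter estimates rather than producing spurious readings.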

### API Access

The Gradio interface provides automatic API endpoints. You can access the API documentation at `/docs` when the app is running.

Example API usage:

```python
import requests

# Image processing: field names match the Gradio inputs described above
with open("your_image.jpg", "rb") as f:
    response = requests.post(
        "https://your-space-url/api/predict",
        files={"image_input": f},
        data={
            "pupil_selection": "both",
            "tv_model": "ResNet18",
            "blink_detection": True,
        },
    )
print(response.json())
```
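The accepted values for each field (per the options listed above) can be validated before sending a request. A small helper sketch follows; the field names mirror the example and are an assumption about the Space's API, not a verified schema:

```python
# Option sets taken from the Usage and Model Information sections.
VALID_PUPILS = {"left", "right", "both"}
VALID_MODELS = {"ResNet18", "ResNet50"}

def build_form_data(pupil_selection="both", tv_model="ResNet18", blink_detection=True):
    """Assemble the form fields used in the request above, rejecting
    values the interface does not offer."""
    if pupil_selection not in VALID_PUPILS:
        raise ValueError(f"pupil_selection must be one of {sorted(VALID_PUPILS)}")
    if tv_model not in VALID_MODELS:
        raise ValueError(f"tv_model must be one of {sorted(VALID_MODELS)}")
    return {
        "pupil_selection": pupil_selection,
        "tv_model": tv_model,
        "blink_detection": blink_detection,
    }
```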

## Model Information

The application uses pre-trained ResNet models specifically trained for pupil diameter estimation:

- **ResNet18**: Faster inference, good accuracy
- **ResNet50**: Higher accuracy, slower inference

Both models support:

- Input resolution: 32×64 pixels (eye region)
- Output: Pupil diameter in millimeters
- CAM visualization for model interpretability

## Technical Details

- **Face Detection**: MediaPipe for robust face and eye detection
- **Preprocessing**: Automatic eye-region extraction and normalization
- **Deep Learning**: PyTorch-based ResNet models
- **Visualization**: Matplotlib for result plotting and CAM overlays
- **Video Support**: Frame-by-frame analysis with temporal plotting
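For video input, the per-frame diameter series usually needs light post-processing before temporal plotting, since blink frames leave gaps. The sketch below is a generic smoothing pass, not the repository's own plotting code; it treats `None` as a skipped (e.g. blink) frame:

```python
def smooth_diameters(series, window=5):
    """Centered moving average over a per-frame diameter series (mm).
    `None` entries are skipped; a window with no valid samples yields None."""
    half = window // 2
    out = []
    for i in range(len(series)):
        vals = [v for v in series[max(0, i - half):i + half + 1] if v is not None]
        out.append(sum(vals) / len(vals) if vals else None)
    return out
```

A smoothed series like this plots more cleanly against frame index (or timestamp) than raw per-frame estimates.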

## Installation & Setup

### Local Development

1. Clone the repository:
   ```bash
   git clone <repository-url>
   cd pupilsense
   ```
2. Create and activate a virtual environment:
   ```bash
   python3 -m venv venv
   source venv/bin/activate  # On Windows: venv\Scripts\activate
   ```
3. Install dependencies:
   ```bash
   pip install -r requirements.txt
   ```
4. Run the application:
   ```bash
   python app.py
   ```

The app will be available at `http://localhost:7860`.

### Hugging Face Spaces Deployment

1. Create a new Space on Hugging Face with the Gradio SDK
2. Upload all files from the `pupilsense` directory
3. Ensure the following files are present:
   - `app.py` (main application file)
   - `gradio_app.py` (Gradio interface)
   - `gradio_utils.py` (utility functions)
   - `requirements.txt` (dependencies)
   - `README.md` (this file, with the YAML header above)
   - `pre_trained_models/` (model files)
   - All other supporting files

## Known Issues & Troubleshooting

### MediaPipe Issues

- **Issue**: Segmentation faults or MediaPipe errors in headless environments
- **Solution**: The app includes error handling for MediaPipe failures. In production environments, ensure proper GPU/display drivers are available.

### Model Loading

- **Issue**: Model files not found
- **Solution**: Ensure the `pre_trained_models/` directory contains the required `.pt` files for both the ResNet18 and ResNet50 models.
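A quick way to catch this at startup is to check for the expected paths. The path list below is taken from the File Structure section; the helper itself is our sketch, not part of the repository:

```python
from pathlib import Path

# Expected layout, per the File Structure section.
REQUIRED_MODELS = [
    "pre_trained_models/ResNet18/left_eye.pt",
    "pre_trained_models/ResNet18/right_eye.pt",
    "pre_trained_models/ResNet50/left_eye.pt",
    "pre_trained_models/ResNet50/right_eye.pt",
]

def missing_model_files(root="."):
    """Return the relative paths of any required .pt files that are absent."""
    root = Path(root)
    return [p for p in REQUIRED_MODELS if not (root / p).is_file()]
```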

### Memory Usage

- **Issue**: High memory usage with large videos
- **Solution**: The app automatically resizes frames to 640×480 to manage memory usage.

## File Structure

```
pupilsense/
├── app.py                 # Main application entry point
├── gradio_app.py          # Gradio interface definition
├── gradio_utils.py        # Utility functions (MediaPipe-free)
├── app_utils.py           # Original Streamlit utilities (legacy)
├── requirements.txt       # Python dependencies
├── README.md              # This file
├── config.yml             # Configuration file
├── registry.py            # Model registry
├── registry_utils.py      # Registry utilities
├── utils.py               # General utilities
├── pre_trained_models/    # Trained model files
│   ├── ResNet18/
│   │   ├── left_eye.pt
│   │   └── right_eye.pt
│   └── ResNet50/
│       ├── left_eye.pt
│       └── right_eye.pt
├── preprocessing/         # Data preprocessing modules
├── feature_extraction/    # Feature extraction modules
├── registrations/         # Model registration modules
└── sample_videos/         # Sample video files
```

## Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Test thoroughly
  5. Submit a pull request

## License

See the `LICENSE` file for details.


Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference