metadata
title: Lip Reading
emoji: 👄
colorFrom: indigo
colorTo: indigo
sdk: gradio
sdk_version: 5.5.0
app_file: app.py
pinned: false
Lip Reading
Lip-reading demo using TensorFlow, MediaPipe, and Gradio. Upload a short clip or record with your webcam to get a transcription generated from mouth movements.
Features
- Gradio UI with upload + webcam tabs
- TensorFlow model loaded once and reused
- MediaPipe lip cropping and normalization with frame caps for stability
- Configurable ports, sharing, model path, and preprocessing thresholds via environment variables
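The "loaded once and reused" behaviour can be sketched with a small cache around the loader. This is an illustration only, not the project's actual code: `make_cached_loader`, `load_keras_model`, and `get_model` are hypothetical names, and the only library call assumed is `tf.keras.models.load_model`.

```python
from functools import lru_cache
from typing import Any, Callable


def make_cached_loader(load_fn: Callable[[str], Any]) -> Callable[[str], Any]:
    """Wrap load_fn so each distinct path is loaded at most once per process."""

    @lru_cache(maxsize=None)
    def cached(path: str) -> Any:
        return load_fn(path)

    return cached


def load_keras_model(path: str) -> Any:
    # Imported lazily so this module can be imported without TensorFlow installed.
    import tensorflow as tf

    return tf.keras.models.load_model(path)


# Hypothetical accessor: the first call pays the load cost, later calls reuse the object.
get_model = make_cached_loader(load_keras_model)
```

With this shape, request handlers can call `get_model(path)` freely, and swapping in a fake loader keeps the caching logic testable without the real model file.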
Quickstart (local)
- Create a virtual environment
  - Windows:
    python -m venv .venv && .\.venv\Scripts\Activate.ps1
  - macOS/Linux:
    python -m venv .venv && source .venv/bin/activate
- Install dependencies
  pip install -r requirements.txt
- Run the app
  python app.py
- Open the URL printed to the console (default http://127.0.0.1:7860). Set GRADIO_SHARE=true if you need a public link.
Environment variables
- PORT (default 7860): Port for Gradio.
- GRADIO_SHARE (true/false, default false): Whether to expose a public link.
- MAX_VIDEO_SIZE_MB (default 1000): Reject uploads larger than this.
- LIPNET_MODEL_PATH (default best_model_1_WER.keras): Path to the saved model.
- LIPNET_TARGET_SIZE (default 85): Target square size for lip crops.
- LIPNET_MAX_FRAMES (default 160): Max frames processed per video to bound memory/time.
- LIPNET_DETECTION_CONFIDENCE (default 0.5): MediaPipe detection confidence.
- LIPNET_TRACKING_CONFIDENCE (default 0.5): MediaPipe tracking confidence.
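As a rough sketch of how these variables might be read at startup, the snippet below collects them into one immutable settings object, using the defaults listed above. The `Settings` class, `load_settings` function, and the set of accepted truthy spellings are assumptions for illustration; the real lipnet/config.py may be organized differently.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    port: int
    share: bool
    max_video_size_mb: int
    model_path: str
    target_size: int
    max_frames: int
    detection_confidence: float
    tracking_confidence: float


def _as_bool(raw: str) -> bool:
    # Assumption: common truthy spellings are accepted, not only "true".
    return raw.strip().lower() in {"1", "true", "yes", "on"}


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Build Settings from env, falling back to the documented defaults."""
    env = os.environ if env is None else env
    return Settings(
        port=int(env.get("PORT", "7860")),
        share=_as_bool(env.get("GRADIO_SHARE", "false")),
        max_video_size_mb=int(env.get("MAX_VIDEO_SIZE_MB", "1000")),
        model_path=env.get("LIPNET_MODEL_PATH", "best_model_1_WER.keras"),
        target_size=int(env.get("LIPNET_TARGET_SIZE", "85")),
        max_frames=int(env.get("LIPNET_MAX_FRAMES", "160")),
        detection_confidence=float(env.get("LIPNET_DETECTION_CONFIDENCE", "0.5")),
        tracking_confidence=float(env.get("LIPNET_TRACKING_CONFIDENCE", "0.5")),
    )
```

Passing a plain dict instead of `os.environ` makes it easy to try overrides, for example `load_settings({"PORT": "8080", "GRADIO_SHARE": "true"})`.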
Project structure
app.py                   # Entry point
best_model_1_WER.keras   # Trained model weights
lipnet/
  __init__.py
  config.py              # Runtime configuration
  model.py               # Model loading, inference, decoding
  preprocessing.py       # Lip detection, cropping, normalization
  ui.py                  # Gradio components and handlers
requirements.txt
Usage tips
- Keep videos short and ensure the mouth is well-lit and centered.
- Supported inputs: MP4/AVI/MOV/MPG.
- GPU improves speed; CPU also works but may be slower.
- If no face is detected, check lighting, camera angle, and framing.
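A pre-flight check along the lines of the tips above could look like the following sketch. It only inspects the file name and size, using the supported extensions listed here and the MAX_VIDEO_SIZE_MB default. `validate_video` and `SUPPORTED_EXTENSIONS` are hypothetical names, and treating 1 MB as 1024 * 1024 bytes is an assumption, since the document does not say how the app counts megabytes.

```python
from pathlib import Path
from typing import Optional

# Extensions taken from the "Supported inputs" tip.
SUPPORTED_EXTENSIONS = {".mp4", ".avi", ".mov", ".mpg"}


def validate_video(filename: str, size_bytes: int, max_size_mb: int = 1000) -> Optional[str]:
    """Return an error message for an unusable upload, or None if it looks fine."""
    suffix = Path(filename).suffix.lower()
    if suffix not in SUPPORTED_EXTENSIONS:
        shown = suffix or "(none)"
        return f"Unsupported format {shown}; expected one of {sorted(SUPPORTED_EXTENSIONS)}"
    # Assumption: 1 MB is counted as 1024 * 1024 bytes.
    if size_bytes > max_size_mb * 1024 * 1024:
        return f"File exceeds the {max_size_mb} MB limit"
    return None
```

Returning a message rather than raising lets a UI handler show the problem to the user directly.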
Troubleshooting
- Model file missing: Set LIPNET_MODEL_PATH to the correct .keras file.
- High memory use/OOM: Lower LIPNET_MAX_FRAMES or reduce input resolution.
- Webcam not working: Ensure browser permissions are granted for camera access.
- Mediapipe import error: Reinstall with pip install --force-reinstall mediapipe (version >= 0.10). On Apple/ARM or Windows CPU-only, prefer the latest 0.10.x wheel.
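Before reinstalling, it can help to confirm which MediaPipe version is actually installed. The sketch below uses the standard-library `importlib.metadata` and compares only the major and minor numbers against the 0.10 minimum mentioned above. `installed_version` and `meets_minimum` are hypothetical helper names, not part of the project.

```python
import re
from importlib import metadata
from typing import Optional, Tuple


def installed_version(package: str) -> Optional[str]:
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def meets_minimum(version: str, minimum: Tuple[int, int] = (0, 10)) -> bool:
    """Compare the leading major.minor of version against minimum."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"Unrecognized version string: {version!r}")
    return (int(match.group(1)), int(match.group(2))) >= minimum


if __name__ == "__main__":
    found = installed_version("mediapipe")
    if found is None:
        print("mediapipe is not installed")
    elif not meets_minimum(found):
        print(f"mediapipe {found} is older than 0.10")
    else:
        print(f"mediapipe {found} looks recent enough")
```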