---
title: Lip Reading
emoji: 👄
colorFrom: indigo
colorTo: indigo
sdk: gradio
sdk_version: 5.5.0
app_file: app.py
pinned: false
---

# Lip Reading

Lip-reading demo using TensorFlow, MediaPipe, and Gradio. Upload a short clip or record with your webcam to get a transcription generated from mouth movements.

## Features

- Gradio UI with upload + webcam tabs
- TensorFlow model loaded once and reused
- MediaPipe lip cropping and normalization with frame caps for stability
- Configurable ports, sharing, model path, and preprocessing thresholds via environment variables

## Quickstart (local)

1. Create a virtual environment
   - Windows (PowerShell): `python -m venv .venv; .\.venv\Scripts\Activate.ps1`
   - macOS/Linux: `python -m venv .venv && source .venv/bin/activate`
2. Install dependencies: `pip install -r requirements.txt`
3. Run the app: `python app.py`
4. Open the URL printed to the console (default http://127.0.0.1:7860). Set `GRADIO_SHARE=true` if you need a public link.

## Environment variables

- `PORT` (default `7860`): Port for Gradio.
- `GRADIO_SHARE` (`true`/`false`, default `false`): Whether to expose a public link.
- `MAX_VIDEO_SIZE_MB` (default `1000`): Reject uploads larger than this.
- `LIPNET_MODEL_PATH` (default `best_model_1_WER.keras`): Path to the saved model.
- `LIPNET_TARGET_SIZE` (default `85`): Target square size for lip crops.
- `LIPNET_MAX_FRAMES` (default `160`): Max frames processed per video to bound memory/time.
- `LIPNET_DETECTION_CONFIDENCE` (default `0.5`): MediaPipe detection confidence.
- `LIPNET_TRACKING_CONFIDENCE` (default `0.5`): MediaPipe tracking confidence.

## Project structure

```
app.py                  # Entry point
best_model_1_WER.keras  # Trained model weights
lipnet/
  __init__.py
  config.py             # Runtime configuration
  model.py              # Model loading, inference, decoding
  preprocessing.py      # Lip detection, cropping, normalization
  ui.py                 # Gradio components and handlers
requirements.txt
```

## Usage tips

- Keep videos short and ensure the mouth is well-lit and centered.
- Supported inputs: MP4/AVI/MOV/MPG.
- A GPU improves speed; CPU-only inference works but may be slower.
- If no face is detected, check lighting, camera angle, and framing.

## Troubleshooting

- **Model file missing**: Set `LIPNET_MODEL_PATH` to the correct `.keras` file.
- **High memory use/OOM**: Lower `LIPNET_MAX_FRAMES` or reduce input resolution.
- **Webcam not working**: Ensure browser permissions are granted for camera access.
- **MediaPipe import error**: Reinstall with `pip install --force-reinstall mediapipe` (version >= 0.10). On Apple/ARM or Windows CPU-only, prefer the latest 0.10.x wheel.
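
When a setting does not seem to take effect, it can help to see how the environment variables listed in this README might be parsed. The following is a minimal sketch, not the contents of `lipnet/config.py`: the `Settings` class, the `load_settings` function, and all field names are hypothetical, and only the variable names and defaults are taken from this README.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    """Hypothetical settings container; defaults mirror the README."""
    port: int = 7860
    share: bool = False
    max_video_size_mb: int = 1000
    model_path: str = "best_model_1_WER.keras"
    target_size: int = 85
    max_frames: int = 160
    detection_confidence: float = 0.5
    tracking_confidence: float = 0.5


def _as_bool(value: str) -> bool:
    # Assumption: only "true" (any letter case) enables a flag.
    return value.strip().lower() == "true"


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Build Settings from an environment mapping, using README defaults
    for any variable that is not set."""
    env = os.environ if env is None else env
    d = Settings()
    return Settings(
        port=int(env.get("PORT", d.port)),
        share=_as_bool(env.get("GRADIO_SHARE", "false")),
        max_video_size_mb=int(env.get("MAX_VIDEO_SIZE_MB", d.max_video_size_mb)),
        model_path=env.get("LIPNET_MODEL_PATH", d.model_path),
        target_size=int(env.get("LIPNET_TARGET_SIZE", d.target_size)),
        max_frames=int(env.get("LIPNET_MAX_FRAMES", d.max_frames)),
        detection_confidence=float(
            env.get("LIPNET_DETECTION_CONFIDENCE", d.detection_confidence)
        ),
        tracking_confidence=float(
            env.get("LIPNET_TRACKING_CONFIDENCE", d.tracking_confidence)
        ),
    )
```

Passing a plain dict instead of `os.environ`, for example `load_settings({"LIPNET_MAX_FRAMES": "80"})`, makes it easy to check how a single override would be interpreted without touching the real environment.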