metadata
title: LipNet Silent Speech Recognition
emoji: π
colorFrom: purple
colorTo: blue
sdk: gradio
sdk_version: 4.44.0
app_file: app.py
pinned: false
LipNet: Silent Speech Recognition
A deep learning model that reads lips from video and predicts the spoken text, with no audio required.
Model Architecture
- 3× Conv3D layers for spatiotemporal feature extraction
- 2× Bidirectional LSTM layers for sequence modelling
- CTC loss for sequence-to-sequence alignment
- Input: 75 frames of the mouth region (46×140 px, grayscale)
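The list above does not give kernel sizes, pooling, or filter counts, so the following is only a shape walk-through under stated assumptions: 'same'-padded convolutions, a (1, 2, 2) max-pool after each Conv3D, and hypothetical filter counts of 128, 256 and 75. It illustrates why the time axis stays at 75 steps, which is what the LSTM layers and CTC loss operate on.

```python
# Shape walk-through for the architecture listed above.
# Assumptions (not stated in this README): 'same'-padded convolutions,
# a (1, 2, 2) max-pool after each Conv3D, and the filter counts below.

def conv_same(shape, filters):
    """A 'same'-padded convolution changes only the channel count."""
    frames, height, width, _ = shape
    return (frames, height, width, filters)


def pool_1x2x2(shape):
    """Halve height and width (floor division); keep the time axis."""
    frames, height, width, channels = shape
    return (frames, height // 2, width // 2, channels)


def trace_shapes(input_shape=(75, 46, 140, 1), filters=(128, 256, 75)):
    shape = input_shape
    for n in filters:  # three Conv3D + pooling stages
        shape = pool_1x2x2(conv_same(shape, n))
    frames, height, width, channels = shape
    # Flatten each frame's feature map so the LSTMs see (time, features).
    return shape, (frames, height * width * channels)


conv_out, lstm_in = trace_shapes()
print(conv_out)  # (75, 5, 17, 75)
print(lstm_in)   # (75, 6375)
```

With different pooling or filter choices the feature size changes, but as long as pooling leaves the first axis alone, the sequence length fed to the LSTMs remains 75.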
How to Use
- Upload a short .mpg or .mp4 video showing a frontal face
- Click READ LIPS
- The predicted sentence appears on the right
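Because the model input is fixed at 75 frames, an uploaded clip presumably has to be truncated or padded before inference. How app.py handles this is not described here; the sketch below is one possible approach, assuming zero-padding at the end, with `fit_to_length` as a hypothetical helper name.

```python
def fit_to_length(frames, target=75):
    """Truncate or zero-pad a list of 2D grayscale frames to `target` frames.

    Hypothetical helper; the padding strategy is an assumption.
    """
    frames = list(frames)
    if not frames:
        raise ValueError("clip contains no frames")
    height, width = len(frames[0]), len(frames[0][0])
    clipped = frames[:target]
    # Build a separate all-zero frame for every missing position.
    padding = [[[0] * width for _ in range(height)]
               for _ in range(target - len(clipped))]
    return clipped + padding


# A 60-frame clip of 46x140 frames is padded up to 75 frames.
short_clip = [[[1] * 140 for _ in range(46)] for _ in range(60)]
padded = fit_to_length(short_clip)
print(len(padded))  # 75

# A 90-frame clip is cut down to its first 75 frames.
long_clip = [[[1] * 140 for _ in range(46)] for _ in range(90)]
print(len(fit_to_length(long_clip)))  # 75
```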
Dataset
Trained on the GRID Corpus, Speaker S1.
Vocabulary: a-z, digits 1-9, the punctuation marks ' ? ! and space (39 characters; 40 entries if one reserved token is counted).
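To make the vocabulary and the CTC output concrete, here is a minimal sketch of a character table and a greedy CTC decoder. The characters named above add up to 39; the sketch assumes one extra reserved index (index 0, used here as the CTC blank), which gives 40 entries. The index layout and the name `ctc_greedy_decode` are assumptions for illustration, not taken from app.py.

```python
import string

# Characters named in this README: a-z, digits 1-9, ' ? ! and space.
CHARS = string.ascii_lowercase + "123456789" + "'?!" + " "

# Assumption: index 0 is reserved for the CTC blank; characters start at 1.
BLANK = 0
INDEX_TO_CHAR = {i: c for i, c in enumerate(CHARS, start=1)}


def ctc_greedy_decode(frame_indices):
    """Collapse consecutive repeats, then drop blanks.

    `frame_indices` is the per-frame argmax of the model output.
    """
    decoded = []
    previous = None
    for idx in frame_indices:
        if idx != previous and idx != BLANK:
            decoded.append(INDEX_TO_CHAR[idx])
        previous = idx
    return "".join(decoded)


print(len(CHARS))  # 39

# Per-frame predictions b, b, blank, i, i, n collapse to "bin".
print(ctc_greedy_decode([2, 2, 0, 9, 9, 14]))  # bin

# A blank between two identical indices keeps both letters.
print(ctc_greedy_decode([5, 0, 5]))  # ee
```

Greedy decoding takes the single most likely character per frame; a beam search over the per-frame probabilities is a common alternative when accuracy matters more than speed.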
Files
app.py: Gradio app + inference
requirements.txt: Dependencies
models/checkpoint.weights.h5: Model weights (upload manually)