---
title: LipNet Silent Speech Recognition
emoji: π
colorFrom: purple
colorTo: blue
sdk: gradio
sdk_version: 4.44.0
app_file: app.py
pinned: false
---
# LipNet: Silent Speech Recognition
A deep learning model that reads lips from video and predicts the spoken text, with no audio required.
## Model Architecture
- **3× Conv3D** layers for spatiotemporal feature extraction
- **2× Bidirectional LSTM** layers for sequence modelling
- **CTC Loss** for sequence-to-sequence alignment
- Input: 75 frames of mouth region (46×140 px, grayscale)
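With CTC, the network emits one label per frame, including a special blank label, and the sentence is recovered by merging consecutive repeats and dropping blanks. The sketch below shows that greedy (best-path) collapse in plain Python. The toy alphabet, the per-frame scores, and the use of index 0 as the blank are illustrative assumptions and are not taken from `app.py`.

```python
from itertools import groupby

BLANK = 0  # assumed blank index; the real model may place it elsewhere
ALPHABET = ["", "b", "i", "n", " "]  # toy alphabet, for illustration only


def greedy_ctc_decode(frame_scores):
    """Turn per-frame label scores into text with best-path CTC decoding."""
    # 1. Pick the highest-scoring label for every frame.
    best = [
        max(range(len(scores)), key=scores.__getitem__)
        for scores in frame_scores
    ]
    # 2. Merge consecutive repeats, then drop the blank label.
    collapsed = [label for label, _ in groupby(best) if label != BLANK]
    return "".join(ALPHABET[label] for label in collapsed)


# Six frames whose best labels are: b, b, blank, i, n, n
scores = [
    [0.1, 0.8, 0.0, 0.1, 0.0],
    [0.2, 0.7, 0.0, 0.1, 0.0],
    [0.9, 0.0, 0.1, 0.0, 0.0],
    [0.1, 0.0, 0.9, 0.0, 0.0],
    [0.1, 0.0, 0.1, 0.8, 0.0],
    [0.3, 0.0, 0.0, 0.7, 0.0],
]
print(greedy_ctc_decode(scores))  # bin
```

A blank between two identical labels keeps them separate, which is how doubled letters survive the merge step.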
## How to Use
1. Upload a short `.mpg` or `.mp4` video showing a frontal face
2. Click **READ LIPS**
3. The predicted sentence appears on the right
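The steps above correspond to a simple upload, button, and text-output layout. A minimal sketch of how such a layout could be wired with Gradio's `Blocks` API follows. The function names (`is_supported_video`, `read_lips`, `build_demo`), the labels, and the placeholder prediction are hypothetical; the actual `app.py` may be organised differently.

```python
from pathlib import Path

SUPPORTED_EXTENSIONS = {".mpg", ".mp4"}


def is_supported_video(path):
    """Return True when the upload has one of the accepted extensions."""
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS


def read_lips(video_path):
    """Hypothetical button callback: validate the upload, then predict."""
    if not video_path or not is_supported_video(video_path):
        return "Please upload a .mpg or .mp4 video."
    # Placeholder: a real app would preprocess the frames and decode the
    # model output here.
    return f"(prediction for {Path(video_path).name})"


def build_demo():
    """Assemble the upload / button / output layout described above."""
    import gradio as gr  # imported lazily so the helpers work without Gradio

    with gr.Blocks(title="LipNet Silent Speech Recognition") as demo:
        with gr.Row():
            video = gr.Video(label="Input video")
            sentence = gr.Textbox(label="Predicted sentence")
        button = gr.Button("READ LIPS")
        # gr.Video hands the callback the path of the uploaded file.
        button.click(fn=read_lips, inputs=video, outputs=sentence)
    return demo
```

Calling `build_demo().launch()` would start the interface locally, assuming Gradio is installed.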
## Dataset
Trained on the [GRID Corpus](https://spandh.dcs.shef.ac.uk/gridcorpus/), speaker S1.
Vocabulary: `a-z`, digits `1-9`, punctuation `'?!`, and space (39 characters; 40 entries total, presumably counting one reserved token).
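To make the vocabulary concrete, here is a small sketch of a character-to-index mapping in plain Python. The listed characters add up to 39; treating index 0 as a reserved entry (for example, for unknown characters) to reach 40 is an assumption, as are the character order and the helper names `encode` and `decode`.

```python
import string

# Characters named above: a-z, digits 1-9, the punctuation '?! and space.
CHARS = list(string.ascii_lowercase + "123456789" + "'?!" + " ")

# Assumption: index 0 is reserved, so real characters start at 1.
CHAR_TO_INDEX = {ch: i for i, ch in enumerate(CHARS, start=1)}
INDEX_TO_CHAR = {i: ch for ch, i in CHAR_TO_INDEX.items()}


def encode(text):
    """Map text to indices; characters outside the vocabulary become 0."""
    return [CHAR_TO_INDEX.get(ch, 0) for ch in text.lower()]


def decode(indices):
    """Map indices back to text, skipping the reserved index."""
    return "".join(INDEX_TO_CHAR.get(i, "") for i in indices)


print(len(CHARS))  # 39
print(decode(encode("bin blue at f 2 now")))  # bin blue at f 2 now
```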
## Files
```
app.py                        - Gradio app + inference
requirements.txt              - Dependencies
models/checkpoint.weights.h5  - Model weights (upload manually)
```