---
title: LipNet Silent Speech Recognition
emoji: π
colorFrom: purple
colorTo: indigo
sdk: docker
pinned: false
---
# LipNet: Silent Speech Recognition
Reads lips from video and predicts the spoken text; no audio is required.
## File Structure
```
├── Dockerfile
├── requirements.txt
├── README.md
├── models/
│   └── checkpoint.weights.h5   ← upload your weights here
└── app/
    ├── app.py
    ├── modelutil.py
    ├── utils.py
    └── data/
        ├── s1/
        │   └── *.mpg           ← sample videos from GRID corpus
        └── alignments/
            └── s1/
                └── *.align     ← alignment files
```
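Because the weights file has to be uploaded by hand, it can help to check the layout before starting the app. The following is a minimal sketch, not part of the repository: the `REQUIRED` list and the `missing_files` helper are hypothetical names, and treating these particular paths as mandatory is an assumption based on the tree above.

```python
from pathlib import Path

# Paths copied from the file tree; treating them as required is an assumption.
REQUIRED = [
    "Dockerfile",
    "requirements.txt",
    "models/checkpoint.weights.h5",
    "app/app.py",
    "app/modelutil.py",
    "app/utils.py",
]


def missing_files(root):
    """Return the entries of REQUIRED that are not regular files under `root`."""
    root = Path(root)
    return [rel for rel in REQUIRED if not (root / rel).is_file()]
```

Calling `missing_files(".")` from the repository root returns an empty list when everything is in place, and otherwise lists what still needs to be added, for example the weights file.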
## Model
- **Input**: 75 frames, mouth crop 46×140 px, grayscale, z-score normalized
- **Architecture**: Conv3D × 3 → Reshape → BiLSTM × 2 → Dense(41) → CTC
- **Dataset**: GRID Corpus Speaker S1
- **Vocab**: a–z, 1–9, `'`, `?`, `!`, space (40 chars + CTC blank = 41)
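
To make the input normalization and the CTC output concrete, here is a small pure-Python sketch. It is illustrative only and is not the code in `utils.py` or `modelutil.py`: the names `CHARS`, `VOCAB`, `BLANK`, `zscore`, and `ctc_greedy_decode` are hypothetical. The characters listed above add up to 39, so the sketch assumes that index 0 is a reserved (for example out-of-vocabulary) entry, which gives the stated 40-entry vocabulary, and that the CTC blank takes the last index. Greedy decoding is also an assumption; the app may decode differently.

```python
import statistics

# The 39 characters listed in the Vocab bullet.
CHARS = "abcdefghijklmnopqrstuvwxyz123456789'?! "
# Assumption: index 0 is a reserved/unknown entry, giving 40 vocabulary
# entries; the CTC blank then sits at index 40.
VOCAB = [""] + list(CHARS)
BLANK = len(VOCAB)        # 40
NUM_CLASSES = BLANK + 1   # 41, the width of the Dense(41) output


def zscore(pixels):
    """Z-score normalize a flat list of grayscale pixel values."""
    mean = statistics.fmean(pixels)
    std = statistics.pstdev(pixels) or 1.0  # avoid dividing by zero
    return [(p - mean) / std for p in pixels]


def ctc_greedy_decode(frame_scores):
    """Take the best class per frame, merge repeats, then drop blanks."""
    best = [max(range(len(s)), key=s.__getitem__) for s in frame_scores]
    chars, prev = [], None
    for idx in best:
        if idx != prev and idx != BLANK:
            chars.append(VOCAB[idx])
        prev = idx
    return "".join(chars)
```

With 75 input frames, `frame_scores` would hold 75 rows of 41 scores each. Repeated predictions across neighbouring frames collapse to one character, and a blank between two identical predictions keeps both.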