
title: LipNet Silent Speech Recognition
emoji: 👄
colorFrom: purple
colorTo: indigo
sdk: docker
pinned: false

LipNet — Silent Speech Recognition

Reads lips from video and predicts spoken text — no audio required.

File Structure

├── Dockerfile
├── requirements.txt
├── README.md
├── models/
│   └── checkpoint.weights.h5      ← upload your weights here
└── app/
    ├── app.py
    ├── modelutil.py
    ├── utils.py
    └── data/
        ├── s1/
        │   └── *.mpg              ← sample videos from GRID corpus
        └── alignments/
            └── s1/
                └── *.align        ← alignment files
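
Given this layout, each sample video can be paired with its alignment file by path alone. The sketch below assumes that an alignment file shares its base name with the video (which the tree suggests but does not state) and that the data root is `app/data`. The helper name `alignment_path_for` and the file name `example.mpg` are hypothetical.

```python
from pathlib import Path


def alignment_path_for(video_path, data_root="app/data"):
    """Map data/<speaker>/<name>.mpg to data/alignments/<speaker>/<name>.align.

    Assumes the alignment file has the same base name as the video.
    """
    video = Path(video_path)
    speaker = video.parent.name  # e.g. "s1"
    return Path(data_root) / "alignments" / speaker / f"{video.stem}.align"


# Hypothetical file name, for illustration only.
print(alignment_path_for("app/data/s1/example.mpg").as_posix())
# app/data/alignments/s1/example.align
```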

Model

Input: 75 frames, mouth crop 46×140px, grayscale, z-score normalized
Architecture: Conv3D × 3 → Reshape → BiLSTM × 2 → Dense(41) → CTC
Dataset: GRID Corpus Speaker S1
Vocab: a–z, 1–9, ', ?, !, space (39 characters + 1 reserved index, presumably an out-of-vocabulary token, + CTC blank = 41)
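
To illustrate how the 41 output classes turn into text, here is a minimal sketch of greedy CTC decoding in plain Python. It is not the code in `app/`. The character order, the reserved index 0, and the blank at index 40 are assumptions: the characters listed above number 39, so one extra index is assumed to be reserved (for example, for an out-of-vocabulary token) to reach 41 classes.

```python
import string

# Assumed character order; the README only lists the character set.
VOCAB = list(string.ascii_lowercase) + list("'?!") + list("123456789") + [" "]

# Assumed index layout: 0 = reserved, 1..39 = characters, 40 = CTC blank.
BLANK = len(VOCAB) + 1
NUM_CLASSES = BLANK + 1  # 41, matching Dense(41)


def index_to_char(i):
    """Return the character for a class index, or '' for reserved/blank."""
    return VOCAB[i - 1] if 1 <= i <= len(VOCAB) else ""


def greedy_ctc_decode(frame_scores):
    """Decode per-frame class scores (one list of NUM_CLASSES per frame).

    Takes the best class per frame, collapses consecutive repeats,
    then drops blanks.
    """
    best = [max(range(len(row)), key=row.__getitem__) for row in frame_scores]
    chars = []
    prev = None
    for idx in best:
        if idx != prev and idx != BLANK:
            chars.append(index_to_char(idx))
        prev = idx
    return "".join(chars)


def one_hot(i):
    """Build a score row that puts all weight on class i (for the demo)."""
    return [1.0 if j == i else 0.0 for j in range(NUM_CLASSES)]


# Frames predicting b, b, blank, i, n, n, blank collapse to "bin".
print(greedy_ctc_decode([one_hot(i) for i in [2, 2, 40, 9, 14, 14, 40]]))
```

A repeated letter survives only when a blank separates its two runs, which is why CTC needs the extra blank class.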