	---
	title: LipNet Silent Speech Recognition
	emoji: 👄
	colorFrom: purple
	colorTo: blue
	sdk: gradio
	sdk_version: 4.44.0
	app_file: app.py
	pinned: false
	---

	# LipNet — Silent Speech Recognition

	A deep learning model that reads lips from video and predicts spoken text — no audio required.

	## Model Architecture
	- 3× Conv3D layers for spatiotemporal feature extraction
	- 2× Bidirectional LSTM layers for sequence modelling
	- CTC loss, which aligns per-frame predictions with the target character sequence without frame-level labels
	- Input: 75 frames of mouth region (46×140 px, grayscale)
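
To make the architecture above concrete, here is a small shape trace through the three convolutional blocks. It is a pure-Python calculation and not the code in `app.py`. Only the 75×46×140×1 input comes from this README. The filter counts (128, 256, 75), the `same` convolution padding, the (1, 2, 2) max-pooling after each Conv3D, and reading 46×140 as height × width are assumptions borrowed from common LipNet-style implementations.

```python
# Shape trace for a LipNet-style front end (illustrative only; values marked
# "assumed" are not stated in this README).

INPUT_SHAPE = (75, 46, 140, 1)    # frames, height, width, channels (from README)
ASSUMED_FILTERS = (128, 256, 75)  # assumed filter counts for the 3 Conv3D layers
ASSUMED_POOL = (1, 2, 2)          # assumed pooling: keep time, halve height/width


def conv3d_same(shape, filters):
    """Conv3D with 'same' padding and stride 1: only the channel count changes."""
    t, h, w, _ = shape
    return (t, h, w, filters)


def max_pool3d(shape, pool):
    """Non-overlapping max pooling with 'valid' padding: floor division per axis."""
    t, h, w, c = shape
    return (t // pool[0], h // pool[1], w // pool[2], c)


def trace_shapes(input_shape=INPUT_SHAPE, filters=ASSUMED_FILTERS, pool=ASSUMED_POOL):
    """Return the tensor shape at the input and after each conv + pool block."""
    shapes = [input_shape]
    shape = input_shape
    for f in filters:
        shape = max_pool3d(conv3d_same(shape, f), pool)
        shapes.append(shape)
    return shapes


def features_per_frame(shape):
    """Size of the flattened feature vector handed to the LSTMs at each time step."""
    _, h, w, c = shape
    return h * w * c


if __name__ == "__main__":
    for s in trace_shapes():
        print(s)
```

Under these assumptions the time axis stays at 75 steps throughout, so the bidirectional LSTMs receive one feature vector per video frame and CTC can score one prediction per frame.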

	## How to Use
	1. Upload a short `.mpg` or `.mp4` video showing a frontal face
	2. Click READ LIPS
	3. The predicted sentence appears on the right
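
Because the model expects exactly 75 frames, an uploaded clip usually has to be checked and fitted to that length before inference. The sketch below shows one way this might be done. The helper names are hypothetical, and padding by repeating the last frame is an assumption; the actual handling in `app.py` may differ.

```python
from pathlib import Path

SUPPORTED_EXTENSIONS = {".mpg", ".mp4"}  # formats named in the usage steps
TARGET_FRAMES = 75                        # input length stated under Model Architecture


def is_supported_video(path):
    """Return True if the file extension is one the app accepts (case-insensitive)."""
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS


def fit_to_length(frames, target=TARGET_FRAMES):
    """Truncate or pad a frame sequence so it has exactly `target` items.

    Padding repeats the final frame (an assumption; zero-padding is another option).
    """
    frames = list(frames)
    if not frames:
        raise ValueError("video contains no frames")
    if len(frames) >= target:
        return frames[:target]
    return frames + [frames[-1]] * (target - len(frames))
```

For example, `fit_to_length(decoded_frames)` would be called on the list of cropped mouth-region frames right before they are stacked into the model input.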

	## Dataset
	Trained on the [GRID Corpus](https://spandh.dcs.shef.ac.uk/gridcorpus/) — Speaker S1.
	Vocabulary: `a-z`, digits `1-9`, punctuation `'?!`, and space: 39 characters, or 40 entries once one extra blank/out-of-vocabulary token is counted.
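
As an illustration of how this vocabulary relates to CTC output, the sketch below builds the character table and applies greedy CTC decoding (collapse repeated indices, then drop blanks). The character order and the use of index 0 as the blank are assumptions made for this example; the index layout in the trained checkpoint may differ.

```python
# Character set from the README: a-z, 1-9, '?! and space (39 characters).
VOCAB = list("abcdefghijklmnopqrstuvwxyz") + list("123456789") + list("'?!") + [" "]

BLANK = 0  # assumed: index 0 is reserved for the CTC blank
ID_TO_CHAR = {i + 1: ch for i, ch in enumerate(VOCAB)}
CHAR_TO_ID = {ch: i for i, ch in ID_TO_CHAR.items()}


def ctc_greedy_decode(frame_ids):
    """Collapse consecutive repeats, then remove blanks.

    `frame_ids` holds the most likely class index for each video frame.
    """
    chars = []
    prev = None
    for idx in frame_ids:
        if idx != prev and idx != BLANK:
            chars.append(ID_TO_CHAR[idx])
        prev = idx
    return "".join(chars)


if __name__ == "__main__":
    ids = [CHAR_TO_ID[c] for c in "bin"]
    # Repeat some indices and insert blanks, as per-frame argmax output would.
    noisy = [ids[0], ids[0], BLANK, ids[1], BLANK, ids[2], ids[2]]
    print(ctc_greedy_decode(noisy))
```

A blank between two identical indices is what allows doubled letters to survive decoding: `[l, BLANK, l]` decodes to `ll`, while `[l, l]` collapses to a single `l`.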

	## Files
	```
	app.py ← Gradio app + inference
	requirements.txt ← Dependencies
	models/checkpoint.weights.h5 ← Model weights (upload manually)
	```
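
Since the weights file has to be uploaded manually, a start-up check can report a missing file clearly instead of failing later during model loading. The following is a minimal sketch using the paths from the listing above; the function name `missing_files` is hypothetical and not part of `app.py`.

```python
from pathlib import Path

# Paths taken from the file listing above, relative to the repository root.
REQUIRED_FILES = (
    "app.py",
    "requirements.txt",
    "models/checkpoint.weights.h5",
)


def missing_files(root=".", required=REQUIRED_FILES):
    """Return the required paths that do not exist as files under `root`."""
    root = Path(root)
    return [name for name in required if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_files()
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("All expected files are present.")
```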