---
title: LipNet Silent Speech Recognition
emoji: π
colorFrom: purple
colorTo: indigo
sdk: docker
pinned: false
---
# LipNet – Silent Speech Recognition

Reads lips from video and predicts the spoken text – no audio required.
## File Structure

```
.
├── Dockerfile
├── requirements.txt
├── README.md
├── models/
│   └── checkpoint.weights.h5   ← upload your weights here
└── app/
    ├── app.py
    ├── modelutil.py
    ├── utils.py
    └── data/
        ├── s1/
        │   └── *.mpg           ← sample videos from the GRID corpus
        └── alignments/
            └── s1/
                └── *.align     ← alignment files
```
## Model

- **Input**: 75 frames per clip, mouth crop of 46×140 px, grayscale, z-score normalized
- **Architecture**: Conv3D × 3 → Reshape → BiLSTM × 2 → Dense(41) → CTC
- **Dataset**: GRID corpus, speaker s1
- **Vocab**: a–z, 1–9, `'`, `?`, `!`, space (39 characters + OOV token + CTC blank = 41 outputs)
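The architecture bullet above can be sketched in Keras. This is a minimal reconstruction based on the widely used LipNet recreation for the GRID corpus, not this repo's actual `modelutil.py`: the filter counts (128/256/75), the dropout rates, and the use of `TimeDistributed(Flatten())` for the "Reshape" step are assumptions; only the input shape (75×46×140×1) and the 41-way output are taken from the section above.

```python
# Hypothetical sketch of the LipNet-style model described above.
# Layer sizes are assumptions from the common Keras recreation,
# not read from this repository's code.
import tensorflow as tf
from tensorflow.keras import layers, models


def build_lipnet(frames=75, height=46, width=140, vocab_size=41):
    """Conv3D x 3 -> flatten per frame -> BiLSTM x 2 -> Dense(vocab_size)."""
    return models.Sequential([
        layers.Input(shape=(frames, height, width, 1)),  # grayscale mouth crops
        layers.Conv3D(128, 3, padding="same", activation="relu"),
        layers.MaxPool3D((1, 2, 2)),                     # pool spatial dims, keep time
        layers.Conv3D(256, 3, padding="same", activation="relu"),
        layers.MaxPool3D((1, 2, 2)),
        layers.Conv3D(75, 3, padding="same", activation="relu"),
        layers.MaxPool3D((1, 2, 2)),
        layers.TimeDistributed(layers.Flatten()),        # the "Reshape" step
        layers.Bidirectional(layers.LSTM(128, return_sequences=True)),
        layers.Dropout(0.5),
        layers.Bidirectional(layers.LSTM(128, return_sequences=True)),
        layers.Dropout(0.5),
        layers.Dense(vocab_size, activation="softmax"),  # 41 = chars + OOV + CTC blank
    ])
```

A CTC loss is applied over the per-frame softmax during training; at inference the checkpoint would be restored with `model.load_weights("models/checkpoint.weights.h5")` and decoded with `tf.keras.backend.ctc_decode`.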