init
README.md
CHANGED
@@ -6,7 +6,7 @@ license: mit
## Introduction
-
## Features
@@ -51,7 +51,7 @@ LipCoordNet is an advanced neural network model designed for accurate lip readin
### Usage
-
To train the
```bash
python train.py
@@ -67,19 +67,19 @@ note: ffmpeg is required to convert video to image sequence and run the inferenc
## Model Architecture
-

## Training
+
This model is built on top of the [LipReading](https://github.com/wissemkarous/Lip-reading-Final-Year-Project) project on GitHub. The training process is similar to that of the original LipNet model, with landmark coordinates added as a supplementary input. We started from the pretrained weights of the original LipNet model, froze the original LipNet layers, and trained only the new layers that process the landmark coordinates.
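The freeze-then-train step described above can be sketched as a split of parameter names into a frozen group and a trainable group. This is a minimal, framework-agnostic illustration and not the project's actual training code: the function `partition_parameters` and the layer names (`conv1`, `gru1`, `coord_gru`, `FC`) are hypothetical and chosen only for the example.

```python
from typing import Iterable, List, Tuple


def partition_parameters(
    param_names: Iterable[str],
    trainable_prefixes: Iterable[str],
) -> Tuple[List[str], List[str]]:
    """Split parameter names into (frozen, trainable) lists.

    A parameter is trainable when its name starts with one of the given
    prefixes; every other parameter is treated as pretrained and frozen.
    """
    prefixes = tuple(trainable_prefixes)
    frozen: List[str] = []
    trainable: List[str] = []
    for name in param_names:
        if name.startswith(prefixes):
            trainable.append(name)
        else:
            frozen.append(name)
    return frozen, trainable


# Hypothetical parameter names: the first two stand in for pretrained
# LipNet layers, the last two for newly added coordinate layers.
names = [
    "conv1.weight",
    "gru1.weight_ih_l0",
    "coord_gru.weight_ih_l0",
    "FC.weight",
]
frozen, trainable = partition_parameters(names, ["coord_gru.", "FC."])
print("frozen:", frozen)
print("trainable:", trainable)
```

In PyTorch, the same split would typically be applied by iterating over `model.named_parameters()`, setting `requires_grad = False` on the frozen parameters, and passing only the trainable ones to the optimizer.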
+
The dataset used to train this model is the [Lipreading dataset](https://huggingface.co/datasets/wissemkarous/lipreading). The dataset is not included in this repository, but can be downloaded from the link above.
+
Total training time: 12 h
Total epochs: 51
+
Training hardware: NVIDIA GeForce RTX 3050 6GB
+

For an interactive view of the training curves, see the TensorBoard logs in the `runs` directory.
Use this command to view the logs:
## Acknowledgments
+
This model, LipReading, was developed for academic purposes as a final-year project. Special thanks to everyone who provided assistance and all references.
## Contact
Project Link: https://github.com/ffeew/LipCoordNet