# 3DCvT on LRW
This repository provides the released checkpoint and evaluation artifacts for an unofficial PyTorch reproduction of:
A Lip Reading Method Based on 3D Convolutional Vision Transformer
Code repository:
## Model Summary
- Task: English word-level lip reading
- Dataset: LRW
- Number of classes: 500
- Framework: PyTorch
- Architecture: 3D CNN + CvT + BiGRU
## Released Files

- `best_model.pth`: released checkpoint
- `sha256.txt`: checksum for the checkpoint
- `logs/train.log`: selected training log
- `results/per_class_acc_lrw_val.csv`: per-class validation accuracy summary
- `plots/learning_curve.png`: learning curve exported from training
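Before using the checkpoint, it is worth verifying it against `sha256.txt`. A minimal sketch, assuming the checksum file follows the common `sha256sum` layout (`<hexdigest>  <filename>` on one line); adjust the parsing if the released file is formatted differently:

```python
import hashlib


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Stream a file through SHA-256 and return the hex digest."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def verify(checkpoint_path: str, checksum_path: str) -> bool:
    # Assumes the checksum file holds "<hexdigest>  <filename>" on its
    # first line, as produced by the `sha256sum` tool.
    with open(checksum_path) as f:
        expected = f.read().split()[0]
    return sha256_of(checkpoint_path) == expected
```

Usage would then be `verify("best_model.pth", "sha256.txt")`, run from the repository root.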
## Training Setup
Training settings from the released run:
- GPUs: 1 GPU
- Per-step batch size: 64
- Gradient accumulation: 4
- Effective batch size: 256
- Epochs: 150
- Optimizer: Adam
- Weight decay: 1e-4
- Learning rate: 6e-4
- Warmup epochs: 5
- Mixed precision: AMP enabled
- `torch.compile`: disabled
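Two of these settings are worth spelling out: the effective batch size of 256 comes from accumulating gradients over 4 micro-batches of 64, and the learning rate of 6e-4 is reached after 5 warmup epochs. A minimal sketch of the arithmetic; the linear warmup shape and the constant post-warmup rate are assumptions, since the release does not state the exact schedule:

```python
BASE_LR = 6e-4        # peak learning rate from the released run
WARMUP_EPOCHS = 5     # warmup epochs from the released run
PER_STEP_BATCH = 64   # per-step batch size
ACCUM_STEPS = 4       # gradient accumulation steps

# Each optimizer update averages gradients over ACCUM_STEPS micro-batches,
# so one update sees 64 * 4 = 256 samples.
EFFECTIVE_BATCH = PER_STEP_BATCH * ACCUM_STEPS


def lr_at_epoch(epoch: int) -> float:
    """Linear warmup to BASE_LR over WARMUP_EPOCHS, then constant.

    The constant tail is an assumption; the release only documents the
    peak rate and the warmup length, not the decay schedule.
    """
    if epoch < WARMUP_EPOCHS:
        return BASE_LR * (epoch + 1) / WARMUP_EPOCHS
    return BASE_LR
```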
## Evaluation Result
| Dataset | Split | Metric | Value |
|---|---|---|---|
| LRW | Validation | Top-1 Accuracy | 83.91% |
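For reference, top-1 accuracy here is simply the fraction of validation clips whose highest-scoring class (out of the 500 LRW words) matches the ground-truth label. A minimal, framework-free sketch of the metric:

```python
def top1_accuracy(logits, labels):
    """Fraction of samples whose argmax class matches the label.

    `logits` is a list of per-class score lists (500 entries each for LRW);
    `labels` is the list of ground-truth class indices.
    """
    correct = sum(
        1 for scores, y in zip(logits, labels)
        if max(range(len(scores)), key=scores.__getitem__) == y
    )
    return correct / len(labels)
```

For example, `top1_accuracy([[0.1, 0.9], [0.8, 0.2]], [1, 1])` returns `0.5`: the first sample's argmax matches its label, the second does not.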
## Intended Use
This checkpoint is intended for:
- research reproduction
- benchmark comparison
- qualitative inference demos
It is not intended as a production-ready commercial lip-reading system.
## Limitations
- Performance may vary across preprocessing pipelines
- This release does not include the raw LRW dataset
- Users must obtain the LRW dataset themselves, in accordance with its own terms of use
## Usage

Example inference command:

```bash
python inference.py \
  --dataset lrw \
  --video_path /path/to/sample.mp4 \
  --checkpoint /path/to/best_model.pth \
  --gpu 0
```
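For programmatic use, the checkpoint can also be loaded directly with PyTorch. A minimal sketch, assuming the file stores either a bare `state_dict` or a dict with a `"state_dict"` key; the actual layout of `best_model.pth` and the model class name below are assumptions:

```python
import torch


def load_checkpoint(path: str, device: str = "cpu"):
    """Load the released checkpoint onto `device` and return its state_dict.

    Handles both a bare state_dict and a wrapper dict with a
    "state_dict" key; the real layout of best_model.pth may differ.
    """
    ckpt = torch.load(path, map_location=device)
    if isinstance(ckpt, dict) and "state_dict" in ckpt:
        return ckpt["state_dict"]
    return ckpt


# Hypothetical usage; `ThreeDCvT` is a placeholder for the repository's
# actual model class:
# model = ThreeDCvT(num_classes=500)
# model.load_state_dict(load_checkpoint("best_model.pth"))
# model.eval()
```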
## Notes

- The checkpoint is released for reproducibility
- Please use the matching code version when possible
- Local source artifact names were `best_model_for_lrw.pth` and `train_lrw.log`
## Citation

If you use this release, please cite the original paper:

```bibtex
@article{wu2022lip,
  title={A Lip Reading Method Based on 3D Convolutional Vision Transformer},
  author={Wu, Jiafeng and others},
  journal={IEEE Access},
  year={2022}
}
```