---
license: cc-by-nc-4.0
task_categories:
- image-segmentation
tags:
- mirror-detection
- video-understanding
- video-mirror-detection
- scene-understanding
- pytorch
pretty_name: VMD-Net (Video Mirror Detection Network)
---

# VMD-Net: Video Mirror Detection Network

Pre-trained weights for **VMD-Net**, introduced in:

> **Learning to Detect Mirrors from Videos via Dual Correspondences**
> Jiaying Lin\*, Xin Tan\*, Rynson W. H. Lau
> CVPR 2023
> [Paper](https://openaccess.thecvf.com/content/CVPR2023/papers/Lin_Learning_To_Detect_Mirrors_From_Videos_via_Dual_Correspondences_CVPR_2023_paper.pdf) · [Project Page](https://jiaying.link/cvpr2023-vmd/) · [Dataset (VMD-D)](https://huggingface.co/datasets/garrying/VMD-D)

## Model Summary

VMD-Net detects mirrors in video sequences by exploiting **dual correspondences**, both intra-frame (spatial) and inter-frame (temporal), via a Relation Attention module built on a DeepLabV3 encoder backbone. This design lets the model handle frames where intra-frame mirror cues are weak or absent, producing accurate and temporally consistent segmentation masks.
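
To make the idea of an inter-frame correspondence more concrete, the sketch below computes plain scaled dot-product cross-attention between the feature maps of two frames. It is an illustration only and is not the Relation Attention module from the paper or the code release; the function name `cross_frame_attention` and the tensor shapes are assumptions made for this example.

```python
import torch


def cross_frame_attention(feat_a: torch.Tensor, feat_b: torch.Tensor) -> torch.Tensor:
    """Attend from every location in frame A to every location in frame B.

    Hypothetical helper for illustration. feat_a and feat_b are (B, C, H, W)
    feature maps, e.g. encoder outputs for two frames of the same video.
    """
    b, c, h, w = feat_a.shape
    q = feat_a.flatten(2).transpose(1, 2)  # (B, H*W, C) queries from frame A
    k = feat_b.flatten(2)                  # (B, C, H*W) keys from frame B
    v = feat_b.flatten(2).transpose(1, 2)  # (B, H*W, C) values from frame B

    # Pairwise similarity between locations of the two frames.
    attn = torch.softmax(torch.bmm(q, k) / c ** 0.5, dim=-1)  # (B, H*W, H*W)

    # Aggregate frame-B features for each frame-A location.
    out = torch.bmm(attn, v)               # (B, H*W, C)
    return out.transpose(1, 2).reshape(b, c, h, w)


# Random tensors stand in for encoder features of two frames.
frame_a = torch.randn(1, 8, 4, 4)
frame_b = torch.randn(1, 8, 4, 4)
fused = cross_frame_attention(frame_a, frame_b)
print(fused.shape)  # torch.Size([1, 8, 4, 4])
```

Passing the same tensor for both arguments gives the intra-frame (spatial) case, which is one simple way to picture how the two kinds of correspondence relate.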
| File | Description |
|------|-------------|
| `best.pth` | Best checkpoint (714 MB), saved as `{'model': state_dict, ...}` |
| `results/results.zip` | VMD-Net predictions on the VMD-D test set |
| `results/baseline_results.zip` | Baseline method predictions for comparison |

## Loading the Weights

```python
import torch
from networks.VMD_network import VMD_Network  # from the code release

model = VMD_Network()
checkpoint = torch.load("best.pth", map_location="cpu")
model.load_state_dict(checkpoint["model"])
model.eval()
```

Download the checkpoint (the command below saves it to `./weights/best.pth`, so adjust the path passed to `torch.load` accordingly):

```bash
huggingface-cli download garrying/VMD-Net best.pth --local-dir ./weights
```

## Training Dataset

This model was trained and evaluated on **VMD-D**, the first large-scale video mirror detection dataset:

- 14,987 frames from 269 videos with manually annotated binary masks
- Available at [garrying/VMD-D](https://huggingface.co/datasets/garrying/VMD-D)

## Citation

```bibtex
@InProceedings{Lin_2023_CVPR,
    author    = {Lin, Jiaying and Tan, Xin and Lau, Rynson W.H.},
    title     = {Learning To Detect Mirrors From Videos via Dual Correspondences},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2023},
    pages     = {9109-9118}
}
```

## License

Non-commercial use only: [CC BY-NC 4.0](https://creativecommons.org/licenses/by-nc/4.0/).
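
## Loading Notes

As a supplement to the loading snippet under Loading the Weights, the sketch below shows one defensive way to pull the weights out of a checkpoint dictionary before calling `load_state_dict`. The helper name `extract_state_dict` is hypothetical, and the `module.` prefix handling is an assumption: it only matters if a checkpoint was saved from a model wrapped in `torch.nn.DataParallel`, which this card does not state.

```python
from collections import OrderedDict


def extract_state_dict(checkpoint, key="model", prefix="module."):
    """Return the weights stored under `key`, with `prefix` stripped from names.

    Hypothetical helper: the default `key` follows the checkpoint layout
    listed in the file table; the prefix handling is an assumption.
    """
    # Fall back to treating the whole object as a state dict if `key` is absent.
    state = checkpoint[key] if key in checkpoint else checkpoint
    cleaned = OrderedDict()
    for name, tensor in state.items():
        if name.startswith(prefix):
            name = name[len(prefix):]
        cleaned[name] = tensor
    return cleaned


# Toy checkpoint with placeholder values instead of tensors.
toy = {"model": {"module.encoder.weight": 1, "module.encoder.bias": 2}, "epoch": 10}
print(list(extract_state_dict(toy)))  # ['encoder.weight', 'encoder.bias']
```

With the real checkpoint, the returned dictionary can be passed to `model.load_state_dict(...)` in place of `checkpoint["model"]`.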