---
language:
- en
license: mit
metrics:
- mean_iou
- AbsRel
- AP50
tags:
- Semantic_Future_Prediction
- World Models
pipeline_tag: image-feature-extraction
library_name: pytorch
---

# DINO-Foresight: Looking into the Future with DINO (NeurIPS 2025)

![image](https://cdn-uploads.huggingface.co/production/uploads/677272184d148b904333e874/btsV1EHK4yYiyboMjG7oj.png)

DINO-Foresight is a novel framework that operates in the semantic feature space of pretrained Vision Foundation Models (VFMs) to predict future dynamics. It trains a masked feature transformer in a self-supervised manner to forecast the evolution of VFM features over time, enabling various scene understanding tasks through off-the-shelf, task-specific heads.

## Paper

[DINO-Foresight: Looking into the Future with DINO](https://huggingface.co/papers/2412.11673)

## Project Page

[https://dino-foresight.github.io](https://dino-foresight.github.io)

## Code

The official implementation can be found on GitHub: [https://github.com/Sta8is/DINO-Foresight](https://github.com/Sta8is/DINO-Foresight)

## Sample Usage

The model is built with PyTorch. You can set up the environment and install dependencies as follows:

```bash
conda create -n dinof python=3.11
conda activate dinof
pip install torch==2.2.0 torchvision==0.17.0 torchaudio==2.2.0 --index-url https://download.pytorch.org/whl/cu121
git clone https://github.com/Sta8is/DINO-Foresight
cd DINO-Foresight
pip install -r requirements.txt
```

For detailed usage examples and model training, refer to the provided demos and the GitHub repository.

## Demo

We provide 2 quick demos.

- [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1K8o3-zkR5lCdAv7Qjz7rhWP8n2VrFOdz?usp=sharing)
- [Demo](https://github.com/Sta8is/DINO-Foresight/blob/main/demo.ipynb).

## Citation

If you found DINO-Foresight useful in your research, please consider starring ⭐ us on GitHub and citing 📚 us in your research!
```bibtex
@inproceedings{karypidis2025dinoforesight,
  title={{DINO}-Foresight: Looking into the Future with {DINO}},
  author={Efstathios Karypidis and Ioannis Kakogeorgiou and Spyros Gidaris and Nikos Komodakis},
  booktitle={The Thirty-ninth Annual Conference on Neural Information Processing Systems},
  year={2025},
  url={https://arxiv.org/abs/2412.11673}
}
```

## Acknowledgements

Our code is partially based on [Maskgit-pytorch](https://github.com/valeoai/Maskgit-pytorch), a PyTorch implementation of MaskGIT by ValeoAI. We also thank the authors of [DINOv2](https://github.com/facebookresearch/dinov2), [DPT](https://github.com/isl-org/DPT), [DepthAnythingV2](https://github.com/DepthAnything/Depth-Anything-V2), and [LOTUS](https://github.com/EnVision-Research/Lotus) for their work and open-source code.