---
language:
- en
license: mit
metrics:
- mean_iou
- AbsRel
- AP50
tags:
- Semantic_Future_Prediction
- World Models
pipeline_tag: image-feature-extraction
library_name: pytorch
---

# DINO-Foresight: Looking into the Future with DINO (NeurIPS 2025)

DINO-Foresight is a framework that predicts future dynamics in the semantic feature space of pretrained Vision Foundation Models (VFMs). It trains a masked feature transformer in a self-supervised manner to forecast how VFM features evolve over time, so that off-the-shelf, task-specific heads can be applied to the predicted features for various scene understanding tasks.
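
To make the idea of masked feature forecasting concrete, here is a minimal, illustrative PyTorch sketch. It is not the official implementation: the class name `MaskedFeatureForecaster`, the helper `training_step`, the tiny dimensions, the use of `nn.TransformerEncoder`, and the smooth L1 loss are all assumptions chosen for brevity, and random tensors stand in for frozen VFM features. Refer to the GitHub repository for the actual architecture and training code.

```python
import torch
import torch.nn as nn


class MaskedFeatureForecaster(nn.Module):
    """Toy model (hypothetical): fills in masked future-frame tokens from context-frame tokens."""

    def __init__(self, feat_dim=32, num_frames=3, tokens_per_frame=4, depth=2, heads=4):
        super().__init__()
        self.tokens_per_frame = tokens_per_frame
        # Learnable token that stands in for every unknown (future) feature.
        self.mask_token = nn.Parameter(torch.zeros(1, 1, feat_dim))
        # One learnable position embedding per (frame, token) slot.
        self.pos_embed = nn.Parameter(torch.zeros(1, num_frames * tokens_per_frame, feat_dim))
        layer = nn.TransformerEncoderLayer(
            d_model=feat_dim, nhead=heads, dim_feedforward=4 * feat_dim, batch_first=True
        )
        self.encoder = nn.TransformerEncoder(layer, num_layers=depth)
        self.head = nn.Linear(feat_dim, feat_dim)

    def forward(self, context):
        # context: (batch, (num_frames - 1) * tokens_per_frame, feat_dim)
        batch = context.shape[0]
        future = self.mask_token.expand(batch, self.tokens_per_frame, -1)
        # Append mask tokens for the future frame, then add position embeddings.
        tokens = torch.cat([context, future], dim=1) + self.pos_embed
        encoded = self.encoder(tokens)
        # Only the positions of the masked (future) frame are decoded.
        return self.head(encoded[:, -self.tokens_per_frame:])


def training_step(model, features, optimizer):
    """One self-supervised step: the last frame's features are the regression target."""
    n = model.tokens_per_frame
    context, target = features[:, :-n], features[:, -n:]
    loss = nn.functional.smooth_l1_loss(model(context), target)  # loss choice is an assumption
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()


if __name__ == "__main__":
    torch.manual_seed(0)
    model = MaskedFeatureForecaster()
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
    # Random stand-in for VFM features: 2 clips, 3 frames x 4 tokens, 32 channels.
    features = torch.randn(2, 12, 32)
    print("loss:", training_step(model, features, optimizer))
```

In this sketch the forecaster never sees pixels: inputs and targets are both feature tensors, which is what allows a task head trained on present-frame features to be reused on the predicted ones.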

## Paper
[DINO-Foresight: Looking into the Future with DINO](https://huggingface.co/papers/2412.11673)

## Project Page
[https://dino-foresight.github.io](https://dino-foresight.github.io)

## Code
The official implementation can be found on GitHub: [https://github.com/Sta8is/DINO-Foresight](https://github.com/Sta8is/DINO-Foresight)

## Sample Usage

The model is built with PyTorch. You can set up the environment and install dependencies as follows:

```bash
# Create and activate a dedicated conda environment
conda create -n dinof python=3.11
conda activate dinof
# Install PyTorch 2.2.0 built against CUDA 12.1
pip install torch==2.2.0 torchvision==0.17.0 torchaudio==2.2.0 --index-url https://download.pytorch.org/whl/cu121
# Fetch the code and install the remaining dependencies
git clone https://github.com/Sta8is/DINO-Foresight
cd DINO-Foresight
pip install -r requirements.txt
```
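
After installation, a quick sanity check can confirm that the environment is usable before you run the demos. The snippet below is a small sketch, not part of the repository; `check_environment` is a hypothetical helper that reports the Python version, whether the listed packages are importable, and (if PyTorch is present) its version and whether CUDA is visible.

```python
import importlib.util
import sys


def check_environment(packages=("torch", "torchvision")):
    """Report the Python version and whether each package can be imported."""
    report = {"python": "{}.{}".format(*sys.version_info[:2])}
    for name in packages:
        # find_spec returns None when a top-level package is not installed.
        report[name] = importlib.util.find_spec(name) is not None
    if report.get("torch"):
        import torch

        report["torch_version"] = torch.__version__
        report["cuda_available"] = torch.cuda.is_available()
    return report


if __name__ == "__main__":
    for key, value in check_environment().items():
        print(f"{key}: {value}")
```

If `cuda_available` is reported as `False` on a GPU machine, the installed PyTorch wheel and the local NVIDIA driver are a likely place to start looking.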

For detailed usage examples and model training, refer to the provided demos and the GitHub repository.

## Demo
We provide two quick demos:

- [Colab notebook](https://colab.research.google.com/drive/1K8o3-zkR5lCdAv7Qjz7rhWP8n2VrFOdz?usp=sharing)
- [Jupyter notebook (`demo.ipynb`)](https://github.com/Sta8is/DINO-Foresight/blob/main/demo.ipynb)

## Citation
If you find DINO-Foresight useful in your research, please consider starring ⭐ the GitHub repository and citing 📚 our paper:

```bibtex
@inproceedings{karypidis2025dinoforesight,
title={{DINO}-Foresight: Looking into the Future with {DINO}},
author={Efstathios Karypidis and Ioannis Kakogeorgiou and Spyros Gidaris and Nikos Komodakis},
booktitle={The Thirty-ninth Annual Conference on Neural Information Processing Systems},
year={2025},
url={https://arxiv.org/abs/2412.11673}
}
```

## Acknowledgements
Our code is partially based on [Maskgit-pytorch](https://github.com/valeoai/Maskgit-pytorch), a PyTorch implementation of MaskGIT by ValeoAI. We also thank the authors of [DINOv2](https://github.com/facebookresearch/dinov2), [DPT](https://github.com/isl-org/DPT), [DepthAnythingV2](https://github.com/DepthAnything/Depth-Anything-V2), and [LOTUS](https://github.com/EnVision-Research/Lotus) for their work and open-source code.