---
language:
- en
license: mit
metrics:
- mean_iou
- AbsRel
- AP50
tags:
- Semantic_Future_Prediction
- World Models
pipeline_tag: image-feature-extraction
library_name: pytorch
---
# DINO-Foresight: Looking into the Future with DINO (NeurIPS 2025)
![image](https://cdn-uploads.huggingface.co/production/uploads/677272184d148b904333e874/btsV1EHK4yYiyboMjG7oj.png)
DINO-Foresight is a novel framework that operates in the semantic feature space of pretrained Vision Foundation Models (VFMs) to predict future dynamics. It trains a masked feature transformer in a self-supervised manner to forecast the evolution of VFM features over time, enabling various scene understanding tasks through off-the-shelf, task-specific heads.
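The masked-forecasting idea described above can be illustrated with a small, dependency-free sketch. This is not the authors' implementation: the helper names `future_frame_mask` and `masked_mse`, the use of plain Python lists in place of feature tensors, and the choice of a mean-squared-error loss are all assumptions made for illustration. It only shows the general pattern of hiding the feature tokens of future frames and scoring predictions at the hidden positions.

```python
def future_frame_mask(num_frames, tokens_per_frame, num_context):
    """Flat mask over a (frames x tokens) sequence: True marks tokens to predict.

    Hypothetical helper: tokens of the first `num_context` frames stay visible,
    tokens of all later (future) frames are masked.
    """
    if not 0 < num_context < num_frames:
        raise ValueError("need at least one context frame and one future frame")
    return [t >= num_context
            for t in range(num_frames)
            for _ in range(tokens_per_frame)]


def masked_mse(pred, target, mask):
    """Mean squared error over masked tokens only (loss choice is an assumption)."""
    total, count = 0.0, 0
    for p, y, m in zip(pred, target, mask):
        if m:  # visible context tokens do not contribute to the loss
            total += sum((a - b) ** 2 for a, b in zip(p, y))
            count += len(p)
    return total / count


# Toy example: 3 frames, 2 tokens per frame, 2-dimensional features.
mask = future_frame_mask(num_frames=3, tokens_per_frame=2, num_context=2)
target = [[5.0, 5.0]] * 4 + [[1.0, 1.0]] * 2   # stand-in for VFM features
pred = [[0.0, 0.0]] * 6                        # stand-in for model output
loss = masked_mse(pred, target, mask)
print(mask, loss)
```

In this toy setup only the last frame's two tokens enter the loss, so errors on the context frames are ignored.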
## Paper
[DINO-Foresight: Looking into the Future with DINO](https://huggingface.co/papers/2412.11673)
## Project Page
[https://dino-foresight.github.io](https://dino-foresight.github.io)
## Code
The official implementation can be found on GitHub: [https://github.com/Sta8is/DINO-Foresight](https://github.com/Sta8is/DINO-Foresight)
## Sample Usage
The model is built with PyTorch. You can set up the environment and install dependencies as follows:
```bash
conda create -n dinof python=3.11
conda activate dinof
pip install torch==2.2.0 torchvision==0.17.0 torchaudio==2.2.0 --index-url https://download.pytorch.org/whl/cu121
git clone https://github.com/Sta8is/DINO-Foresight
cd DINO-Foresight
pip install -r requirements.txt
```
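After installation, it may help to confirm that the pinned versions were actually picked up. The snippet below is a minimal sketch using only the Python standard library; `check_environment` is a hypothetical helper, not part of the DINO-Foresight repository, and the expected versions are copied from the install command above.

```python
from importlib import metadata

# Versions pinned by the install command above.
EXPECTED = {"torch": "2.2.0", "torchvision": "0.17.0", "torchaudio": "2.2.0"}


def check_environment(expected=EXPECTED):
    """Return {package: (installed_version_or_None, matches_expected)}."""
    report = {}
    for name, wanted in expected.items():
        try:
            found = metadata.version(name)
        except metadata.PackageNotFoundError:
            found = None
        # CUDA wheels may carry a local suffix such as "+cu121"; compare the base version.
        ok = found is not None and found.split("+")[0] == wanted
        report[name] = (found, ok)
    return report


if __name__ == "__main__":
    for name, (found, ok) in check_environment().items():
        status = "ok" if ok else "mismatch or missing"
        print(f"{name}: installed={found} -> {status}")
```

Run it inside the activated `dinof` environment; any package reported as missing or mismatched suggests the install step should be repeated.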
For detailed usage examples and model training, refer to the provided demos and the GitHub repository.
## Demo
We provide two quick demos:
- Colab notebook: [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1K8o3-zkR5lCdAv7Qjz7rhWP8n2VrFOdz?usp=sharing)
- Jupyter notebook in the repository: [demo.ipynb](https://github.com/Sta8is/DINO-Foresight/blob/main/demo.ipynb)
## Citation
If you find DINO-Foresight useful in your research, please consider starring ⭐ the repository on GitHub and citing 📚 our paper!
```bibtex
@inproceedings{karypidis2025dinoforesight,
title={{DINO}-Foresight: Looking into the Future with {DINO}},
author={Efstathios Karypidis and Ioannis Kakogeorgiou and Spyros Gidaris and Nikos Komodakis},
booktitle={The Thirty-ninth Annual Conference on Neural Information Processing Systems},
year={2025},
url={https://arxiv.org/abs/2412.11673}
}
```
## Acknowledgements
Our code is partially based on [Maskgit-pytorch](https://github.com/valeoai/Maskgit-pytorch), a PyTorch implementation of MaskGIT by ValeoAI. We also thank the authors of [DINOv2](https://github.com/facebookresearch/dinov2), [DPT](https://github.com/isl-org/DPT), [DepthAnythingV2](https://github.com/DepthAnything/Depth-Anything-V2), and [LOTUS](https://github.com/EnVision-Research/Lotus) for their work and open-source code.