---
license: apache-2.0
pipeline_tag: text-to-video
---

# HyDRA: Out of Sight but Not Out of Mind

This repository contains the weights for HyDRA (Hybrid Memory for Dynamic Video World Models), as presented in the paper Out of Sight but Not Out of Mind: Hybrid Memory for Dynamic Video World Models.

HyDRA is a novel memory architecture that enables video world models to simultaneously act as precise archivists for static backgrounds and vigilant trackers for dynamic subjects. This ensures visual and motion continuity even when subjects temporarily move out of the camera's field of view.
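To make the "archivist plus tracker" idea concrete, here is a minimal, purely illustrative sketch of a hybrid memory. The class and method names are assumptions for illustration, not the actual HyDRA API: static background regions are archived once and kept stable, while dynamic subjects are continuously updated so their state survives even when they leave the frame.

```python
# Conceptual sketch only; names are assumptions, not the HyDRA implementation.
class HybridMemory:
    def __init__(self):
        self.static_archive = {}   # long-term store for background regions
        self.dynamic_tracks = {}   # per-subject state, kept even off-screen

    def observe(self, region_id, content, is_subject=False):
        if is_subject:
            # Vigilant tracker: dynamic subjects are always updated, so
            # identity and motion persist while they are out of view.
            self.dynamic_tracks[region_id] = content
        else:
            # Precise archivist: a background region is stored once and
            # later observations do not overwrite it.
            self.static_archive.setdefault(region_id, content)

    def recall(self, region_id):
        # Subjects take precedence; fall back to the archived background.
        return self.dynamic_tracks.get(
            region_id, self.static_archive.get(region_id)
        )

memory = HybridMemory()
memory.observe("sky", "blue gradient")
memory.observe("runner", {"pos": (3, 4)}, is_subject=True)
memory.observe("runner", {"pos": (9, 9)}, is_subject=True)  # tracker updates
memory.observe("sky", "noisy re-observation")  # archivist keeps the original
```

The real model operates on latent video features rather than dictionaries, but the division of labor is the same: stable storage for what does not change, continuous tracking for what does.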

Project Page | [GitHub](https://github.com/H-EmbodVis/HyDRA)

## Installation

To set up the environment, clone the repository and install the dependencies:

```shell
git clone https://github.com/H-EmbodVis/HyDRA.git
cd HyDRA
conda create -n hydra python=3.10 -y
conda activate hydra
pip install -r requirements.txt
```

## Inference

HyDRA is built upon the Wan2.1 (1.3B) T2V model. Ensure you have downloaded the required weights into the ./ckpts directory as described in the official repository.
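Before launching inference, it can help to verify the weights are actually in place. Here is a small pre-flight check; the checkpoint path follows the training command shown later in this README and may differ in your setup:

```python
from pathlib import Path

# Path taken from the training command in this README; adjust if your
# layout differs.
ckpt = Path("./ckpts/Wan2.1-T2V-1.3B/diffusion_pytorch_model.safetensors")

def weights_ready(path: Path) -> bool:
    """Return True only if the checkpoint file exists and is non-empty."""
    return path.is_file() and path.stat().st_size > 0

if not weights_ready(ckpt):
    print(f"Missing weights: {ckpt} - download them into ./ckpts first.")
```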

You can run inference on the example data using the following command:

```shell
python infer_hydra.py
```

## Training

The model can be trained on custom datasets using the provided training script:

```shell
python train_hydra.py \
  --dit_path ./ckpts/Wan2.1-T2V-1.3B/diffusion_pytorch_model.safetensors \
  --use_gradient_checkpointing \
  --hydra
```
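For reference, the flags in the command above can be read as follows. This is a hedged sketch of how such a CLI might be parsed; the argument names mirror the README command, but the real `train_hydra.py` may define them differently:

```python
import argparse

# Illustrative parser only; flag names are taken from the command in this
# README, not from the actual train_hydra.py source.
parser = argparse.ArgumentParser(description="HyDRA training (sketch)")
parser.add_argument("--dit_path", type=str, required=True,
                    help="Path to the Wan2.1 DiT weights (.safetensors)")
parser.add_argument("--use_gradient_checkpointing", action="store_true",
                    help="Recompute activations in backward to save memory")
parser.add_argument("--hydra", action="store_true",
                    help="Enable the HyDRA hybrid-memory module")

args = parser.parse_args([
    "--dit_path",
    "./ckpts/Wan2.1-T2V-1.3B/diffusion_pytorch_model.safetensors",
    "--use_gradient_checkpointing",
    "--hydra",
])
```

Both boolean flags are simple on/off switches: omitting `--hydra` would presumably train the base model without the hybrid-memory module.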

## Citation

If you find this work useful, please consider citing:

```bibtex
@article{chen2026out,
  title   = {Out of Sight but Not Out of Mind: Hybrid Memory for Dynamic Video World Models},
  author  = {Chen, Kaijin and Liang, Dingkang and Zhou, Xin and Ding, Yikang and Liu, Xiaoqiang and Wan, Pengfei and Bai, Xiang},
  journal = {arXiv preprint arXiv:2603.25716},
  year    = {2026}
}
```