Steady-Forcing: Balancing Spatial Persistence and Motion Continuity in Long-Horizon Nature Video Diffusion
Authors
Matiur Rahman Minar¹, Seunghun Oh², Ganghyeon Jeong², Unsang Park¹,²
¹ Department of Computer Science and Engineering, Sogang University; ² Department of Artificial Intelligence, Sogang University
Progress
- Technical Report / Paper
- Project Homepage
- Training & Inference Code
- Pretrained Model: T2V-1.3B
Overview
Steady-Forcing produces long-horizon nature video rollouts from a fixed-camera view. It decouples spatial persistence from motion continuity via a structural dual-memory protocol. This enables stable backgrounds and sustained fluid motion.
TL;DR: We propose a dual-memory framework that balances stability and motion to sustain high background persistence and continuous fluid dynamics over multi-minute horizons for fixed-camera nature video generation.
Demo
https://minar09.github.io/steadyforcing/
Requirements
| Requirement | Specification |
|---|---|
| GPU | NVIDIA GPU with ≥ 24 GB VRAM (tested on A100 80 GB) |
| OS | Linux |
| Python | 3.10 |
| Training Setup | 8 × A100 GPUs (for full training run) |
Other hardware configurations may work but have not been tested.
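The VRAM requirement in the table can be checked before installing anything. The following is a minimal sketch, not part of the repository: the helper names are hypothetical, it assumes `nvidia-smi` is on the `PATH`, and it reads "24 GB" as 24 × 10⁹ bytes, which is an assumption about how the table should be interpreted.

```python
import shutil
import subprocess

MIN_VRAM_GB = 24  # threshold taken from the requirements table


def parse_total_vram_mib(smi_output: str) -> list[int]:
    """Parse one MiB value per line, as printed by nvidia-smi with 'nounits'."""
    values = []
    for line in smi_output.splitlines():
        text = line.strip()
        if text.isdigit():  # skip blank lines and entries such as "[N/A]"
            values.append(int(text))
    return values


def has_enough_vram(vram_mib: list[int], min_gb: float = MIN_VRAM_GB) -> bool:
    """True if at least one GPU meets the threshold (GB read as 10**9 bytes)."""
    min_mib = min_gb * 1e9 / 2**20
    return any(mib >= min_mib for mib in vram_mib)


def query_vram_mib() -> list[int]:
    """Ask nvidia-smi for total memory per GPU; returns [] if it is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.total", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        check=False,
    )
    return parse_total_vram_mib(result.stdout) if result.returncode == 0 else []
```

Calling `has_enough_vram(query_vram_mib())` returns `False` on machines without an NVIDIA driver, so the check fails closed rather than raising.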
Installation
Clone the repository and set up the environment in one step:
git clone https://github.com/minar09/steady-forcing.git
cd steady-forcing
bash setup_env.sh
This script creates a Python 3.10 environment, installs all dependencies from requirements.txt, and downloads the required base models.
Alternatively, a Dockerfile is provided for containerized setups:
docker build -t steady-forcing .
Pretrained Checkpoints
Download
huggingface-cli download minar09/Steady-Forcing-T2V-1.3B --local-dir ./ckpt
Or using the Python API:
from huggingface_hub import snapshot_download
snapshot_download(repo_id="minar09/Steady-Forcing-T2V-1.3B", local_dir="./ckpt")
Note: Training uses data-free distillation, so no video dataset is required.
File Structure
After downloading, organize your working directory as follows:
steady-forcing/
├── prompts/                     # Example text prompts
├── configs/                     # Model and training configs
├── pipeline/                    # Inference pipeline
├── trainer/                     # Training modules
├── demo_utils/                  # Demo helper utilities
├── scripts/                     # Utility scripts
├── templates/                   # Prompt templates
├── ckpt/
│   └── steady-forcing-t2v.pt    # Main model checkpoint
├── inference.py
├── inference.sh
├── train.py
├── train.sh
├── demo.py
└── setup.py
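A quick way to confirm the layout above before running inference is to look for the entries that matter. This is a minimal sketch with a hypothetical helper, `missing_paths`; the path list is copied from the tree, and which entries are strictly required at runtime is an assumption.

```python
from pathlib import Path

# Entries copied from the directory tree above; trim the list to what you use.
EXPECTED_PATHS = [
    "configs",
    "pipeline",
    "prompts",
    "ckpt/steady-forcing-t2v.pt",
    "inference.py",
    "inference.sh",
]


def missing_paths(root: str, expected: list[str] = EXPECTED_PATHS) -> list[str]:
    """Return the expected entries that do not exist under root."""
    base = Path(root)
    return [rel for rel in expected if not (base / rel).exists()]
```

Running `missing_paths(".")` from the repository root returns an empty list when everything is in place, and otherwise names what is absent, for example the checkpoint if the download went to a different directory.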
Inference
Quick Start
bash inference.sh
Custom Prompt Inference
from pipeline import SteadyForcingPipeline
pipe = SteadyForcingPipeline.from_pretrained("minar09/Steady-Forcing-T2V-1.3B")
prompt = """A serene woodland stream scene recorded by a completely fixed, static,
tripod mounted camera. A narrow stream of clear water flows continuously from the
upper part of the frame toward the lower edge, winding gently between moss-covered
rocks and grassy banks. [60s]"""
negative_prompt = """solid water, metallic water, water stagnation, color drift,
frozen motion, camera movement, zooming, panning, visual artifacts, unnatural water,
unnatural waves, unnatural flow, unnatural motion, human, animal"""
video = pipe(
    prompt=prompt,
    negative_prompt=negative_prompt,
    num_frames=480,  # 60 s at 8 fps
).frames
video[0].save("output.mp4")
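The `num_frames` value in the example follows from the clip length and the frame rate (60 s × 8 fps = 480). A small helper makes that arithmetic explicit for other durations. This is a sketch only: `frames_for_duration` is a hypothetical name, the default of 8 fps comes from the comment in the example, and whether the pipeline accepts arbitrary frame counts is not confirmed here.

```python
def frames_for_duration(seconds: float, fps: int = 8) -> int:
    """Number of frames for a clip of the given length at the given frame rate."""
    if seconds <= 0 or fps <= 0:
        raise ValueError("seconds and fps must be positive")
    return round(seconds * fps)


print(frames_for_duration(60))   # 480, matching the example above
print(frames_for_duration(120))  # 960
```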
Recommended Negative Prompt
For best results, always include this negative prompt to suppress common failure modes:
่ฒ่ฐ่ณไธฝ๏ผ่ฟๆ๏ผ็ป่ๆจก็ณไธๆธ
๏ผๅญๅน๏ผ้ฃๆ ผ๏ผไฝๅ๏ผ็ปไฝ๏ผ็ป้ข๏ผ้ๆญข๏ผๆดไฝๅ็ฐ๏ผๆๅทฎ่ดจ้๏ผไฝ่ดจ้๏ผ
JPEGๅ็ผฉๆฎ็๏ผไธ้็๏ผๆฎ็ผบ็๏ผๅคไฝ็ๆๆ๏ผ็ปๅพไธๅฅฝ็ๆ้จ๏ผ็ปๅพไธๅฅฝ็่ธ้จ๏ผ็ธๅฝข็๏ผๆฏๅฎน็๏ผ
ๅฝขๆ็ธๅฝข็่ขไฝ๏ผๆๆ่ๅ๏ผ้ๆญขไธๅจ็็ป้ข๏ผๆไนฑ็่ๆฏ๏ผไธๆก่
ฟ๏ผ่ๆฏไบบๅพๅค๏ผๅ็่ตฐ,
solid water, metallic water, water stagnation, color drift, water flow drift,
water color drift, water surface drift, scene drift, background drift,
frozen motion, camera movement, zooming, panning, camera drift,
visual artifacts, camera, tripod, ground artifacts, anomalous textures,
unrealistic round shaped pattern, localized distortions, unnatural water,
unnatural waves, unnatural flow, unnatural motion, unnatural physics,
unnatural dynamics, unnatural fluidity, unnatural surface, unnatural reflections,
unnatural refractions, unnatural transparency, unnatural opacity,
unnatural viscosity, unnatural turbulence, unnatural splashes, unnatural ripples,
unnatural foam, unnatural spray, unnatural mist, unnatural droplets,
human, animal, repetitive round textures, pond effects, low dynamic degree,
unnatural color, unnatural lighting, unnatural shadows, unnatural highlights,
unnatural contrast, unnatural saturation, unnatural hue, unnatural brightness,
unnatural darkness, unnatural exposure, unnatural noise, unnatural grain,
unnatural blur, unnatural sharpness, unnatural clarity, unnatural detail,
unnatural texture, unnatural pattern, unnatural composition
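When extending the recommended negative prompt with scene-specific terms, it helps to avoid repeating terms that are already present. The sketch below is an illustration, not part of the repository: `merge_negative_prompts` is a hypothetical helper and `lens flare` is a made-up extra term. It splits on both ASCII and full-width commas and rejoins with ASCII commas, which assumes the text encoder treats the two separators alike.

```python
import re


def merge_negative_prompts(*prompts: str) -> str:
    """Join comma-separated term lists, dropping duplicates but keeping order."""
    seen = set()
    merged = []
    for prompt in prompts:
        # Split on ASCII and full-width commas; newlines count as whitespace.
        for term in re.split(r"[,，]", prompt):
            term = " ".join(term.split())
            key = term.lower()
            if term and key not in seen:
                seen.add(key)
                merged.append(term)
    return ", ".join(merged)


base = "solid water, metallic water, camera movement"
extra = "Camera movement, lens flare,\nsolid water"
print(merge_negative_prompts(base, extra))
# solid water, metallic water, camera movement, lens flare
```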
Training
Self-Forcing Training with DMD
bash train.sh
Training completed in under 67 hours on 8 × A100 GPUs. No video dataset is required because the method uses data-free ODE distillation.
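For budgeting, the figures above translate into an upper bound on GPU-hours. This is plain arithmetic on the stated numbers; the helper name is hypothetical, and runs on other hardware will not scale linearly.

```python
def gpu_hours(wall_clock_hours: float, num_gpus: int) -> float:
    """Total GPU-hours consumed by a run that uses all GPUs for its full length."""
    return wall_clock_hours * num_gpus


# "Under 67 hours on 8 GPUs" gives an upper bound for the full training run.
print(gpu_hours(67, 8))  # 536
```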
Running the Demo
python demo.py
Testing Model Loading
python test_loading.py
Trigger Words
Include these keywords in your prompts to activate model-specific conditioning:
| Trigger Word | Purpose |
|---|---|
| Steady-Forcing: Balancing | Core method conditioning |
| Spatial Persistence | Static background anchoring |
| Motion Continuity | Sustained fluid dynamics |
| Long-Horizon | Extended duration generation |
| Nature Video Diffusion | Natural scene domain |
| drift-stagnation | Suppresses visual drift artifacts |
| nature-flow | Natural fluid flow conditioning |
| fixed-camera | Fixed tripod camera constraint |
| static nature | Static environmental anchoring |
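To see which trigger words a prompt already contains, a simple substring check is enough. The sketch below uses hypothetical helpers; it matches case-insensitively and appends missing keywords at the end of the prompt, both of which are assumptions about how the conditioning responds rather than documented behavior.

```python
# Keywords copied from the table above.
TRIGGER_WORDS = [
    "Steady-Forcing: Balancing",
    "Spatial Persistence",
    "Motion Continuity",
    "Long-Horizon",
    "Nature Video Diffusion",
    "drift-stagnation",
    "nature-flow",
    "fixed-camera",
    "static nature",
]


def missing_triggers(prompt: str, triggers: list[str] = TRIGGER_WORDS) -> list[str]:
    """Return the trigger words that do not occur in the prompt (case-insensitive)."""
    lowered = prompt.lower()
    return [t for t in triggers if t.lower() not in lowered]


def with_triggers(prompt: str, triggers: list[str]) -> str:
    """Append any of the given trigger words that the prompt lacks."""
    extra = missing_triggers(prompt, triggers)
    if not extra:
        return prompt
    return f"{prompt.rstrip()} {', '.join(extra)}"
```

For example, `with_triggers("A stream.", ["fixed-camera"])` returns `"A stream. fixed-camera"`, while a prompt that already mentions the keyword is returned unchanged.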
Results
Quantitative and qualitative results are available in the arXiv preprint. For visualizations and video comparisons, visit the project page.
Citation
If you use this model or codebase in your research, please cite:
@article{minar2025steady,
title={Steady-Forcing: Balancing Spatial Persistence and Motion Continuity
in Long-Horizon Nature Video Diffusion},
author={Minar, Matiur Rahman and Oh, Seunghun and Jeong, Ganghyeon and Park, Unsang},
journal={arXiv preprint arXiv:2606.7661673},
year={2026}
}
Acknowledgements
This project builds on the open-source Infinity-RoPE and Reward-Forcing implementations and draws on related work in long-horizon video diffusion, motion continuity, and spatial persistence. We thank the authors for their efforts.