---
license: mit
pipeline_tag: text-to-video
library_name: transformers
---
# VideoPhy: Evaluating Physical Commonsense in Video Generation

This text-to-video model is part of the VideoPhy project, a benchmark for physical commonsense in video generation. Given a text prompt, the model generates a video, and the project evaluates how faithfully the generated videos adhere to real-world physics.
Project Website | Paper | GitHub
For detailed usage instructions, please refer to the project's GitHub repository.