Text-to-Video
Diffusers
ONNX
English
How to use from the Diffusers library
pip install -U diffusers transformers accelerate
import torch
from diffusers import DiffusionPipeline

# Load the pipeline; switch device_map to "mps" for Apple silicon devices
pipe = DiffusionPipeline.from_pretrained("BestWishYsh/OpenS2V-Weight", dtype=torch.bfloat16, device_map="cuda")

prompt = "Astronaut in a jungle, cold color palette, muted colors, detailed, 8k"
image = pipe(prompt).images[0]
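The snippet above is the generic Diffusers loading example. Because OpenS2V targets subject-to-video generation, the pipeline most likely returns a sequence of frames rather than a single image. The following is a minimal sketch of saving such an output to disk, assuming the pipeline output exposes a frames attribute (as most diffusers video pipelines do) and using the export_to_video helper from diffusers.utils; the .frames[0] indexing and the fps value are assumptions, so check the project README for the exact call signature and any subject-reference arguments.

from diffusers.utils import export_to_video

# `pipe` and `prompt` are taken from the snippet above.
# Assumption: the call returns video frames under `.frames`, as most
# diffusers video pipelines do; the exact output attribute may differ.
output = pipe(prompt)
frames = output.frames[0]                      # first video in the batch (assumed)
export_to_video(frames, "output.mp4", fps=16)  # fps chosen for illustration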

OpenS2V-Nexus: A Detailed Benchmark and Million-Scale Dataset for Subject-to-Video Generation

If you like our project, please give us a star ⭐ on GitHub for the latest updates.

✨ Summary

  1. New S2V Benchmark.
    • We introduce OpenS2V-Eval for comprehensive evaluation of S2V models and propose three new automatic metrics aligned with human perception.
  2. New Insights for S2V Model Selection.
    • Our evaluations using OpenS2V-Eval provide crucial insights into the strengths and weaknesses of various subject-to-video generation models.
  3. Million-Scale S2V Dataset.
    • We create OpenS2V-5M, a dataset with 5.1M high-quality regular samples and 0.35M Nexus Data samples; the latter are expected to address the three core challenges of subject-to-video generation.

💡 Description

✏️ Citation

If you find our paper and code useful in your research, please consider giving us a star and a citation.

@article{yuan2025opens2v,
  title={OpenS2V-Nexus: A Detailed Benchmark and Million-Scale Dataset for Subject-to-Video Generation},
  author={Yuan, Shenghai and He, Xianyi and Deng, Yufan and Ye, Yang and Huang, Jinfa and Lin, Bin and Luo, Jiebo and Yuan, Li},
  journal={arXiv preprint arXiv:2505.20292},
  year={2025}
}