# Jhon-Doe-1.0

Jhon-Doe-1.0 is a fine-tuned version of LongCat-Video for text-to-video generation.
## Model Description
- Base Model: meituan-longcat/LongCat-Video
- Training Framework: SimpleTuner
- Hardware: 4x NVIDIA B200 GPUs
- Training Data: [To be described]
- Training Steps: 2500
## Usage
```python
import torch
from diffusers import DiffusionPipeline

# Load the model
pipe = DiffusionPipeline.from_pretrained(
    "tylerxdurden/Jhon-Doe-1.0",
    torch_dtype=torch.bfloat16
)
pipe.to("cuda")

# Generate video
prompt = "A cinematic video of a sunset over the ocean"
video = pipe(
    prompt=prompt,
    num_frames=49,
    guidance_scale=7.5,
    num_inference_steps=50
).frames[0]

# Save video
from diffusers.utils import export_to_video
export_to_video(video, "output.mp4", fps=24)
```
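As a quick sanity check on the generation parameters, the length of the exported clip follows directly from the frame count and the playback rate passed to `export_to_video` (plain arithmetic, no model required; the values 49 and 24 come from the snippet above):

```python
# Clip duration = frames requested / playback frame rate.
num_frames = 49   # frames requested from the pipeline
fps = 24          # playback rate passed to export_to_video
duration_s = num_frames / fps
print(f"{duration_s:.2f} s")  # roughly a 2-second clip
```

Increasing `num_frames` lengthens the clip proportionally, at the cost of more memory and inference time per generation.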
## Training Details

### Training Configuration
- Batch Size: 1
- Gradient Accumulation: 4
- Effective Batch Size: 16 (batch size 1 × 4 GPUs × 4 gradient accumulation steps)
- Learning Rate: 1e-5
- Optimizer: AdamW BF16
- Mixed Precision: BF16
- Gradient Checkpointing: Enabled
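The effective batch size listed above is just the product of the per-device batch size, the GPU count, and the gradient-accumulation steps; a one-line check of the arithmetic:

```python
# Effective batch = per-GPU batch size × number of GPUs × grad-accumulation steps.
per_gpu_batch = 1
num_gpus = 4
grad_accum = 4
effective_batch = per_gpu_batch * num_gpus * grad_accum
print(effective_batch)  # 16, matching the configuration above
```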
### Dataset
[To be described - number of videos, duration, content, etc.]
## Limitations
[To be described]
## License
This model is released under the Apache 2.0 license, following the base model's license.
## Citation

```bibtex
@misc{jhon-doe-2026,
  title={Jhon-Doe-1.0: A Finetuned LongCat-Video Model},
  author={tylerxdurden},
  year={2026},
  url={https://huggingface.co/tylerxdurden/Jhon-Doe-1.0}
}
```
## Acknowledgments
- LongCat-Video by Meituan
- SimpleTuner training framework