# Jhon-Doe-1.0

Jhon-Doe-1.0 is a fine-tuned version of LongCat-Video for text-to-video generation.
## Model Description
- Base Model: meituan-longcat/LongCat-Video
- Training Framework: SimpleTuner
- Hardware: 4x NVIDIA B200 GPUs
- Training Data: [To be described]
- Training Steps: 2500
## Usage
```python
import torch
from diffusers import DiffusionPipeline

# Load the model
pipe = DiffusionPipeline.from_pretrained(
    "tylerxdurden/Jhon-Doe-1.0",
    torch_dtype=torch.bfloat16
)
pipe.to("cuda")

# Generate video
prompt = "A cinematic video of a sunset over the ocean"
video = pipe(
    prompt=prompt,
    num_frames=49,
    guidance_scale=7.5,
    num_inference_steps=50
).frames[0]

# Save video
from diffusers.utils import export_to_video
export_to_video(video, "output.mp4", fps=24)
```
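As a quick sanity check on the generation parameters, the length of the exported clip follows directly from the frame count and the playback rate passed to `export_to_video` (plain arithmetic, no model required; the values 49 and 24 come from the snippet above):

```python
# Clip duration = frames requested / playback frame rate.
num_frames = 49   # frames requested from the pipeline
fps = 24          # playback rate passed to export_to_video
duration_s = num_frames / fps
print(f"{duration_s:.2f} s")  # roughly a 2-second clip
```

Increasing `num_frames` lengthens the clip proportionally, at the cost of more memory and inference time per generation.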
## Training Details

### Training Configuration
- Batch Size: 1
- Gradient Accumulation: 4
- Effective Batch Size: 16 (batch size 1 × 4 GPUs × 4 gradient accumulation steps)
- Learning Rate: 1e-5
- Optimizer: AdamW BF16
- Mixed Precision: BF16
- Gradient Checkpointing: Enabled
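The effective batch size listed above is just the product of the per-device batch size, the GPU count, and the gradient-accumulation steps; a one-line check of the arithmetic:

```python
# Effective batch = per-GPU batch size × number of GPUs × grad-accumulation steps.
per_gpu_batch = 1
num_gpus = 4
grad_accum = 4
effective_batch = per_gpu_batch * num_gpus * grad_accum
print(effective_batch)  # 16, matching the configuration above
```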
### Dataset
[To be described - number of videos, duration, content, etc.]
## Limitations
[To be described]
## License
This model is released under the Apache 2.0 license, following the base model's license.
## Citation

```bibtex
@misc{jhon-doe-2026,
  title={Jhon-Doe-1.0: A Finetuned LongCat-Video Model},
  author={tylerxdurden},
  year={2026},
  url={https://huggingface.co/tylerxdurden/Jhon-Doe-1.0}
}
```
## Acknowledgments
- LongCat-Video by Meituan
- SimpleTuner training framework