Jhon-Doe-1.0

Jhon-Doe-1.0 is a finetuned version of LongCat-Video for text-to-video generation.

Model Description

  • Base Model: meituan-longcat/LongCat-Video
  • Training Framework: SimpleTuner
  • Hardware: 4x NVIDIA B200 GPUs
  • Training Data: [To be described]
  • Training Steps: 2500

Usage

import torch
from diffusers import DiffusionPipeline
from diffusers.utils import export_to_video

# Load the fine-tuned model in bfloat16 and move it to the GPU
pipe = DiffusionPipeline.from_pretrained(
    "tylerxdurden/Jhon-Doe-1.0",
    torch_dtype=torch.bfloat16,
)
pipe.to("cuda")

# Generate a video from a text prompt
prompt = "A cinematic video of a sunset over the ocean"
video = pipe(
    prompt=prompt,
    num_frames=49,
    guidance_scale=7.5,
    num_inference_steps=50,
).frames[0]

# Save the generated frames as an MP4 file
export_to_video(video, "output.mp4", fps=24)
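
As a quick sanity check on the settings in the usage example, the length of the exported clip follows from the frame count and the playback rate. The sketch below is only an illustration; `clip_duration_seconds` is a hypothetical helper introduced here and is not part of diffusers.

```python
def clip_duration_seconds(num_frames: int, fps: float) -> float:
    """Return the playback length in seconds of a clip with num_frames at fps."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return num_frames / fps


# Values taken from the usage example: num_frames=49, fps=24.
duration = clip_duration_seconds(num_frames=49, fps=24)
print(f"{duration:.2f} s")  # prints "2.04 s"
```

Raising `num_frames` or lowering `fps` lengthens the clip; whether the model produces good results at other frame counts is not stated in this card.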

Training Details

Training Configuration

  • Batch Size: 1
  • Gradient Accumulation: 4
  • Effective Batch Size: 16 (batch size 1 per GPU x 4 GPUs x 4 gradient accumulation steps)
  • Learning Rate: 1e-5
  • Optimizer: AdamW BF16
  • Mixed Precision: BF16
  • Gradient Checkpointing: Enabled
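
The effective batch size listed above can be reproduced with a few lines of arithmetic. This is a minimal sketch, not SimpleTuner code: the function names are hypothetical, and the total sample count assumes that the 2500 training steps reported in the model description are optimizer steps, which this card does not confirm.

```python
def effective_batch_size(per_gpu_batch: int, num_gpus: int, grad_accum_steps: int) -> int:
    """Samples contributing to one optimizer update across all GPUs."""
    return per_gpu_batch * num_gpus * grad_accum_steps


def samples_seen(optimizer_steps: int, batch: int) -> int:
    """Total samples processed, assuming each step consumes one effective batch."""
    return optimizer_steps * batch


# Values from this card: batch size 1, 4 GPUs, gradient accumulation 4.
ebs = effective_batch_size(per_gpu_batch=1, num_gpus=4, grad_accum_steps=4)
print(ebs)  # 16

# Assumption: "Training Steps: 2500" counts optimizer steps.
total = samples_seen(optimizer_steps=2500, batch=ebs)
print(total)  # 40000
```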

Dataset

[To be described - number of videos, duration, content, etc.]

Limitations

[To be described]

License

This model is released under the Apache 2.0 license, following the base model's license.

Citation

@misc{jhon-doe-2026,
  title={Jhon-Doe-1.0: A Finetuned LongCat-Video Model},
  author={tylerxdurden},
  year={2026},
  url={https://huggingface.co/tylerxdurden/Jhon-Doe-1.0}
}

Acknowledgments
