import torch
from diffusers import DiffusionPipeline
# switch to "mps" for apple devices
pipe = DiffusionPipeline.from_pretrained("Revanthraja/Text_to_Vision", dtype=torch.bfloat16, device_map="cuda")
prompt = "Astronaut in a jungle, cold color palette, muted colors, detailed, 8k"
image = pipe(prompt).images[0]

Text-to-Video Model with Hugging Face Transformers
This repository contains a text-to-video generation model fine-tuned with the Hugging Face Transformers library. It was trained for approximately 1000 steps on several datasets to generate video content from text prompts.
Overview
The model is built on Hugging Face Transformers and translates textual descriptions into corresponding video sequences. Fine-tuning on diverse datasets lets it interpret a wide range of prompts and generate relevant video content.
Features
- Transforms text input into corresponding video sequences
- Fine-tuned using Hugging Face Transformers with datasets spanning various domains
- Capable of generating diverse video content based on textual descriptions
- Handles nuanced textual prompts to generate meaningful video representations
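
The features above describe video output, while the loading snippet reads `.images[0]`. The following is a minimal sketch, not the author's documented usage, of how a caller might handle either case. It assumes the checkpoint loads through `DiffusionPipeline` and that its output exposes `.frames` (video) or `.images` (image); the helper names `first_sample` and `generate`, the output path, and the `fps` value are hypothetical.

```python
def first_sample(output):
    """Return (kind, sample) for the first generated item in a pipeline output.

    Assumption: video pipelines expose `.frames` and image pipelines expose
    `.images`; which one this checkpoint returns is not stated in the card.
    """
    for attr in ("frames", "images"):
        batch = getattr(output, attr, None)
        if batch is not None and len(batch) > 0:
            return attr, batch[0]
    raise ValueError("pipeline output has neither frames nor images")


def generate(prompt, device="cuda", out_path="output.mp4", fps=8):
    """Run the pipeline once and save the result (hypothetical helper)."""
    # Heavy imports stay local so first_sample() is usable without a GPU.
    import torch
    from diffusers import DiffusionPipeline
    from diffusers.utils import export_to_video

    pipe = DiffusionPipeline.from_pretrained(
        "Revanthraja/Text_to_Vision", torch_dtype=torch.bfloat16
    ).to(device)  # use "mps" on Apple devices

    kind, sample = first_sample(pipe(prompt))
    if kind == "frames":
        # Video case: write the frame sequence to an .mp4 file.
        return export_to_video(sample, out_path, fps=fps)
    # Image case: `sample` is assumed to be a PIL image.
    image_path = out_path.rsplit(".", 1)[0] + ".png"
    sample.save(image_path)
    return image_path
```

Calling `generate("Astronaut in a jungle, cold color palette, muted colors, detailed, 8k")` would return the path of the saved file. Checking for `.frames` first keeps the sketch usable whichever output type the checkpoint actually produces.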