FastVLM WebGPU
Real-time video captioning powered by FastVLM
Wan: Open and Advanced Large-Scale Video Generative Models
The ultimate guide to training LLMs on large GPU clusters
Predict click location on a UI screenshot
Generate speech from text using a reference voice
Audio Conditioned LipSync with Latent Diffusion Models