HierSpeech++ (Zero-shot TTS)
Generate high-quality speech from text using an audio prompt
Generate detailed prompts from any image
Translate speech and text between languages
Compare two faces to verify identity
Generate speech in a cloned voice from a short audio clip
Transcribe and translate audio into text
Replace objects in images using prompts or reference images
Combine voice cloning and portrait lipsync animation
Generate live visual descriptions from your camera
Create your own AI comic with a single prompt
Generate text using the Phi language model
Remove image backgrounds in the browser
Generate an audio environment from an image
Enhance images with custom text instructions
Get a music sample inspired by the mood of an image
Detect objects in images or videos
Transcribe audio files with timestamps and downloadable subtitles