metadata
title: Image to Voice
emoji: 🎤
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 6.2.0
app_file: app.py
pinned: false
Image to Voice Converter
This Space converts an image to text with the Hugging Face Transformers image-to-text pipeline, then converts that text to speech with Supertonic TTS.
How it works
- Upload an image
- An image-to-text model extracts text from the image
- The text is converted to speech using Supertonic TTS
- Listen to the generated audio!
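The steps above can be sketched as a small Python function. This is a minimal sketch, not the Space's actual app.py: the names `extract_text`, `image_to_voice`, and `load_captioner` are hypothetical, the model ID is left to the caller, and the `synthesize` callable is a stand-in for Supertonic TTS, whose API is not shown in this README. The only library behavior assumed is that a Transformers `image-to-text` pipeline returns a list of dicts with a `generated_text` key.

```python
from typing import Any, Callable, Dict, List, Tuple

# Type aliases for the two stages. Both are assumptions for illustration.
Captioner = Callable[[Any], List[Dict[str, str]]]
Synthesizer = Callable[[str], Any]


def load_captioner(model_id: str) -> Captioner:
    """Build an image-to-text pipeline for a caller-supplied model ID."""
    # Imported lazily so the rest of the sketch runs without transformers.
    from transformers import pipeline

    return pipeline("image-to-text", model=model_id)


def extract_text(image: Any, captioner: Captioner) -> str:
    """Run the image-to-text stage and return the first result as plain text."""
    outputs = captioner(image)
    if not outputs:
        return ""
    return outputs[0]["generated_text"].strip()


def image_to_voice(
    image: Any, captioner: Captioner, synthesize: Synthesizer
) -> Tuple[str, Any]:
    """Image -> text -> speech. Returns the text and whatever synthesize() returns."""
    if image is None:
        raise ValueError("No image provided")
    text = extract_text(image, captioner)
    if not text:
        raise ValueError("No text could be extracted from the image")
    # synthesize is a placeholder for the TTS call (Supertonic TTS in this Space).
    return text, synthesize(text)
```

Because both stages are passed in as callables, the flow can be exercised with stubs before any model is downloaded, and a real pipeline from `load_captioner` can be swapped in later.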
Technologies Used
- Hugging Face Transformers: For image-to-text conversion
- Supertonic TTS: For text-to-speech synthesis
- Gradio: For the web interface
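One way the web interface could tie these pieces together is sketched below. It is an illustration under assumptions, not the contents of this Space's app.py: `make_handler` and `build_demo` are hypothetical names, `convert` is any function that maps an image to a `(text, audio)` pair, and the component labels are invented. The Gradio calls used (`gr.Interface`, `gr.Image`, `gr.Textbox`, `gr.Audio`) are standard components.

```python
from typing import Any, Callable, Optional, Tuple


def make_handler(
    convert: Callable[[Any], Tuple[str, Any]]
) -> Callable[[Optional[Any]], Tuple[str, Any]]:
    """Wrap a convert(image) -> (text, audio) function for use as a Gradio callback."""

    def handler(image: Optional[Any]) -> Tuple[str, Any]:
        # Gradio passes None when the user submits without uploading an image.
        if image is None:
            return "Please upload an image first.", None
        return convert(image)

    return handler


def build_demo(convert: Callable[[Any], Tuple[str, Any]]):
    """Build a Gradio interface with one image input and text + audio outputs."""
    # Imported lazily so make_handler can be tested without Gradio installed.
    import gradio as gr

    return gr.Interface(
        fn=make_handler(convert),
        inputs=gr.Image(type="pil", label="Image"),
        outputs=[
            gr.Textbox(label="Extracted text"),
            gr.Audio(label="Generated speech"),
        ],
        title="Image to Voice Converter",
    )
```

In an app file, `build_demo(convert).launch()` would start the interface, where `convert` wraps the image-to-text and text-to-speech stages.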