---
title: Image to Voice
emoji: 🎤
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 6.2.0
app_file: app.py
pinned: false
---
# Image to Voice Converter
This Space turns an image into spoken audio: a Hugging Face image-to-text pipeline produces text from the image, and Supertonic TTS synthesizes speech from that text.
## How it works
1. Upload an image
2. The model extracts text from the image
3. The text is converted to speech using a text-to-speech model
4. Listen to the generated audio!
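The steps above can be sketched as a small function that chains the two models. This is a minimal illustration, not the Space's actual `app.py`: the names `image_to_voice`, `fake_captioner`, and `fake_synthesizer` are hypothetical, and the synthesizer signature (text in, sample rate and samples out) is an assumption, since the Supertonic TTS API is not described here. The captioner is assumed to return a list of dicts with a `generated_text` key, which is the output shape of the Transformers `image-to-text` pipeline.

```python
from typing import Any, Callable, Dict, List, Tuple

# Hypothetical type aliases for this sketch (not taken from the Space's code).
Captioner = Callable[[Any], List[Dict[str, str]]]
Synthesizer = Callable[[str], Tuple[int, List[float]]]


def image_to_voice(
    image: Any, captioner: Captioner, synthesizer: Synthesizer
) -> Tuple[str, Tuple[int, List[float]]]:
    """Run steps 2 and 3: image -> text -> (sample_rate, samples)."""
    results = captioner(image)
    # Take the first candidate and trim surrounding whitespace.
    text = results[0]["generated_text"].strip() if results else ""
    if not text:
        raise ValueError("No text could be extracted from the image.")
    sample_rate, samples = synthesizer(text)
    return text, (sample_rate, samples)


# Stand-ins so the sketch runs without downloading any model.
# In a real app the captioner could be, for example,
# transformers.pipeline("image-to-text"), and the synthesizer a thin
# wrapper around the TTS model (assumed interface).
def fake_captioner(image: Any) -> List[Dict[str, str]]:
    return [{"generated_text": " a dog on a beach "}]


def fake_synthesizer(text: str) -> Tuple[int, List[float]]:
    # One silent sample per character at 16 kHz, purely for illustration.
    return 16000, [0.0] * len(text)


text, (rate, samples) = image_to_voice("photo.png", fake_captioner, fake_synthesizer)
print(text, rate, len(samples))
```

Passing the models in as callables keeps the chaining logic independent of any particular checkpoint, so either stage can be swapped or stubbed out when testing.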
## Technologies Used
- **Hugging Face Transformers**: For image-to-text conversion
- **Supertonic TTS**: For text-to-speech synthesis
- **Gradio**: For the web interface