title: Image to Voice
emoji: 🎤
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 6.2.0
app_file: app.py
pinned: false

Image to Voice Converter

This Space converts an uploaded image to text with the Hugging Face Transformers image-to-text pipeline, then converts that text to speech with Supertonic TTS.

How it works

  1. Upload an image.
  2. The image-to-text model produces text from the image.
  3. Supertonic TTS converts that text to speech.
  4. Listen to the generated audio.
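
The steps above can be sketched as two composable stages. This is a minimal illustration, not the Space's actual app.py: the function names `build_captioner` and `image_to_speech` are hypothetical, the model name is left as a parameter because this README does not say which model is used, and the text-to-speech stage is passed in as a plain callable because the Supertonic TTS API is not described here. It also assumes the installed Transformers version provides the `image-to-text` pipeline task.

```python
from typing import Any, Callable, Tuple

# Assumed shape of the TTS result: (sample_rate_hz, audio_samples).
Audio = Tuple[int, Any]


def build_captioner(model_name: str) -> Callable[[Any], str]:
    """Wrap a Transformers image-to-text pipeline so it returns a plain string.

    `model_name` is an assumption supplied by the caller; the README does not
    name the model the Space uses.
    """
    from transformers import pipeline  # imported lazily; needs `transformers`

    pipe = pipeline("image-to-text", model=model_name)

    def caption(image: Any) -> str:
        # The pipeline returns a list of dicts with a "generated_text" key.
        return pipe(image)[0]["generated_text"].strip()

    return caption


def image_to_speech(
    image: Any,
    image_to_text: Callable[[Any], str],
    text_to_speech: Callable[[str], Audio],
) -> Tuple[str, Audio]:
    """Run both stages: image -> text -> audio.

    `text_to_speech` is a hypothetical stand-in for the Supertonic TTS call.
    """
    text = image_to_text(image)
    if not text:
        # Nothing to speak; fail loudly instead of synthesizing silence.
        raise ValueError("No text was produced for this image.")
    return text, text_to_speech(text)
```

Keeping the speech stage as an injected callable means the flow can be exercised with stand-in functions before any model is downloaded, and a different TTS backend can be swapped in without touching the pipeline logic.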

Technologies Used

  • Hugging Face Transformers: For image-to-text conversion
  • Supertonic TTS: For text-to-speech synthesis
  • Gradio: For the web interface
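
As a rough sketch of how the Gradio layer might tie these pieces together, the example below wraps two stage functions in a single handler and exposes it through `gr.Interface`. The names `make_handler` and `build_demo` are hypothetical, both stage functions are placeholders supplied by the caller, and the audio value is assumed to be either a file path or a `(sample_rate, samples)` pair, which are forms `gr.Audio` accepts as output. The Space's real app.py may be organized differently.

```python
from typing import Any, Callable, Optional, Tuple


def make_handler(
    image_to_text: Callable[[Any], str],
    text_to_speech: Callable[[str], Any],
) -> Callable[[Optional[Any]], Tuple[str, Any]]:
    """Build the function Gradio calls for each submitted image.

    Both arguments are placeholders for the real model calls.
    """

    def handler(image: Optional[Any]) -> Tuple[str, Any]:
        if image is None:
            # No upload yet: return empty outputs instead of calling the models.
            return "", None
        text = image_to_text(image)
        return text, text_to_speech(text)

    return handler


def build_demo(handler: Callable[[Optional[Any]], Tuple[str, Any]]):
    """Create the web interface: one image input, text and audio outputs."""
    import gradio as gr  # imported lazily; needs `gradio`

    return gr.Interface(
        fn=handler,
        inputs=gr.Image(type="pil", label="Image"),
        outputs=[gr.Textbox(label="Text"), gr.Audio(label="Speech")],
        title="Image to Voice",
    )
```

In an app.py, this would typically end with something like `build_demo(make_handler(caption, speak)).launch()`, where `caption` and `speak` stand for the image-to-text and text-to-speech functions.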