
metadata
title: Translation AI Agent
emoji: 🌍
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 5.43.1
app_file: app.py
pinned: false
license: mit

🌍 Translation AI Agent

A real-time multilingual AI agent for speech-to-speech and text translation, powered by HuggingFace models for translation, speech recognition, and text-to-speech.

🎯 Features

  • 📝 Text Translation: Translate between 15+ languages instantly
  • 🎵 Speech-to-Text: Transcribe audio to text with OpenAI Whisper
  • 🔊 Text-to-Speech: Generate natural speech from translated text
  • 🔄 Speech-to-Speech: Complete audio translation pipeline (speech recognition, translation, then speech synthesis)
  • 🔴 Live Translation: Real-time microphone translation
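
The speech-to-speech feature chains three stages: speech recognition, translation, and speech synthesis. The sketch below shows only that composition. The function name `speech_to_speech` and the three callables are hypothetical and are not taken from `app.py`; the stages are passed in as plain functions so the flow can be tried without downloading any model.

```python
from typing import Callable


def speech_to_speech(
    audio_path: str,
    src_lang: str,
    tgt_lang: str,
    transcribe: Callable[[str], str],
    translate: Callable[[str, str, str], str],
    synthesize: Callable[[str], bytes],
) -> dict:
    """Chain speech-to-text, translation, and text-to-speech (hypothetical helper)."""
    source_text = transcribe(audio_path)                          # stage 1: audio -> text
    translated_text = translate(source_text, src_lang, tgt_lang)  # stage 2: text -> text
    audio = synthesize(translated_text)                           # stage 3: text -> audio
    return {
        "source_text": source_text,
        "translated_text": translated_text,
        "audio": audio,
    }


# Stand-in stages so the sketch runs without any model.
result = speech_to_speech(
    "hello.wav",
    "English",
    "Spanish",
    transcribe=lambda path: "hello",
    translate=lambda text, src, tgt: {"hello": "hola"}.get(text, text),
    synthesize=lambda text: text.encode("utf-8"),
)
```

In a real deployment each stand-in would presumably be replaced by a call into the corresponding model; returning the intermediate text alongside the audio makes it easier to see which stage caused a bad result.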

🤖 AI Models

  • Translation: facebook/nllb-200-distilled-600M (Meta NLLB)
  • Speech Recognition: openai/whisper-base (OpenAI Whisper)
  • Text-to-Speech: microsoft/speecht5_tts (Microsoft SpeechT5)
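
One possible way to load the three models listed above is through the Transformers `pipeline` factory. This is a minimal sketch, not the code in `app.py`: it assumes a `transformers` release that provides the `translation`, `automatic-speech-recognition`, and `text-to-speech` pipeline tasks, and the names `MODEL_IDS` and `get_pipeline` are hypothetical. Models are created lazily and cached so each one is downloaded and loaded at most once per process.

```python
from functools import lru_cache

# Model IDs from the list above, keyed by Transformers pipeline task name.
MODEL_IDS = {
    "translation": "facebook/nllb-200-distilled-600M",
    "automatic-speech-recognition": "openai/whisper-base",
    "text-to-speech": "microsoft/speecht5_tts",
}


@lru_cache(maxsize=None)
def get_pipeline(task: str):
    """Build the pipeline for one task on first use, then reuse it (hypothetical helper)."""
    # Imported lazily so this module can be imported without loading torch.
    from transformers import pipeline

    return pipeline(task, model=MODEL_IDS[task])


# Example usage (downloads the models on first call):
#   asr = get_pipeline("automatic-speech-recognition")
#   text = asr("sample.wav")["text"]
#   translator = get_pipeline("translation")
#   out = translator(text, src_lang="eng_Latn", tgt_lang="spa_Latn")
#   translated = out[0]["translation_text"]
```

SpeechT5 additionally expects a speaker embedding, passed to the text-to-speech pipeline through `forward_params={"speaker_embeddings": ...}`; that step is left out of the sketch.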

🌐 Supported Languages

English, Spanish, French, German, Italian, Portuguese, Russian, Japanese, Korean, Chinese, Arabic, Hindi, Vietnamese, Thai, Turkish
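
NLLB identifies languages by FLORES-200 codes rather than by name, so an app like this needs some mapping from the names above to those codes. The table below is a sketch of such a mapping, not the one used in `app.py`. It assumes Simplified Chinese (`zho_Hans`) and Modern Standard Arabic (`arb_Arab`), and the names `NLLB_CODES` and `to_nllb_code` are hypothetical.

```python
# Display name -> FLORES-200 code as used by NLLB tokenizers.
NLLB_CODES = {
    "English": "eng_Latn",
    "Spanish": "spa_Latn",
    "French": "fra_Latn",
    "German": "deu_Latn",
    "Italian": "ita_Latn",
    "Portuguese": "por_Latn",
    "Russian": "rus_Cyrl",
    "Japanese": "jpn_Jpan",
    "Korean": "kor_Hang",
    "Chinese": "zho_Hans",   # assumption: Simplified Chinese
    "Arabic": "arb_Arab",    # assumption: Modern Standard Arabic
    "Hindi": "hin_Deva",
    "Vietnamese": "vie_Latn",
    "Thai": "tha_Thai",
    "Turkish": "tur_Latn",
}


def to_nllb_code(language: str) -> str:
    """Look up the NLLB code for a display name, ignoring case and outer whitespace."""
    try:
        return NLLB_CODES[language.strip().title()]
    except KeyError:
        raise ValueError(f"Unsupported language: {language!r}") from None
```

The returned codes are what would be passed as `src_lang` and `tgt_lang` to the translation model.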

🚀 Usage

  1. Text Translation: Enter text and select languages
  2. Audio Translation: Upload audio file for translation
  3. Live Translation: Use microphone for real-time translation
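
The three usage modes above map naturally onto three tabs in a Gradio `Blocks` layout. The following is a rough sketch under that assumption, not the layout in `app.py`: `build_demo`, `translate_text`, and `translate_audio` are hypothetical names, the language list is shortened, and the live tab records a clip and translates it on a button press rather than streaming continuously.

```python
LANGUAGES = ["English", "Spanish", "French"]  # shortened for the sketch


def build_demo(translate_text, translate_audio):
    """Wire two callbacks into a three-tab UI (hypothetical layout).

    translate_text(text, src, tgt) -> str
    translate_audio(audio_path, tgt) -> path to a translated audio file
    """
    import gradio as gr  # imported lazily; requires the `gradio` package

    with gr.Blocks(title="Translation AI Agent") as demo:
        with gr.Tab("Text Translation"):
            text_in = gr.Textbox(label="Text to translate")
            text_src = gr.Dropdown(LANGUAGES, value="English", label="From")
            text_tgt = gr.Dropdown(LANGUAGES, value="Spanish", label="To")
            text_out = gr.Textbox(label="Translation")
            gr.Button("Translate").click(
                translate_text, inputs=[text_in, text_src, text_tgt], outputs=text_out
            )
        with gr.Tab("Audio Translation"):
            file_in = gr.Audio(sources=["upload"], type="filepath", label="Audio file")
            file_tgt = gr.Dropdown(LANGUAGES, value="Spanish", label="To")
            file_out = gr.Audio(label="Translated speech")
            gr.Button("Translate audio").click(
                translate_audio, inputs=[file_in, file_tgt], outputs=file_out
            )
        with gr.Tab("Live Translation"):
            mic_in = gr.Audio(sources=["microphone"], type="filepath", label="Microphone")
            mic_tgt = gr.Dropdown(LANGUAGES, value="Spanish", label="To")
            mic_out = gr.Audio(label="Translated speech")
            gr.Button("Translate recording").click(
                translate_audio, inputs=[mic_in, mic_tgt], outputs=mic_out
            )
    return demo


# Example: build_demo(my_translate_text, my_translate_audio).launch()
```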

📊 Performance

  • Translation Quality: BLEU score 25-35
  • Speech Recognition: WER < 10% for clear audio
  • Latency: < 2 seconds end-to-end (GPU)
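
For reference, word error rate (WER) is the word-level edit distance between the recognizer's output and a reference transcript, divided by the number of reference words. The function below is a minimal, dependency-free sketch of that definition. It is not the evaluation code behind the figure above, and its simple normalization (lowercasing and whitespace splitting) is an assumption.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words (row-by-row Levenshtein).
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# One word dropped from a six-word reference.
wer = word_error_rate("the cat sat on the mat", "the cat sat on mat")
```

A transcript meets the "WER < 10%" target when this value is below 0.10; because insertions count as errors, WER can exceed 1.0 for very poor output.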

Built with ❤️ using HuggingFace Transformers, Gradio, and PyTorch.