	---
	title: Translation AI Agent
	emoji: 🌍
	colorFrom: blue
	colorTo: purple
	sdk: gradio
	sdk_version: 5.43.1
	app_file: app.py
	pinned: false
	license: mit
	---

	# 🌍 Translation AI Agent

	Real-time multilingual speech-to-speech and text translation AI agent, built on pretrained models from the Hugging Face Hub.

	## 🎯 Features

	- 📝 Text Translation: Translate between the 15 supported languages
	- 🎵 Speech-to-Text: Transcribe audio to text with OpenAI Whisper
	- 🔊 Text-to-Speech: Generate natural speech from translated text
	- 🔄 Speech-to-Speech: Complete audio translation pipeline
	- 🔴 Live Translation: Real-time microphone translation
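The speech-to-speech feature chains the three stages listed above: speech recognition, translation, and speech synthesis. A minimal sketch of that composition, assuming each stage is available as a plain callable. The function and parameter names are hypothetical and are not taken from `app.py`:

```python
from typing import Any, Callable, Tuple


def speech_to_speech(
    audio: Any,
    transcribe: Callable[[Any], str],
    translate: Callable[[str], str],
    synthesize: Callable[[str], Any],
) -> Tuple[str, str, Any]:
    """Run speech-to-text, translation, then text-to-speech, in that order."""
    source_text = transcribe(audio).strip()
    if not source_text:
        # Nothing was recognized, so skip translation and synthesis.
        return "", "", None
    translated_text = translate(source_text)
    return source_text, translated_text, synthesize(translated_text)
```

Passing the stages in as callables keeps the chain independent of any particular model, so each stage can be swapped or stubbed out for testing.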

	## 🤖 AI Models

	- Translation: `facebook/nllb-200-distilled-600M` (Meta NLLB)
	- Speech Recognition: `openai/whisper-base` (OpenAI Whisper)
	- Text-to-Speech: `microsoft/speecht5_tts` (Microsoft SpeechT5)
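One possible way to load these three checkpoints is through the generic `pipeline` factory in Transformers. This is a sketch under that assumption, not necessarily how `app.py` loads them; `MODEL_IDS` and `load_pipelines` are illustrative names:

```python
MODEL_IDS = {
    "translation": "facebook/nllb-200-distilled-600M",
    "speech_recognition": "openai/whisper-base",
    "text_to_speech": "microsoft/speecht5_tts",
}


def load_pipelines(device: int = -1) -> dict:
    """Build one Transformers pipeline per model; device=-1 selects the CPU."""
    # Imported lazily so this module can be imported without transformers installed.
    from transformers import pipeline

    return {
        "translation": pipeline(
            "translation", model=MODEL_IDS["translation"], device=device
        ),
        "speech_recognition": pipeline(
            "automatic-speech-recognition",
            model=MODEL_IDS["speech_recognition"],
            device=device,
        ),
        "text_to_speech": pipeline(
            "text-to-speech", model=MODEL_IDS["text_to_speech"], device=device
        ),
    }
```

With NLLB, the translation pipeline needs explicit source and target language codes, for example `pipes["translation"]("Hello", src_lang="eng_Latn", tgt_lang="vie_Latn")`. SpeechT5 additionally expects a speaker embedding, typically passed through the pipeline's `forward_params` argument.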

	## 🌐 Supported Languages

	English, Spanish, French, German, Italian, Portuguese, Russian, Japanese, Korean, Chinese, Arabic, Hindi, Vietnamese, Thai, Turkish
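NLLB identifies languages by FLORES-200 codes rather than by name, so an app like this needs some mapping from the names above to those codes. The table below is a sketch of such a mapping, not a copy of the one in `app.py`. Two entries are assumptions: Chinese is mapped to Simplified script (`zho_Hans`) and Arabic to Modern Standard Arabic (`arb_Arab`). The helper name `to_nllb_code` is hypothetical.

```python
# Display name -> FLORES-200 code used by NLLB.
NLLB_CODES = {
    "English": "eng_Latn",
    "Spanish": "spa_Latn",
    "French": "fra_Latn",
    "German": "deu_Latn",
    "Italian": "ita_Latn",
    "Portuguese": "por_Latn",
    "Russian": "rus_Cyrl",
    "Japanese": "jpn_Jpan",
    "Korean": "kor_Hang",
    "Chinese": "zho_Hans",  # assumption: Simplified Chinese
    "Arabic": "arb_Arab",   # assumption: Modern Standard Arabic
    "Hindi": "hin_Deva",
    "Vietnamese": "vie_Latn",
    "Thai": "tha_Thai",
    "Turkish": "tur_Latn",
}


def to_nllb_code(language: str) -> str:
    """Look up the NLLB code for a language name, ignoring case and padding."""
    try:
        return NLLB_CODES[language.strip().title()]
    except KeyError:
        raise ValueError(f"Unsupported language: {language!r}") from None
```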

	## 🚀 Usage

	1. Text Translation: Enter text and select the source and target languages
	2. Audio Translation: Upload an audio file for translation
	3. Live Translation: Use the microphone for real-time translation

	## 📊 Performance

	- Translation Quality: BLEU score of 25-35 (varies by language pair)
	- Speech Recognition: WER < 10% for clear audio
	- Latency: < 2 seconds end-to-end (GPU)
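For reference, word error rate (WER) is the word-level edit distance between a reference transcript and the recognizer's output, divided by the number of reference words. The function below is a minimal sketch of that standard definition. It is not the evaluation script behind the figure above, and the lowercase-and-split normalization is an assumption.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(
                min(
                    prev[j] + 1,         # deletion
                    curr[j - 1] + 1,     # insertion
                    prev[j - 1] + cost,  # substitution or match
                )
            )
        prev = curr
    return prev[-1] / len(ref)
```

For example, `word_error_rate("the cat sat on the mat", "the cat sat on mat")` counts one deleted word out of six, giving a WER of about 0.167.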

	Built with ❤️ using Hugging Face Transformers, Gradio, and PyTorch.