Spaces: ducnguyen1978/Voice_Agent

ducnguyen1978 committed on Aug 20, 2025

Commit 3639861 (verified) · 1 Parent(s): 9c98b62

Upload 2 files

Files changed (2)

IFRAME_GUIDE.md +119 -0
app.py +1626 -0

IFRAME_GUIDE.md ADDED

	@@ -0,0 +1,119 @@

+# 🎤 Voice Studio - iFrame Integration Guide
+## Microphone Issues in an iFrame
+When the Voice Studio app is embedded in an iframe, the microphone may not work because of the browser's security policy.
+## ✅ Solutions for Your Website
+### 1. Add a Permissions Policy to the iframe tag:
+```html
+<iframe
+    src="http://localhost:7864"
+    width="100%"
+    height="800"
+    allow="microphone *; camera *; display-capture *; autoplay *"
+    sandbox="allow-same-origin allow-scripts allow-forms allow-popups allow-modals allow-presentation"
+    frameborder="0">
+</iframe>
+```
+### 2. Configure HTTPS (Recommended):
+Browsers only expose the microphone (`getUserMedia`) in a secure context, that is, over HTTPS or on `localhost`. If possible, deploy the app over HTTPS.
+### 3. Alternative HTML for the iframe:
+```html
+<!DOCTYPE html>
+<html>
+<head>
+    <meta charset="UTF-8">
+    <meta name="viewport" content="width=device-width, initial-scale=1.0">
+    <title>Voice Studio Demo</title>
+</head>
+<body style="margin: 0; padding: 0; overflow: hidden;">
+    <iframe
+        id="voice-studio"
+        src="http://localhost:7864"
+        width="100%"
+        allow="microphone *; camera *; display-capture *; autoplay *"
+        sandbox="allow-same-origin allow-scripts allow-forms allow-popups allow-modals"
+        frameborder="0"
+        style="border: none; height: 100vh;">
+    </iframe>
+    <script>
+    // Handle microphone permission fallback
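+    window.addEventListener('message', function(event) {
+        // Only trust messages sent by the embedded app (same origin as the iframe src)
+        if (event.origin !== 'http://localhost:7864') return;
+        if (event.data && event.data.type === 'microphone_error') {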
+            const fallbackDiv = document.createElement('div');
+            fallbackDiv.innerHTML = `
+                <div style="
+                    position: fixed;
+                    top: 20px;
+                    right: 20px;
+                    background: #ff6b6b;
+                    color: white;
+                    padding: 15px;
+                    border-radius: 8px;
+                    z-index: 9999;
+                    box-shadow: 0 4px 12px rgba(0,0,0,0.2);
+                ">
+                    <strong>🎤 Microphone Blocked</strong><br>
+                    <button onclick="window.open('${event.data.url}', '_blank')"
+                            style="margin-top: 10px; padding: 8px 16px; background: white; color: #ff6b6b; border: none; border-radius: 4px; cursor: pointer;">
+                        Open in New Window
+                    </button>
+                </div>
+            `;
+            document.body.appendChild(fallbackDiv);
+        }
+    });
+    </script>
+</body>
+</html>
+```
+## 🔧 Server Configuration
+If you host the app on your own server, add the following headers:
+```python
+# In Flask/Django
+response.headers['Permissions-Policy'] = 'microphone=*, camera=*, display-capture=*'
+response.headers['Feature-Policy'] = 'microphone \'self\' *; camera \'self\' *'
+# X-Frame-Options has no "allow all" value; use CSP frame-ancestors to permit embedding
+response.headers['Content-Security-Policy'] = 'frame-ancestors *'
+# Or in nginx.conf
+add_header Permissions-Policy "microphone=*, camera=*, display-capture=*";
+add_header Feature-Policy "microphone 'self' *; camera 'self' *";
+add_header Content-Security-Policy "frame-ancestors *";
+```
+## 📱 Mobile Considerations
+On mobile, microphone access inside an iframe may be restricted even further. Recommendations:
+1. Show a clearly visible "Open in New Window" button
+2. Detect mobile devices and show an appropriate message
+3. Fall back to file upload if the microphone does not work
+## 🚀 Best Practices
+1. **HTTPS Required**: Deploy over HTTPS so the microphone works reliably
+2. **User Feedback**: Always show a clear message when the microphone is blocked
+3. **Fallback Options**: Offer file upload as an alternative
+4. **Test Cross-Browser**: Test on Chrome, Firefox, Safari, and Edge
+## 🔍 Debugging
+Open the Developer Console to view the logs:
+- `Running in iframe - requesting microphone permissions`
+- `Microphone access granted/denied`
+If problems persist, users can click "🔗 Open in New Window" to use the full feature set.
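
The header setup from the guide's Server Configuration section can also be sketched as a small framework-independent helper. This is a minimal sketch under assumptions: `embedding_headers` and its `allowed_parents` parameter are hypothetical names that do not appear in this commit, and the sketch uses the CSP `frame-ancestors` directive to control who may embed the app, since `X-Frame-Options` only accepts `DENY` and `SAMEORIGIN`.

```python
def embedding_headers(allowed_parents=()):
    """Build response headers for an app that is meant to run inside an iframe.

    allowed_parents: origins allowed to embed the app (hypothetical parameter);
    an empty value means any site may embed it.
    """
    features = ("microphone", "camera", "display-capture")
    # frame-ancestors lists the sites that may frame this document
    ancestors = " ".join(allowed_parents) if allowed_parents else "*"
    return {
        # Let the app's own origin use each media feature
        "Permissions-Policy": ", ".join(f"{name}=(self)" for name in features),
        "Content-Security-Policy": f"frame-ancestors {ancestors}",
    }


if __name__ == "__main__":
    for key, value in embedding_headers(["https://example.com"]).items():
        print(f"{key}: {value}")
```

The returned dict can be copied onto the response object of whichever framework serves the app. The embedding page still needs the `allow="microphone ..."` attribute on its iframe tag, because the parent decides whether the feature is delegated to the frame at all.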

app.py ADDED

	@@ -0,0 +1,1626 @@

+#!/usr/bin/env python3
+# -*- coding: utf-8 -*-
+import os
+import sys
+# Set UTF-8 encoding for Windows
+if sys.platform == 'win32':
+    import codecs
+    sys.stdout = codecs.getwriter('utf-8')(sys.stdout.detach())
+    sys.stderr = codecs.getwriter('utf-8')(sys.stderr.detach())
+import numpy as np
+import gradio as gr
+import google.generativeai as genai
+from gtts import gTTS, lang
+import tempfile
+import soundfile as sf
+# Kokoro not used - removed for performance
+import time
+import base64
+import edge_tts
+import asyncio
+import io
+# Librosa not used - removed for performance
+# Try to import python-docx for Word export
+try:
+    from docx import Document
+    DOCX_AVAILABLE = True
+except ImportError:
+    DOCX_AVAILABLE = False
+# Configure Gemini API
+GEMINI_API_KEY = os.getenv("GEMINI_API_KEY")
+if GEMINI_API_KEY:
+    genai.configure(api_key=GEMINI_API_KEY)
+# Language configurations for Audio Translation (simplified)
+GTTS_LANGUAGES = lang.tts_langs()
+GTTS_LANGUAGES['ja'] = 'Japanese'
+SUPPORTED_LANGUAGES = sorted(list(GTTS_LANGUAGES.values()))
+# Voice mapping for Edge TTS - defined once for performance
+VOICE_MAP = {
+    "🇻🇳 HoaiMy - Nữ Việt Chuẩn": "vi-VN-HoaiMyNeural",
+    "🇻🇳 NamMinh - Nam Việt Chuẩn": "vi-VN-NamMinhNeural",
+    "🇺🇸 Aria - Nữ Mỹ": "en-US-AriaNeural",
+    "🇺🇸 Guy - Nam Mỹ": "en-US-GuyNeural",
+    "🇬🇧 Sonia - Nữ Anh": "en-GB-SoniaNeural",
+    "🇬🇧 Ryan - Nam Anh": "en-GB-RyanNeural",
+    "🇩🇪 Katja - Deutsche Frau": "de-DE-KatjaNeural",
+    "🇩🇪 Conrad - Deutscher Mann": "de-DE-ConradNeural",
+    "🇫🇷 Denise - Française": "fr-FR-DeniseNeural",
+    "🇫🇷 Henri - Français": "fr-FR-HenriNeural",
+    "🇪🇸 Elvira - Española": "es-ES-ElviraNeural",
+    "🇪🇸 Alvaro - Español": "es-ES-AlvaroNeural",
+    "🇮🇹 Elsa - Italiana": "it-IT-ElsaNeural",
+    "🇮🇹 Diego - Italiano": "it-IT-DiegoNeural",
+    "🇯🇵 Nanami - 日本女性": "ja-JP-NanamiNeural",
+    "🇯🇵 Keita - 日本男性": "ja-JP-KeitaNeural",
+    "🇰🇷 SunHi - 한국 여성": "ko-KR-SunHiNeural",
+    "🇰🇷 BongJin - 한국 남성": "ko-KR-BongJinNeural",
+    "🇨🇳 Xiaoxiao - 中文女声": "zh-CN-XiaoxiaoNeural",
+    "🇨🇳 Yunxi - 中文男声": "zh-CN-YunxiNeural",
+    "🇷🇺 Svetlana - Русская": "ru-RU-SvetlanaNeural",
+    "🇷🇺 Dmitry - Русский": "ru-RU-DmitryNeural",
+    "🇵🇹 Francisca - Portuguesa": "pt-BR-FranciscaNeural",
+    "🇵🇹 Antonio - Português": "pt-BR-AntonioNeural",
+    "🇸🇦 Zariyah - عربية": "ar-SA-ZariyahNeural",
+    "🇸🇦 Hamed - عربي": "ar-SA-HamedNeural"
+}
+def detect_language(text):
+    """Detect language of input text (heuristic: Vietnamese, German, else English)"""
+    if not text.strip():
+        return "unknown"
+    text_lower = text.lower()
+    # Split into whole words so that e.g. 'der' does not match inside 'under'
+    words = {w.strip('.,!?;:"\'()[]') for w in text_lower.split()}
+    # Vietnamese detection
+    vietnamese_chars = 'àáạảãâầấậẩẫăằắặẳẵèéẹẻẽêềếệểễìíịỉĩòóọỏõôồốộổỗơờớợởỡùúụủũưừứựửữỳýỵỷỹđ'
+    if any(char in text_lower for char in vietnamese_chars):
+        return "vietnamese"
+    # German detection
+    german_words = {'der', 'die', 'das', 'und', 'ist', 'ich', 'bin', 'haben', 'sein', 'werden'}
+    german_chars = 'äöüß'
+    if words & german_words or any(char in text_lower for char in german_chars):
+        return "german"
+    # Everything else falls back to English
+    return "english"
+async def generate_speech(text, voice_name, rate):
+    """Generate speech using Edge TTS"""
+    communicate = edge_tts.Communicate(text, voice_name, rate=f"{rate:+.0%}")
+    # Create in-memory buffer
+    audio_buffer = io.BytesIO()
+    async for chunk in communicate.stream():
+        if chunk["type"] == "audio":
+            audio_buffer.write(chunk["data"])
+    audio_buffer.seek(0)
+    return audio_buffer.getvalue()
+def create_text_file(content, file_format="txt", filename_prefix="translated_text"):
+    """
+    Create a downloadable text file from content in TXT or DOCX format
+    """
+    if not content or content.startswith("Lỗi:"):
+        return None
+    try:
+        if file_format.lower() == "docx" and DOCX_AVAILABLE:
+            # Create Word document
+            fd, temp_file_path = tempfile.mkstemp(suffix=".docx", prefix=f"{filename_prefix}_")
+            os.close(fd)
+            doc = Document()
+            doc.add_heading('Nội dung đã dịch', 0)
+            doc.add_paragraph(content)
+            doc.save(temp_file_path)
+            return temp_file_path
+        else:
+            # Create TXT file (default)
+            fd, temp_file_path = tempfile.mkstemp(suffix=".txt", prefix=f"{filename_prefix}_")
+            os.close(fd)
+            with open(temp_file_path, 'w', encoding='utf-8') as f:
+                f.write(content)
+            return temp_file_path
+    except Exception as e:
+        return None
+def create_audio_voice_studio(text, voice_selection, speed):
+    """Voice Studio functionality"""
+    if not text.strip():
+        return "❌ Vui lòng nhập văn bản / Please enter text / Bitte Text eingeben"
+    try:
+        # Use global VOICE_MAP for performance (avoiding recreation on each call)
+        voice_name = VOICE_MAP.get(voice_selection, "vi-VN-HoaiMyNeural")
+        text_limited = text[:1000]  # slicing already handles shorter input
+        # Convert speed (0.5-2.0) to a rate fraction (-0.5 to +1.0); generate_speech formats it as a percentage
+        rate_percent = speed - 1.0
+        # Generate speech using Edge TTS
+        audio_data = asyncio.run(generate_speech(text_limited, voice_name, rate_percent))
+        # Convert to base64
+        audio_base64 = base64.b64encode(audio_data).decode('utf-8')
+        timestamp = int(time.time())
+        filename = f"voice_{voice_name}_{speed}x_{timestamp}.mp3"
+        # Detect language
+        detected_lang = detect_language(text_limited)
+        # Mobile-optimized HTML player
+        html_player = f'''
+            <div style="
+                background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
+                border-radius: 20px;
+                padding: 20px;
+                margin: 10px 0;
+                box-shadow: 0 8px 32px rgba(0,0,0,0.2);
+                color: white;
+                text-align: center;
+            ">
+                <div style="margin-bottom: 20px;">
+                    <h3 style="color: #fff; margin: 0 0 15px 0; font-size: 1.3em; text-shadow: 1px 1px 2px rgba(0,0,0,0.3);">
+                        🎵 Âm thanh hoàn thành!
+                    </h3>
+                    <div style="
+                        background: rgba(255,255,255,0.2);
+                        border-radius: 12px;
+                        padding: 12px;
+                        font-size: 0.9em;
+                        line-height: 1.5;
+                        backdrop-filter: blur(10px);
+                    ">
+                        <div><strong>🎭 Giọng:</strong> {voice_selection}</div>
+                        <div><strong>⚡ Tốc độ:</strong> {speed:.1f}x | <strong>🌍 Ngôn ngữ:</strong> {detected_lang.title()}</div>
+                        <div><strong>📝 Độ dài:</strong> {len(text_limited)} ký tự</div>
+                    </div>
+                </div>
+                <audio controls style="
+                    width: 100%;
+                    max-width: 100%;
+                    height: 50px;
+                    margin: 20px 0;
+                    border-radius: 25px;
+                    background: rgba(255,255,255,0.95);
+                    box-shadow: 0 4px 15px rgba(0,0,0,0.2);
+                ">
+                    <source src="data:audio/mpeg;base64,{audio_base64}" type="audio/mpeg">
+                    Trình duyệt không hỗ trợ audio.
+                </audio>
+                <div style="
+                    display: flex;
+                    justify-content: center;
+                    margin-top: 20px;
+                ">
+                    <a href="data:audio/mpeg;base64,{audio_base64}" download="{filename}"
+                       style="
+                           background: linear-gradient(45deg, #28a745, #20c997);
+                           color: white;
+                           padding: 15px 30px;
+                           text-decoration: none;
+                           border-radius: 25px;
+                           font-weight: 700;
+                           font-size: 1.1em;
+                           display: flex;
+                           align-items: center;
+                           justify-content: center;
+                           box-shadow: 0 4px 15px rgba(40,167,69,0.3);
+                           transition: all 0.3s ease;
+                           min-height: 48px;
+                           min-width: 200px;
+                       "
+                       ontouchstart=""
+                       onmouseover="this.style.transform='translateY(-2px)'; this.style.boxShadow='0 6px 20px rgba(40,167,69,0.4)'"
+                       onmouseout="this.style.transform='translateY(0)'; this.style.boxShadow='0 4px 15px rgba(40,167,69,0.3)'">
+                        📥 TẢI XUỐNG MP3
+                    </a>
+                </div>
+            </div>
+            '''
+        return html_player
+    except Exception as e:
+        return f"❌ Error: {str(e)}"
+# Language mapping for voices - defined once for performance
+VOICE_TO_LANGUAGE = {
+    # Vietnamese
+    "🇻🇳 HoaiMy - Nữ Việt Chuẩn": "Vietnamese",
+    "🇻🇳 NamMinh - Nam Việt Chuẩn": "Vietnamese",
+    # English
+    "🇺🇸 Aria - Nữ Mỹ": "English",
+    "🇺🇸 Guy - Nam Mỹ": "English",
+    "🇬🇧 Sonia - Nữ Anh": "English",
+    "🇬🇧 Ryan - Nam Anh": "English",
+    # German
+    "🇩🇪 Katja - Deutsche Frau": "German",
+    "🇩🇪 Conrad - Deutscher Mann": "German",
+    # French
+    "🇫🇷 Denise - Française": "French",
+    "🇫🇷 Henri - Français": "French",
+    # Spanish
+    "🇪🇸 Elvira - Española": "Spanish",
+    "🇪🇸 Alvaro - Español": "Spanish",
+    # Italian
+    "🇮🇹 Elsa - Italiana": "Italian",
+    "🇮🇹 Diego - Italiano": "Italian",
+    # Japanese
+    "🇯🇵 Nanami - 日本女性": "Japanese",
+    "🇯🇵 Keita - 日本男性": "Japanese",
+    # Korean
+    "🇰🇷 SunHi - 한국 여성": "Korean",
+    "🇰🇷 BongJin - 한국 남성": "Korean",
+    # Chinese
+    "🇨🇳 Xiaoxiao - 中文女声": "Chinese",
+    "🇨🇳 Yunxi - 中文男声": "Chinese",
+    # Russian
+    "🇷🇺 Svetlana - Русская": "Russian",
+    "🇷🇺 Dmitry - Русский": "Russian",
+    # Portuguese
+    "🇵🇹 Francisca - Portuguesa": "Portuguese",
+    "🇵🇹 Antonio - Português": "Portuguese",
+    # Arabic
+    "🇸🇦 Zariyah - عربية": "Arabic",
+    "🇸🇦 Hamed - عربي": "Arabic"
+}
+def get_target_language_from_voice(voice_selection):
+    """Map voice selection to target language for translation"""
+    return VOICE_TO_LANGUAGE.get(voice_selection, "Vietnamese")
+def translate_text_with_gemini(text, target_language):
+    """Translate text using Gemini API"""
+    try:
+        if not GEMINI_API_KEY:
+            return "Lỗi: Cần cấu hình GEMINI_API_KEY"
+        if not text.strip():
+            return ""
+        model = genai.GenerativeModel("gemini-2.0-flash")
+        prompt = f"""Translate the following text to {target_language}. Return ONLY the translated text, nothing else:
+{text}"""
+        response = model.generate_content(prompt)
+        translated_text = response.text.strip()
+        # Clean up any unwanted text that might be included
+        if translated_text.lower().startswith("translation:"):
+            translated_text = translated_text[12:].strip()
+        if translated_text.lower().startswith("here is"):
+            lines = translated_text.split('\n')
+            if len(lines) > 1:
+                translated_text = '\n'.join(lines[1:]).strip()
+        return translated_text
+    except Exception as e:
+        return f"Lỗi dịch thuật: {str(e)}"
+def translate_audio(audio_file, target_country, voice_selection, text_format="txt"):
+    """
+    Transcribe, translate and synthesize audio to target language with Voice Studio integration
+    """
+    try:
+        if not GEMINI_API_KEY:
+            return "Lỗi: Cần cấu hình GEMINI_API_KEY", "Không xác định", "", target_country, None, "", "", None
+        if audio_file is None:
+            return "Lỗi: Vui lòng tải lên file audio", "Không xác định", "", target_country, None, "", "", None
+        # Get target language from voice selection
+        target_language = get_target_language_from_voice(voice_selection)
+        # Transcribe audio using Gemini
+        model = genai.GenerativeModel("gemini-2.0-flash")
+        # Read audio file
+        with open(audio_file, 'rb') as f:
+            audio_data = f.read()
+        # Create audio blob
+        audio_blob = {
+            'mime_type': 'audio/wav',
+            'data': audio_data
+        }
+        # Single API call for transcription and translation (optimized for speed)
+        combined_prompt = f"""You are a professional transcriber and translator. Process this audio in one step:
+1. Transcribe the audio accurately in its original language
+2. Identify the source language
+3. Translate to {target_language} preserving meaning and cultural context
+Format your response exactly as:
+LANGUAGE: [detected language]
+TRANSCRIPT: [original transcription]
+TRANSLATION: [translation to {target_language}]"""
+        response = model.generate_content([combined_prompt, audio_blob])
+        full_response = response.text.strip()
+        # Parse combined response. A field may span several lines, so split on the
+        # markers instead of reading one line per field.
+        detected_lang = "Không xác định"
+        transcription = full_response
+        translated_text = full_response
+        head, has_transcript, rest = full_response.partition("TRANSCRIPT:")
+        if has_transcript:
+            if "LANGUAGE:" in head:
+                detected_lang = head.split("LANGUAGE:", 1)[1].strip() or detected_lang
+            transcription, has_translation, tail = rest.partition("TRANSLATION:")
+            transcription = transcription.strip()
+            translated_text = tail.strip() if has_translation else transcription
+        elif "TRANSLATION:" in full_response:
+            # Fallback: only the translation marker is present
+            translated_text = full_response.split("TRANSLATION:", 1)[1].strip()
+        # Generate audio using Edge TTS (use global VOICE_MAP for performance)
+        edge_voice = VOICE_MAP.get(voice_selection, "vi-VN-HoaiMyNeural")
+        audio_data = asyncio.run(generate_speech(translated_text, edge_voice, 0.0))
+        # Save audio file (Edge TTS streams MP3 data, so use a matching extension)
+        fd, temp_output_path = tempfile.mkstemp(suffix=".mp3", prefix="translated_audio_")
+        os.close(fd)
+        # Write raw audio data to temporary file
+        with open(temp_output_path, 'wb') as f:
+            f.write(audio_data)
+        # Create text file for download
+        text_file_path = create_text_file(translated_text, text_format, "translated_content")
+        return transcription, detected_lang, translated_text, target_language, temp_output_path, transcription, translated_text, text_file_path
+    except Exception as e:
+        # Get target language for error response (voice_selection is a parameter, so it is always defined)
+        target_language = get_target_language_from_voice(voice_selection)
+        return f"Lỗi: {str(e)}", "Lỗi", "", target_language, None, "", "", None
+# Voice choices organized by country - ONLY OFFICIAL VOICES
+voice_choices_by_country = {
+    "🇻🇳 Việt Nam": [
+        "🇻🇳 HoaiMy - Nữ Việt Chuẩn",
+        "🇻🇳 NamMinh - Nam Việt Chuẩn"
+    ],
+    "🇺🇸 Hoa Kỳ": [
+        "🇺🇸 Aria - Nữ Mỹ",
+        "🇺🇸 Guy - Nam Mỹ"
+    ],
+    "🇬🇧 Anh": [
+        "🇬🇧 Sonia - Nữ Anh",
+        "🇬🇧 Ryan - Nam Anh"
+    ],
+    "🇩🇪 Đức": [
+        "🇩🇪 Katja - Deutsche Frau",
+        "🇩🇪 Conrad - Deutscher Mann"
+    ],
+    "🇫🇷 Pháp": [
+        "🇫🇷 Denise - Française",
+        "🇫🇷 Henri - Français"
+    ],
+    "🇪🇸 Tây Ban Nha": [
+        "🇪🇸 Elvira - Española",
+        "🇪🇸 Alvaro - Español"
+    ],
+    "🇮🇹 Ý": [
+        "🇮🇹 Elsa - Italiana",
+        "🇮🇹 Diego - Italiano"
+    ],
+    "🇯🇵 Nhật Bản": [
+        "🇯🇵 Nanami - 日本女性",
+        "🇯🇵 Keita - 日本男性"
+    ],
+    "🇰🇷 Hàn Quốc": [
+        "🇰🇷 SunHi - 한국 여성",
+        "🇰🇷 BongJin - 한국 남성"
+    ],
+    "🇨🇳 Trung Quốc": [
+        "🇨🇳 Xiaoxiao - 中文女声",
+        "🇨🇳 Yunxi - 中文男声"
+    ],
+    "🇷🇺 Nga": [
+        "🇷🇺 Svetlana - Русская",
+        "🇷🇺 Dmitry - Русский"
+    ],
+    "🇵🇹 Bồ Đào Nha": [
+        "🇵🇹 Francisca - Portuguesa",
+        "🇵🇹 Antonio - Português"
+    ],
+    "🇸🇦 Ả Rập": [
+        "🇸🇦 Zariyah - عربية",
+        "🇸🇦 Hamed - عربي"
+    ]
+}
+def update_voices(country):
+    """Update voice choices based on selected country"""
+    if country in voice_choices_by_country:
+        voices = voice_choices_by_country[country]
+        return gr.Dropdown(choices=voices, value=voices[0])
+    else:
+        # Default to Vietnamese voices
+        default_voices = voice_choices_by_country["🇻🇳 Việt Nam"]
+        return gr.Dropdown(choices=default_voices, value=default_voices[0])
+# Lightweight CSS - optimized for performance
+css = """
+* {
+    font-family: system-ui, -apple-system, 'Segoe UI', Arial, sans-serif;
+}
+.gradio-container {
+    max-width: 1200px;
+    margin: 0 auto;
+    position: relative;
+}
+/* Critical fix for dropdown interaction */
+.gradio-container * {
+    pointer-events: auto;
+}
+/* Hide Gradio footer */
+.footer {
+    display: none !important;
+}
+/* Custom footer to cover Gradio attribution */
+.custom-footer {
+    position: fixed;
+    bottom: 0;
+    left: 0;
+    right: 0;
+    background: linear-gradient(135deg, #4A90E2 0%, #2E86AB 70%, #FF8A65 85%, #FF6B9D 100%);
+    color: white;
+    padding: 15px;
+    text-align: center;
+    font-weight: bold;
+    z-index: 1000;
+    box-shadow: 0 -2px 10px rgba(0,0,0,0.1);
+}
+/* Add padding to body to account for fixed footer */
+body {
+    padding-bottom: 60px;
+}
+/* Mobile-first responsive design */
+.input-card {
+    background: rgba(255,255,255,0.95);
+    border-radius: 16px;
+    padding: 16px;
+    margin: 10px 0;
+    box-shadow: 0 4px 20px rgba(0,0,0,0.1);
+    backdrop-filter: blur(10px);
+}
+.output-area {
+    background: rgba(255,255,255,0.95);
+    border-radius: 16px;
+    padding: 16px;
+    margin: 15px 0;
+    min-height: 200px;
+    box-shadow: 0 4px 20px rgba(0,0,0,0.1);
+}
+.examples-section {
+    background: rgba(255,255,255,0.9);
+    border-radius: 16px;
+    padding: 16px;
+    margin: 20px 0;
+}
+.main-header {
+    background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
+    color: white;
+    padding: 20px;
+    border-radius: 10px;
+    margin-bottom: 20px;
+    text-align: center;
+}
+.feature-box {
+    background: #f8f9fa;
+    padding: 15px;
+    border-radius: 8px;
+    margin: 10px 0;
+    border-left: 4px solid #667eea;
+}
+.status-indicator {
+    display: inline-block;
+    padding: 5px 10px;
+    border-radius: 15px;
+    font-size: 12px;
+    font-weight: bold;
+    margin: 5px;
+}
+.status-success {
+    background-color: #d4edda;
+    color: #155724;
+}
+.status-processing {
+    background-color: #fff3cd;
+    color: #856404;
+}
+.comparison-section {
+    border: 1px solid #e0e0e0;
+    border-radius: 8px;
+    padding: 15px;
+    margin: 10px 0;
+    background: #fafafa;
+}
+.language-label {
+    font-weight: bold;
+    color: #667eea;
+    padding: 5px 10px;
+    background: #f0f2ff;
+    border-radius: 15px;
+    display: inline-block;
+    margin-bottom: 10px;
+    font-size: 14px;
+}
+.content-compare {
+    background: white;
+    border: 1px solid #ddd;
+    border-radius: 6px;
+    padding: 12px;
+    min-height: 120px;
+    font-family: 'Segoe UI', Tahoma, Geneva, Verdana, sans-serif;
+    line-height: 1.5;
+}
+/* Remove any potential blocking overlays */
+.gradio-container::before,
+.gradio-container::after {
+    display: none;
+}
+/* Ensure all interactive elements work */
+button, select, input, textarea, .gr-dropdown {
+    pointer-events: auto !important;
+    position: relative !important;
+}
+/* Simple dropdown fix without complex selectors */
+[class*="dropdown"] {
+    position: relative !important;
+    z-index: 999 !important;
+}
+[class*="dropdown"] * {
+    pointer-events: auto !important;
+}
+/* Make sure no overlay blocks clicks */
+.gradio-container .gr-form {
+    position: relative;
+    z-index: 1;
+}
+.gradio-container .gr-block {
+    position: relative;
+    z-index: 1;
+}
+.mobile-button {
+    width: 100% !important;
+    padding: 15px !important;
+    font-size: 1.1em !important;
+    margin: 20px 0 !important;
+    border-radius: 12px !important;
+    background: linear-gradient(135deg, #667eea 0%, #764ba2 100%) !important;
+    border: none !important;
+    color: white !important;
+    font-weight: bold !important;
+    box-shadow: 0 4px 15px rgba(102, 126, 234, 0.3) !important;
+    transition: all 0.3s ease !important;
+    cursor: pointer !important;
+    position: relative !important;
+    overflow: hidden !important;
+}
+.mobile-button:hover {
+    transform: translateY(-2px) !important;
+    box-shadow: 0 8px 25px rgba(102, 126, 234, 0.4) !important;
+    background: linear-gradient(135deg, #5a6fd8 0%, #6b4190 100%) !important;
+}
+.mobile-button:active {
+    transform: translateY(0px) !important;
+    box-shadow: 0 2px 10px rgba(102, 126, 234, 0.3) !important;
+}
+/* Ripple effect for button */
+.mobile-button::before {
+    content: '';
+    position: absolute;
+    top: 50%;
+    left: 50%;
+    width: 0;
+    height: 0;
+    border-radius: 50%;
+    background: rgba(255, 255, 255, 0.3);
+    transform: translate(-50%, -50%);
+    transition: width 0.6s, height 0.6s;
+}
+.mobile-button:active::before {
+    width: 300px;
+    height: 300px;
+}
+/* Loading spinner animation */
+@keyframes spin {
+    0% { transform: rotate(0deg); }
+    100% { transform: rotate(360deg); }
+}
+.loading-spinner {
+    display: inline-block;
+    width: 20px;
+    height: 20px;
+    border: 3px solid rgba(255,255,255,0.3);
+    border-radius: 50%;
+    border-top-color: white;
+    animation: spin 1s ease-in-out infinite;
+    margin-right: 10px;
+}
+/* Button pulse effect when processing */
+@keyframes pulse {
+    0% {
+        box-shadow: 0 4px 15px rgba(102, 126, 234, 0.3);
+    }
+    50% {
+        box-shadow: 0 8px 25px rgba(102, 126, 234, 0.6);
+    }
+    100% {
+        box-shadow: 0 4px 15px rgba(102, 126, 234, 0.3);
+    }
+}
+.button-processing {
+    animation: pulse 2s ease-in-out infinite;
+    background: linear-gradient(135deg, #FF8E53 0%, #FF6B6B 100%) !important;
+}
+.mobile-textbox textarea {
+    border-radius: 10px !important;
+    border: 2px solid #e0e0e0 !important;
+    padding: 12px !important;
+    font-size: 1em !important;
+    line-height: 1.5 !important;
+}
+.mobile-compare textarea {
+    border-radius: 8px !important;
+    border: 1px solid #ddd !important;
+    padding: 10px !important;
+    background: #fafafa !important;
+    font-size: 0.95em !important;
+}
+.mobile-audio {
+    margin: 10px 0 !important;
+    border-radius: 10px !important;
+}
+.mobile-file {
+    margin: 10px 0 !important;
+    border-radius: 10px !important;
+}
+/* Mobile responsive breakpoints */
+@media (max-width: 768px) {
+    .gradio-container {
+        padding: 10px !important;
+    }
+    .input-card {
+        padding: 12px !important;
+        margin: 8px 0 !important;
+    }
+    .output-area {
+        padding: 12px !important;
+        margin: 10px 0 !important;
+    }
+    .examples-section {
+        padding: 12px !important;
+    }
+    .main-header h2 {
+        font-size: 1.5em !important;
+    }
+    .main-header p {
+        font-size: 1em !important;
+    }
+    /* Mobile layout adjustments - less aggressive */
+    .gr-row {
+        flex-direction: column;
+    }
+    .gr-column {
+        width: 100%;
+        margin-bottom: 15px;
+    }
+}
+@media (max-width: 480px) {
+    .gradio-container {
+        padding: 5px !important;
+    }
+    .input-card {
+        padding: 10px !important;
+        margin: 5px 0 !important;
+    }
+    .main-header {
+        padding: 15px !important;
+    }
+    .main-header h2 {
+        font-size: 1.3em !important;
+    }
+    .mobile-button {
+        padding: 12px !important;
+        font-size: 1em !important;
+    }
+}
+/* JavaScript for button interactions */
+"""
+# Add JavaScript for button effects
+js_code = """
+<script>
+function addButtonEffects() {
+    // Find button by class since Gradio might change IDs
+    const buttons = document.querySelectorAll('.mobile-button');
+    buttons.forEach(button => {
+        // This function is re-run periodically, so skip buttons that are already
+        // set up; otherwise the anonymous hover listeners would pile up
+        if (button.dataset.effectsBound === 'true') {
+            return;
+        }
+        button.dataset.effectsBound = 'true';
+        // Add enhanced click effect
+        button.addEventListener('click', handleClick);
+        // Add hover effects for better interaction
+        button.addEventListener('mouseenter', function() {
+            if (!this.disabled) {
+                this.style.transform = 'translateY(-2px) scale(1.02)';
+            }
+        });
+        button.addEventListener('mouseleave', function() {
+            if (!this.disabled) {
+                this.style.transform = 'translateY(0) scale(1)';
+            }
+        });
+    });
+}
+function handleClick(e) {
+    // currentTarget is the button itself even when a child element was clicked
+    const button = e.currentTarget;
+    // Immediate visual feedback
+    button.style.transform = 'scale(0.98)';
+    button.style.transition = 'all 0.1s ease';
+    setTimeout(() => {
+        button.style.transform = 'scale(1)';
+        button.style.transition = 'all 0.3s ease';
+    }, 100);
+    // Add processing state
+    const originalText = button.innerHTML;
+    button.innerHTML = '<span class="loading-spinner"></span>⏳ ĐANG XỬ LÝ...';
+    button.classList.add('button-processing');
+    button.disabled = true;
+    // Monitor for completion and reset
+    let checkCount = 0;
+    const checkInterval = setInterval(() => {
+        checkCount++;
+        // Reset after 15 seconds max or if status changes
+        const statusElements = document.querySelectorAll('[style*="Hoàn thành"]');
+        if (statusElements.length > 0 || checkCount > 50) {
+            clearInterval(checkInterval);
+            button.innerHTML = originalText;
+            button.classList.remove('button-processing');
+            button.disabled = false;
+            button.style.transform = 'scale(1)';
+        }
+    }, 300);
+}
+// Initialize when DOM is ready
+if (document.readyState === 'loading') {
+    document.addEventListener('DOMContentLoaded', addButtonEffects);
+} else {
+    addButtonEffects();
+}
+// Re-initialize periodically for Gradio updates
+setInterval(addButtonEffects, 2000);
+</script>
+"""
+# Create interface with tabs
+with gr.Blocks(css=css, title="🎤 Voice Studio & Audio Translation") as demo:
+    # Header with iframe microphone permissions
+    gr.HTML("""
+    <meta charset="UTF-8">
+    <meta name="viewport" content="width=device-width, initial-scale=1.0">
+    <meta http-equiv="Permissions-Policy" content="microphone=*, camera=*, display-capture=*">
+    <meta http-equiv="Feature-Policy" content="microphone 'self' *; camera 'self' *">
+    <script>
+    // Request microphone permissions for iframe
+    if (window.location !== window.parent.location) {
+        // We're in an iframe
+        console.log('Running in iframe - requesting microphone permissions');
+        // Try to request microphone access
+        if (navigator.mediaDevices && navigator.mediaDevices.getUserMedia) {
+            navigator.mediaDevices.getUserMedia({ audio: true })
+                .then(function(stream) {
+                    console.log('Microphone access granted');
+                    // Stop the stream immediately; we only needed to trigger the permission prompt
+                    stream.getTracks().forEach(track => track.stop());
+                })
+                .catch(function(err) {
+                    console.log('Microphone access denied:', err);
+                    // Show user-friendly message
+                    const message = document.createElement('div');
+                    message.innerHTML = `
+                        <div style="
+                            background: #fff3cd;
+                            color: #856404;
+                            padding: 15px;
+                            border-radius: 8px;
+                            margin: 10px;
+                            border: 1px solid #ffeaa7;
+                            text-align: center;
+                        ">
+                            <strong>⚠️ Microphone Access Required</strong><br>
+                            To use recording features, please allow microphone access or open this app in a new window.
+                            <br><br>
+                            <a href="${window.location.href}" target="_blank" style="
+                                background: #667eea;
+                                color: white;
+                                padding: 8px 16px;
+                                text-decoration: none;
+                                border-radius: 6px;
+                                display: inline-block;
+                                margin-top: 10px;
+                            ">🔗 Open in New Window</a>
+                        </div>
+                    `;
+                    document.body.insertBefore(message, document.body.firstChild);
+                });
+        }
+    }
+    </script>
+    <div style="text-align: center; background: linear-gradient(135deg, #4A90E2 0%, #FF6B9D 100%); color: white; padding: 20px; border-radius: 10px; margin-bottom: 20px;">
+        <h1>🎤 Voice Studio & Audio Translation</h1>
+        <p>Chuyển văn bản thành giọng nói, dịch văn bản và dịch audio sang nhiều ngôn ngữ!</p>
+        <div style="margin-top: 10px; font-size: 14px; opacity: 0.9;">
+            ✨ Tính năng mới: Dịch văn bản trực tiếp trong Voice Studio
+        </div>
+        <div style="margin-top: 8px;">🧠 <strong>Digitized Brains</strong></div>
+    </div>
+    """)
+    with gr.Tabs():
+        # Voice Studio Tab
+        with gr.TabItem("🎤 Voice Studio"):
+            gr.HTML("""
+            <div style="display: flex; justify-content: center; gap: 15px; margin: 20px 0; flex-wrap: wrap;">
+                <div style="background: linear-gradient(135deg, #FF6B6B 0%, #FF8E53 100%); padding: 15px; border-radius: 10px; color: white; text-align: center; min-width: 150px;">
+                    <h4>🇻🇳 Tiếng Việt</h4>
+                    <p style="margin: 0; font-size: 12px;">2 giọng chuẩn</p>
+                    <p style="margin: 0; font-size: 10px;">HoaiMy • NamMinh</p>
+                </div>
+                <div style="background: linear-gradient(135deg, #4ECDC4 0%, #44A08D 100%); padding: 15px; border-radius: 10px; color: white; text-align: center; min-width: 150px;">
+                    <h4>🇺🇸🇬🇧 English</h4>
+                    <p style="margin: 0; font-size: 12px;">4 giọng chuẩn</p>
+                    <p style="margin: 0; font-size: 10px;">US • UK</p>
+                </div>
+                <div style="background: linear-gradient(135deg, #45B7D1 0%, #96C93D 100%); padding: 15px; border-radius: 10px; color: white; text-align: center; min-width: 150px;">
+                    <h4>🌍 Đa ngôn ngữ</h4>
+                    <p style="margin: 0; font-size: 12px;">20 giọng chuẩn</p>
+                    <p style="margin: 0; font-size: 10px;">10 ngôn ngữ</p>
+                </div>
+            </div>
+            """)
+            gr.Markdown("### 📝 Nhập nội dung và chọn giọng nói")
+            with gr.Row():
+                text_input = gr.Textbox(
+                    placeholder="Nhập văn bản cần chuyển thành giọng nói...",
+                    lines=4,
+                    label="Văn bản",
+                    scale=2
+                )
+            with gr.Row():
+                with gr.Column(scale=1):
+                    country_dropdown = gr.Dropdown(
+                        choices=list(voice_choices_by_country.keys()),
+                        value="🇻🇳 Việt Nam",
+                        label="🌍 Chọn quốc gia"
+                    )
+                with gr.Column(scale=1):
+                    voice_dropdown = gr.Dropdown(
+                        choices=voice_choices_by_country["🇻🇳 Việt Nam"],
+                        value="🇻🇳 HoaiMy - Nữ Việt Chuẩn",
+                        label="🎭 Chọn giọng nói"
+                    )
+            with gr.Row():
+                speed_slider = gr.Slider(
+                    minimum=0.5,
+                    maximum=2.0,
+                    value=1.0,
+                    step=0.1,
+                    label="⚡ Tốc độ phát"
+                )
+            # Translation feature
+            with gr.Row():
+                with gr.Column(scale=1):
+                    translate_checkbox = gr.Checkbox(
+                        label="🌍 Dịch văn bản trước khi tạo giọng nói",
+                        value=False
+                    )
+                with gr.Column(scale=2):
+                    translate_btn = gr.Button("🔄 DỊCH VĂN BẢN", variant="secondary", size="lg", visible=False)
+            # Show translated text when translation is enabled
+            translated_text_output = gr.Textbox(
+                label="📝 Văn bản đã dịch",
+                lines=3,
+                interactive=True,
+                visible=False,
+                placeholder="Văn bản sau khi dịch sẽ hiển thị ở đây..."
+            )
+            generate_btn = gr.Button("🎵 TẠO GIỌNG NÓI", variant="primary", size="lg")
+            gr.Markdown("### 🎧 Kết quả âm thanh")
+            audio_output_vs = gr.HTML(
+                value="<p style='text-align: center; color: #666; padding: 40px;'>Nhấn 'TẠO GIỌNG NÓI' để bắt đầu 🎤</p>"
+            )
+            # Examples section
+            gr.Markdown("### 📚 Ví dụ nhanh")
+            with gr.Row():
+                example_vn = gr.Button("🇻🇳 Tiếng Việt", size="sm")
+                example_en = gr.Button("🇺🇸 English", size="sm")
+                example_de = gr.Button("🇩🇪 Deutsch", size="sm")
+                example_translate = gr.Button("🌍 Dịch thuật", size="sm")
+            # Example button functions
+            def load_vn_example():
+                return "Xin chào! Chào mừng bạn đến với studio giọng nói.", "🇻🇳 Việt Nam"
+            def load_en_example():
+                return "Hello! Welcome to our voice studio.", "🇺🇸 Hoa Kỳ"
+            def load_de_example():
+                return "Hallo! Willkommen in unserem Sprachstudio.", "🇩🇪 Đức"
+            def load_translate_example():
+                return "Hello! This is an example text for translation.", "🇺🇸 Hoa Kỳ", True
+            # Translation functions
+            def toggle_translation_ui(translate_enabled):
+                """Show/hide translation UI elements"""
+                return (
+                    gr.update(visible=translate_enabled),  # translate_btn
+                    gr.update(visible=translate_enabled)   # translated_text_output
+                )
+            def translate_text_interface(text, voice_selection):
+                """Translate text for Voice Studio"""
+                if not text.strip():
+                    return "Vui lòng nhập văn bản trước khi dịch"
+                target_language = get_target_language_from_voice(voice_selection)
+                translated = translate_text_with_gemini(text, target_language)
+                return translated
+            def create_voice_with_translation(original_text, translated_text, translate_enabled, voice_selection, speed):
+                """Create voice using original or translated text"""
+                if translate_enabled and translated_text.strip() and not translated_text.startswith("Lỗi"):
+                    # Use translated text
+                    return create_audio_voice_studio(translated_text, voice_selection, speed)
+                else:
+                    # Use original text
+                    return create_audio_voice_studio(original_text, voice_selection, speed)
+            # Event handlers for Voice Studio
+            country_dropdown.change(
+                fn=update_voices,
+                inputs=[country_dropdown],
+                outputs=[voice_dropdown]
+            )
+            example_vn.click(
+                fn=load_vn_example,
+                outputs=[text_input, country_dropdown]
+            )
+            example_en.click(
+                fn=load_en_example,
+                outputs=[text_input, country_dropdown]
+            )
+            example_de.click(
+                fn=load_de_example,
+                outputs=[text_input, country_dropdown]
+            )
+            example_translate.click(
+                fn=load_translate_example,
+                outputs=[text_input, country_dropdown, translate_checkbox]
+            )
+            # Translation UI toggle
+            translate_checkbox.change(
+                fn=toggle_translation_ui,
+                inputs=[translate_checkbox],
+                outputs=[translate_btn, translated_text_output]
+            )
+            # Translation button
+            translate_btn.click(
+                fn=translate_text_interface,
+                inputs=[text_input, voice_dropdown],
+                outputs=[translated_text_output]
+            )
+            # Generate voice with translation support
+            generate_btn.click(
+                fn=create_voice_with_translation,
+                inputs=[text_input, translated_text_output, translate_checkbox, voice_dropdown, speed_slider],
+                outputs=[audio_output_vs]
+            )
+        # Audio Translation Tab
+        with gr.TabItem("🎙️ Audio Translation"):
+            # Colorful feature cards like Voice Studio
+            gr.HTML("""
+            <div style="display: flex; justify-content: center; gap: 15px; margin: 20px 0; flex-wrap: wrap;">
+                <div style="background: linear-gradient(135deg, #FF6B6B 0%, #FF8E53 100%); padding: 15px; border-radius: 10px; color: white; text-align: center; min-width: 150px;">
+                    <h4>🎤 Ghi âm</h4>
+                    <p style="margin: 0; font-size: 12px;">Microphone</p>
+                    <p style="margin: 0; font-size: 10px;">Real-time</p>
+                </div>
+                <div style="background: linear-gradient(135deg, #4ECDC4 0%, #44A08D 100%); padding: 15px; border-radius: 10px; color: white; text-align: center; min-width: 150px;">
+                    <h4>📁 Upload</h4>
+                    <p style="margin: 0; font-size: 12px;">Audio Files</p>
+                    <p style="margin: 0; font-size: 10px;">WAV • MP3</p>
+                </div>
+                <div style="background: linear-gradient(135deg, #45B7D1 0%, #96C93D 100%); padding: 15px; border-radius: 10px; color: white; text-align: center; min-width: 150px;">
+                    <h4>🔄 AI Dịch</h4>
+                    <p style="margin: 0; font-size: 12px;">13 ngôn ngữ</p>
+                    <p style="margin: 0; font-size: 10px;">Gemini 2.0</p>
+                </div>
+                <div style="background: linear-gradient(135deg, #A855F7 0%, #EC4899 100%); padding: 15px; border-radius: 10px; color: white; text-align: center; min-width: 150px;">
+                    <h4>🎵 Tổng hợp</h4>
+                    <p style="margin: 0; font-size: 12px;">Neural TTS</p>
+                    <p style="margin: 0; font-size: 10px;">26 giọng</p>
+                </div>
+            </div>
+            """)
+            # Input section with colorful design
+            gr.HTML("""
+            <div style="
+                background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
+                color: white;
+                padding: 20px;
+                border-radius: 15px;
+                margin: 20px 0;
+                text-align: center;
+                box-shadow: 0 8px 32px rgba(0,0,0,0.2);
+            ">
+                <h3 style="margin: 0 0 10px 0;">🎤 Tải lên file audio hoặc ghi âm trực tiếp</h3>
+                <p style="margin: 0; opacity: 0.9; font-size: 0.95em;">
+                    Hỗ trợ file WAV, MP3 hoặc ghi âm real-time qua microphone
+                </p>
+            </div>
+            """)
+            # Microphone permission notice for iframe
+            gr.HTML("""
+            <div id="microphone-notice" style="
+                background: linear-gradient(135deg, #fff3cd 0%, #ffeaa7 100%);
+                color: #856404;
+                padding: 15px;
+                border-radius: 10px;
+                margin: 15px 0;
+                border: 1px solid #ffeaa7;
+                text-align: center;
+                display: none;
+            ">
+                <strong>🎤 Microphone Access</strong><br>
+                If recording doesn't work, it may be due to iframe restrictions.<br>
+                <a href="#" onclick="window.open(window.location.href, '_blank'); return false;" style="
+                    background: #667eea;
+                    color: white;
+                    padding: 8px 16px;
+                    text-decoration: none;
+                    border-radius: 6px;
+                    display: inline-block;
+                    margin-top: 8px;
+                ">🔗 Open in New Window</a>
+            </div>
+            <script>
+            // Show the notice 2 seconds after load whenever the app runs inside an iframe
+            if (window.location !== window.parent.location) {
+                setTimeout(() => {
+                    const notice = document.getElementById('microphone-notice');
+                    if (notice) notice.style.display = 'block';
+                }, 2000);
+            }
+            </script>
+            """)
+            audio_input = gr.Audio(
+                label="📎 Audio Input",
+                type="filepath",
+                sources=["upload", "microphone"],
+                show_label=False
+            )
+            # Settings section with gradient header
+            gr.HTML("""
+            <div style="
+                background: linear-gradient(135deg, #FF6B6B 0%, #FF8E53 100%);
+                color: white;
+                padding: 18px;
+                border-radius: 12px;
+                margin: 25px 0 20px 0;
+                text-align: center;
+                box-shadow: 0 6px 24px rgba(255,107,107,0.3);
+            ">
+                <h3 style="margin: 0 0 8px 0;">🌍 Cài đặt dịch thuật</h3>
+                <p style="margin: 0; opacity: 0.9; font-size: 0.9em;">
+                    Chọn ngôn ngữ đích và giọng nói cho kết quả dịch thuật
+                </p>
+            </div>
+            """)
+            # Separate dropdowns without complex wrappers to avoid CSS conflicts
+            target_country_dropdown = gr.Dropdown(
+                choices=list(voice_choices_by_country.keys()),
+                value="🇻🇳 Việt Nam",
+                label="🌍 Chọn quốc gia đích"
+            )
+            target_voice_dropdown = gr.Dropdown(
+                choices=voice_choices_by_country["🇻🇳 Việt Nam"],
+                value="🇻🇳 HoaiMy - Nữ Việt Chuẩn",
+                label="🎭 Chọn giọng nói đích"
+            )
+            text_format_dropdown = gr.Dropdown(
+                choices=["TXT (.txt)", "Word (.docx)"] if DOCX_AVAILABLE else ["TXT (.txt)"],
+                value="TXT (.txt)",
+                label="📄 Định dạng file văn bản"
+            )
+            # Colorful action button
+            gr.HTML("""
+            <div style="margin: 25px 0 15px 0; text-align: center;">
+                <div style="
+                    background: linear-gradient(135deg, #4ECDC4 0%, #44A08D 100%);
+                    color: white;
+                    padding: 12px 20px;
+                    border-radius: 8px;
+                    margin-bottom: 15px;
+                    box-shadow: 0 4px 15px rgba(78,205,196,0.3);
+                    display: inline-block;
+                ">
+                    <h4 style="margin: 0; font-size: 1em;">⚡ Sẵn sàng xử lý</h4>
+                </div>
+            </div>
+            """)
+            translate_btn = gr.Button(
+                "🔄 BẮT ĐẦU DỊCH",
+                variant="primary",
+                size="lg",
+                elem_classes=["mobile-button"],
+                elem_id="translate-btn"
+            )
+            # Results section with colorful headers
+            gr.HTML("""
+            <div style="
+                background: linear-gradient(135deg, #45B7D1 0%, #96C93D 100%);
+                color: white;
+                padding: 18px;
+                border-radius: 12px;
+                margin: 30px 0 20px 0;
+                text-align: center;
+                box-shadow: 0 6px 24px rgba(69,183,209,0.3);
+            ">
+                <h3 style="margin: 0 0 8px 0;">📊 Kết quả xử lý</h3>
+                <p style="margin: 0; opacity: 0.9; font-size: 0.9em;">
+                    Phiên âm, dịch thuật và tổng hợp giọng nói
+                </p>
+            </div>
+            """)
+            # Dynamic status indicator
+            status_text = gr.HTML("""
+            <div style="
+                text-align: center;
+                margin: 20px 0;
+                padding: 15px;
+                background: linear-gradient(135deg, #A855F7 0%, #EC4899 100%);
+                border-radius: 12px;
+                color: white;
+                box-shadow: 0 4px 15px rgba(168,85,247,0.3);
+            ">
+                <span style="font-weight: bold; font-size: 1.1em;">
+                    ✅ Sẵn sàng xử lý
+                </span>
+            </div>
+            """)
+            # Card-based layout for mobile
+            with gr.Column(elem_classes=["output-area"]):
+                # Original content card
+                gr.HTML("""
+                <div style="
+                    background: linear-gradient(135deg, #e3f2fd 0%, #bbdefb 100%);
+                    padding: 15px;
+                    border-radius: 12px;
+                    margin: 15px 0;
+                    border-left: 4px solid #2196F3;
+                ">
+                    <h4 style="margin: 0 0 10px 0; color: #1976D2;">📝 Nội dung gốc từ audio</h4>
+                </div>
+                """)
+                transcription_output = gr.Textbox(
+                    label="🎯 Phiên âm từ audio",
+                    lines=4,
+                    interactive=False,
+                    placeholder="Nội dung phiên âm từ file audio sẽ hiển thị ở đây...",
+                    elem_classes=["mobile-textbox"]
+                )
+                detected_language = gr.Textbox(
+                    label="🌐 Ngôn ngữ được phát hiện",
+                    lines=1,
+                    interactive=False,
+                    placeholder="Tự động nhận diện...",
+                    elem_classes=["mobile-textbox"]
+                )
+                # Translation result card
+                gr.HTML("""
+                <div style="
+                    background: linear-gradient(135deg, #e8f5e8 0%, #c8e6c9 100%);
+                    padding: 15px;
+                    border-radius: 12px;
+                    margin: 15px 0;
+                    border-left: 4px solid #4CAF50;
+                ">
+                    <h4 style="margin: 0 0 10px 0; color: #388E3C;">✨ Kết quả dịch thuật</h4>
+                </div>
+                """)
+                translation_output = gr.Textbox(
+                    label="🔄 Nội dung đã dịch",
+                    lines=4,
+                    interactive=False,
+                    placeholder="Bản dịch sẽ hiển thị ở đây...",
+                    elem_classes=["mobile-textbox"]
+                )
+                target_language_display = gr.Textbox(
+                    label="🎯 Ngôn ngữ đích",
+                    lines=1,
+                    interactive=False,
+                    placeholder="Chưa chọn...",
+                    elem_classes=["mobile-textbox"]
+                )
+                # Mobile-friendly comparison section
+                with gr.Accordion("🔍 So sánh nội dung", open=False):
+                    gr.HTML("""
+                    <div style="
+                        text-align: center;
+                        margin-bottom: 15px;
+                        padding: 10px;
+                        background: #f5f5f5;
+                        border-radius: 8px;
+                    ">
+                        <p style="color: #666; font-style: italic; margin: 0;">
+                            Xem nội dung gốc và bản dịch để so sánh
+                        </p>
+                    </div>
+                    """)
+                    # Stack vertically on mobile for better readability
+                    with gr.Column():
+                        gr.HTML("""
+                        <div style="
+                            background: #e3f2fd;
+                            padding: 10px;
+                            border-radius: 8px;
+                            margin: 10px 0;
+                            text-align: center;
+                            font-weight: bold;
+                            color: #1976D2;
+                        ">📝 Ngôn ngữ gốc</div>
+                        """)
+                        original_compare = gr.Textbox(
+                            label="",
+                            lines=4,
+                            interactive=False,
+                            show_label=False,
+                            placeholder="Nội dung phiên âm từ audio sẽ hiển thị ở đây...",
+                            elem_classes=["mobile-compare"]
+                        )
+                        gr.HTML("""
+                        <div style="
+                            background: #e8f5e8;
+                            padding: 10px;
+                            border-radius: 8px;
+                            margin: 15px 0 5px 0;
+                            text-align: center;
+                            font-weight: bold;
+                            color: #388E3C;
+                        ">✨ Sau khi dịch</div>
+                        """)
+                        translated_compare = gr.Textbox(
+                            label="",
+                            lines=4,
+                            interactive=False,
+                            show_label=False,
+                            placeholder="Nội dung sau khi dịch sẽ hiển thị ở đây...",
+                            elem_classes=["mobile-compare"]
+                        )
+                # Mobile-optimized download section
+                with gr.Accordion("💾 Tải xuống kết quả", open=True):
+                    gr.HTML("""
+                    <div style="
+                        background: linear-gradient(135deg, #fff3e0 0%, #ffcc80 100%);
+                        padding: 15px;
+                        border-radius: 12px;
+                        margin: 15px 0;
+                        border-left: 4px solid #FF9800;
+                        text-align: center;
+                    ">
+                        <h4 style="margin: 0 0 10px 0; color: #E65100;">💾 Tải xuống kết quả</h4>
+                        <p style="color: #BF360C; margin: 0; font-style: italic;">
+                            File audio và văn bản đã dịch
+                        </p>
+                    </div>
+                    """)
+                    # Stack downloads vertically for mobile
+                    with gr.Column():
+                        gr.HTML("""
+                        <div style="
+                            background: #e3f2fd;
+                            padding: 12px;
+                            border-radius: 8px;
+                            margin: 15px 0 10px 0;
+                            text-align: center;
+                            font-weight: bold;
+                            color: #1976D2;
+                        ">🔊 Audio đã dịch</div>
+                        """)
+                        audio_output_at = gr.Audio(
+                            label="",
+                            type="filepath",
+                            show_label=False,
+                            elem_classes=["mobile-audio"]
+                        )
+                        gr.HTML("""
+                        <div style="
+                            background: #e8f5e8;
+                            padding: 12px;
+                            border-radius: 8px;
+                            margin: 25px 0 10px 0;
+                            text-align: center;
+                            font-weight: bold;
+                            color: #388E3C;
+                        ">📄 Văn bản đã dịch</div>
+                        """)
+                        text_output = gr.File(
+                            label="",
+                            file_count="single",
+                            file_types=[".txt", ".docx"],
+                            show_label=False,
+                            elem_classes=["mobile-file"]
+                        )
+            # Event handlers for Audio Translation with colorful status
+            def update_status_processing():
+                return """
+                <div style="
+                    text-align: center;
+                    margin: 20px 0;
+                    padding: 15px;
+                    background: linear-gradient(135deg, #FF8E53 0%, #FF6B6B 100%);
+                    border-radius: 12px;
+                    color: white;
+                    box-shadow: 0 4px 15px rgba(255,142,83,0.3);
+                ">
+                    <span style="font-weight: bold; font-size: 1.1em;">
+                        ⏳ Đang xử lý...
+                    </span>
+                </div>
+                """
+            def update_status_complete():
+                return """
+                <div style="
+                    text-align: center;
+                    margin: 20px 0;
+                    padding: 15px;
+                    background: linear-gradient(135deg, #4ECDC4 0%, #44A08D 100%);
+                    border-radius: 12px;
+                    color: white;
+                    box-shadow: 0 4px 15px rgba(78,205,196,0.3);
+                ">
+                    <span style="font-weight: bold; font-size: 1.1em;">
+                        ✅ Hoàn thành!
+                    </span>
+                </div>
+                """
+            target_country_dropdown.change(
+                fn=update_voices,
+                inputs=[target_country_dropdown],
+                outputs=[target_voice_dropdown]
+            )
+            # Update target language display when dropdown changes
+            target_voice_dropdown.change(
+                fn=lambda voice: voice,
+                inputs=[target_voice_dropdown],
+                outputs=[target_language_display]
+            )
+            # Helper function to extract format
+            def get_format_from_dropdown(format_choice):
+                if "Word" in format_choice:
+                    return "docx"
+                return "txt"
+            translate_btn.click(
+                fn=update_status_processing,
+                outputs=[status_text]
+            ).then(
+                fn=lambda audio, country, voice, fmt: translate_audio(audio, country, voice, get_format_from_dropdown(fmt)),
+                inputs=[audio_input, target_country_dropdown, target_voice_dropdown, text_format_dropdown],
+                outputs=[
+                    transcription_output,
+                    detected_language,
+                    translation_output,
+                    target_language_display,
+                    audio_output_at,
+                    original_compare,
+                    translated_compare,
+                    text_output
+                ]
+            ).then(
+                fn=update_status_complete,
+                outputs=[status_text]
+            )
+    # Footer
+    gr.HTML("""
+    <div class="custom-footer">
+        <div style="display: flex; justify-content: center; align-items: center; gap: 15px; flex-wrap: wrap;">
+            <div style="display: flex; align-items: center; gap: 8px;">
+                <div style="background: rgba(255,255,255,0.2); padding: 8px 15px; border-radius: 20px; font-size: 16px;">
+                    🧠 DB
+                </div>
+                <span style="font-size: 18px; font-weight: bold;">Digitized Brains</span>
+            </div>
+            <div style="font-size: 14px; opacity: 0.9;">
+                Voice Studio - AI Powered
+            </div>
+        </div>
+    </div>
+    """)
+    # Add JavaScript for button effects
+    gr.HTML(js_code)
+if __name__ == "__main__":
+    import sys
+    import os
+    # Ensure UTF-8 encoding
+    if sys.platform == 'win32':
+        os.environ['PYTHONIOENCODING'] = 'utf-8'
+    # Gradio runtime settings: disable flagging and keep temp files in /tmp
+    os.environ['GRADIO_ALLOW_FLAGGING'] = 'never'
+    os.environ['GRADIO_TEMP_DIR'] = '/tmp'
+    # Hugging Face Spaces configuration
+    port = int(os.environ.get("GRADIO_SERVER_PORT", 7860))
+    demo.launch(
+        server_name="0.0.0.0",
+        server_port=port,
+        share=False,
+        show_error=True
+    )
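
The launch block above reads the port with `int(os.environ.get("GRADIO_SERVER_PORT", 7860))`, which raises `ValueError` if the variable is set to something non-numeric. A minimal sketch of a more defensive lookup follows. The helper name `resolve_port` and the constant `DEFAULT_PORT` are hypothetical and not part of this commit; only the variable name `GRADIO_SERVER_PORT` and the fallback `7860` come from the code above.

```python
import os

DEFAULT_PORT = 7860  # same fallback value as the launch block (constant name is hypothetical)


def resolve_port(env=None, default=DEFAULT_PORT):
    """Return the port from GRADIO_SERVER_PORT, or the default if it is unusable."""
    # Accept any mapping so the lookup can be exercised without touching os.environ.
    env = os.environ if env is None else env
    raw = env.get("GRADIO_SERVER_PORT", "")
    try:
        port = int(raw)
    except (TypeError, ValueError):
        # Missing, empty, or non-numeric value: fall back to the default.
        return default
    # Reject numbers outside the valid TCP port range.
    return port if 1 <= port <= 65535 else default
```

Under this assumption the launch call could pass `server_port=resolve_port()` instead of the bare `int(...)` conversion, so a malformed environment value falls back to the default rather than stopping the app at startup.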