harshmle committed on
Commit fe82c06 · 1 Parent(s): 35f0708

updated all files

Files changed (2)
  1. README.md +137 -7
  2. app.py +396 -0
README.md CHANGED
@@ -1,13 +1,143 @@
  ---
- title: STT
- emoji: 🚀
- colorFrom: gray
- colorTo: red
  sdk: gradio
- sdk_version: 5.49.1
  app_file: app.py
  pinned: false
- short_description: test
  ---
 
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
  ---
+ title: Ringg STT V0
+ emoji: 🎙️
+ colorFrom: purple
+ colorTo: blue
  sdk: gradio
+ sdk_version: 4.44.0
  app_file: app.py
  pinned: false
+ license: apache-2.0
+ tags:
+ - speech-to-text
+ - asr
+ - bilingual
+ - english
+ - hindi
+ - audio
+ - transcription
+ - ringg
+ - real-time
  ---
 
+ # 🎙️ Ringg STT V0
+
+ **Bilingual Speech-to-Text for English & Hindi**
+
+ [![Hugging Face Spaces](https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Spaces-blue)](https://huggingface.co/spaces/RinggAI/Ringg-STT-V0)
+ [![License](https://img.shields.io/badge/License-Apache%202.0-green.svg)](https://opensource.org/licenses/Apache-2.0)
+
+ ## 🌟 Overview
+
+ Ringg STT V0 is a speech-to-text system that provides real-time transcription for English and Hindi. In the benchmark table it ranks **2nd** on Indic-normalized WER (21.03%), ahead of OpenAI Whisper Large-v3 (29.17%) and Whisper Large-v2 (37.50%).
+
+ ## 📊 Performance Benchmarks
+
+ | Model | Indic Norm WER ↓ | Whisper Norm WER ↓ |
+ |-------|------------------|---------------------|
+ | AI4Bharat | 18.55% | 63.31% |
+ | IndicWav2Vec (Winner) | — | — |
+ | **Ringg STT V0** | **21.03%** | **66.27%** |
+ | VakyanSh Wav2Vec2 | 24.06% | 66.34% |
+ | Whisper Large-v3 | 29.17% | 63.31% |
+ | Whisper Large-v2 | 37.50% | 66.27% |
+
+ **Lower WER (Word Error Rate) indicates better accuracy.** Ringg STT V0 achieves competitive performance while supporting bilingual transcription.
+
+ ## ✨ Features
+
+ - 🌐 **Bilingual Support**: Native support for English and Hindi speech recognition
+ - ⚡ **Real-time Streaming**: Instant transcription as you speak
+ - 🎯 **High Accuracy**: 2nd on Indic-normalized WER among the benchmarked models
+ - 📁 **File Upload**: Support for various audio formats (WAV, MP3, FLAC, M4A, etc.)
+ - 🚀 **Fast Processing**: Optimized for low-latency inference
+ - 💬 **Code-switching**: Handles mixed English-Hindi speech
+
+ ## 🎯 Model Details
+
+ | Specification | Details |
+ |--------------|---------|
+ | **Model Name** | Ringg STT V0 |
+ | **Languages** | English (EN) & Hindi (HI) |
+ | **Performance** | 2nd on Indic-normalized WER among the benchmarked models |
+ | **Sample Rate** | 16 kHz |
+
+
+ ## 🚀 Usage
+
+ ### Real-time Streaming
+ 1. Go to the **"Real-time Streaming"** tab
+ 2. Allow microphone permissions when prompted
+ 3. Start speaking in English or Hindi
+ 4. Watch the transcription appear in real time
+
+ ### File Upload
+ 1. Go to the **"File Upload"** tab
+ 2. Upload your audio file (WAV, MP3, FLAC, M4A, etc.)
+ 3. Click **"Transcribe"**
+ 4. View the transcription result
+
+ ## 💡 Tips for Best Results
+
+ - **Audio Quality**: Use clear audio with minimal background noise
+ - **Speaking Style**: Speak naturally at a moderate pace
+ - **File Format**: A sample rate of 16 kHz or higher is recommended
+ - **Code-switching**: The model handles English-Hindi mixing, but accuracy is best when switches within a sentence are kept to a minimum
+
+ ## 📊 Use Cases
+
+ - 🤖 Voice assistants and chatbots
+ - 📝 Meeting transcription
+ - 🎬 Content creation and subtitling
+ - ♿ Accessibility applications
+ - 🔍 Voice search and commands
+ - 📞 Call center automation
+ - 🎓 Educational tools
+ - 🌍 Multilingual communication
+
+ ## 🔧 Technical Details
+
+ ### Audio Processing
+ - **Input Format**: Mono audio, automatically resampled to 16kHz
+ - **Processing**: Chunked streaming with 3-second buffers
+ - **Latency**: ~2-3 seconds for real-time streaming
+ - **GPU Acceleration**: CUDA-enabled for faster inference
+
+ ### Supported Audio Formats
+ - WAV (PCM, 16-bit, 24-bit, 32-bit)
+ - MP3
+ - FLAC
+ - M4A
+ - OGG
+ - OPUS
+
+ ## 📝 Limitations
+
+ - Works best with clear audio and minimal background noise
+ - Accuracy may vary with strong accents and dialects
+ - Code-switching within sentences may occasionally affect accuracy
+ - Very long audio files may take longer to process
+
+
+ ## 📈 Performance
+
+ - **WER (Word Error Rate)**: Optimized for conversational speech
+ - **RTF (Real-Time Factor)**: < 0.3 on GPU (faster than real-time)
+ - **Languages**: English & Hindi with native support
+
+ ## 🔗 Links
+
+ - **Organization**: [RinggAI on Hugging Face](https://huggingface.co/RinggAI)
+ - **TTS Space**: [Ringg TTS V0](https://huggingface.co/spaces/RinggAI/Ringg-TTS-v0.0)
+
+
+
+
+ ## 👥 Team
+
+ Made with ❤️ by the **RinggAI Team**
+
+ ---
+
+ **Note**: This model is designed for research and development purposes. For production use, please ensure compliance with your local regulations regarding speech processing and data privacy.
+
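The benchmark table in the README diff above reports word error rate (WER) but does not show how it is computed. As a rough illustration only, WER is commonly defined as the word-level edit distance (substitutions, deletions and insertions) divided by the number of reference words. The sketch below assumes plain whitespace tokenization and no text normalization; the `wer` function name is hypothetical, and the Indic and Whisper normalizers named in the table headers are not reproduced, so it would not recreate the table's figures.

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level Levenshtein distance / reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # substitution or exact match
            )
        prev = curr
    return prev[-1] / len(ref)
```

Because the denominator is the reference length, WER can exceed 100% when the hypothesis contains many inserted words, which is one reason scores under different normalizers are not directly comparable.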
app.py ADDED
@@ -0,0 +1,396 @@
+ #!/usr/bin/env python3
+ """
+ Ringg STT V0 - Hugging Face Space (Frontend)
+ Makes API calls to private inference endpoint via ngrok
+ """
+
+ import os
+ import numpy as np
+ import gradio as gr
+ import requests
+ import base64
+ import io
+ from typing import Optional
+
+ # Custom CSS for Ringg branding
+ custom_css = """
+ .gradio-container {
+     font-family: 'Inter', sans-serif;
+ }
+ .main-header {
+     text-align: center;
+     padding: 20px;
+     background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
+     color: white;
+     border-radius: 10px;
+     margin-bottom: 20px;
+ }
+ """
+
+ # Backend API endpoint (ngrok URL)
+ # You can update this via Hugging Face Space Secrets
+ API_ENDPOINT = os.environ.get("STT_API_ENDPOINT", "https://unintuitional-vibrational-jordy.ngrok-free.dev")
+
+ class RinggSTTClient:
+     """Client for Ringg STT API"""
+
+     def __init__(self, api_endpoint: str):
+         self.api_endpoint = api_endpoint.rstrip('/')
+         self.session = requests.Session()
+         self.session.headers.update({
+             'User-Agent': 'RinggSTT-HF-Space/1.0'
+         })
+
+     def check_health(self) -> dict:
+         """Check if the API is available"""
+         try:
+             response = self.session.get(
+                 f"{self.api_endpoint}/health",
+                 timeout=5
+             )
+             if response.status_code == 200:
+                 return {"status": "healthy", "message": "✅ API is online"}
+             else:
+                 return {"status": "error", "message": f"❌ API returned status {response.status_code}"}
+         except requests.exceptions.Timeout:
+             return {"status": "error", "message": "⏱️ API request timed out"}
+         except requests.exceptions.ConnectionError:
+             return {"status": "error", "message": "❌ Cannot connect to API"}
+         except Exception as e:
+             return {"status": "error", "message": f"❌ Error: {str(e)}"}
+
+     def transcribe_audio(self, audio_file_path: str) -> str:
+         """Transcribe audio file via API"""
+         try:
+             # Read audio file and encode as base64
+             with open(audio_file_path, 'rb') as f:
+                 audio_data = f.read()
+
+             audio_base64 = base64.b64encode(audio_data).decode('utf-8')
+
+             # Make API request
+             response = self.session.post(
+                 f"{self.api_endpoint}/transcribe",
+                 json={
+                     "audio_data": audio_base64,
+                     "sample_rate": 16000
+                 },
+                 timeout=30
+             )
+
+             if response.status_code == 200:
+                 result = response.json()
+                 return result.get("transcription", "No transcription received")
+             else:
+                 return f"❌ API Error: {response.status_code} - {response.text}"
+
+         except requests.exceptions.Timeout:
+             return "⏱️ Request timed out. The audio file might be too long."
+         except requests.exceptions.ConnectionError:
+             return "❌ Cannot connect to the transcription service. Please try again later."
+         except Exception as e:
+             return f"❌ Error: {str(e)}"
+
+     def transcribe_streaming(self, audio_chunk: np.ndarray) -> Optional[str]:
+         """Send audio chunk for streaming transcription"""
+         try:
+             # Convert numpy array to base64
+             audio_bytes = audio_chunk.tobytes()
+             audio_base64 = base64.b64encode(audio_bytes).decode('utf-8')
+
+             response = self.session.post(
+                 f"{self.api_endpoint}/transcribe_stream",
+                 json={
+                     "audio_chunk": audio_base64,
+                     "dtype": str(audio_chunk.dtype),
+                     "shape": list(audio_chunk.shape)
+                 },
+                 timeout=10
+             )
+
+             if response.status_code == 200:
+                 result = response.json()
+                 return result.get("transcription")
+             return None
+
+         except Exception as e:
+             print(f"Streaming error: {e}")
+             return None
+
+
+ # Initialize API client
+ print(f"🔗 Connecting to STT API: {API_ENDPOINT}")
+ stt_client = RinggSTTClient(API_ENDPOINT)
+
+ # Check health on startup
+ health_status = stt_client.check_health()
+ print(f"API Health: {health_status}")
+
+
+ def create_interface():
+     """Create Gradio interface"""
+
+     def transcribe_audio(audio_file):
+         """Transcribe uploaded audio"""
+         if audio_file is None:
+             return "Please upload an audio file!"
+
+         return stt_client.transcribe_audio(audio_file)
+
+     def stream_audio(audio, state):
+         """Handle streaming audio"""
+         if audio is None:
+             return "No audio input", state
+
+         try:
+             if state is None:
+                 state = {"transcripts": []}
+
+             if isinstance(audio, tuple):
+                 sample_rate, audio_array = audio
+             else:
+                 audio_array = audio
+                 sample_rate = 16000
+
+             if audio_array is not None and len(audio_array) > 0:
+                 if len(audio_array.shape) > 1:
+                     audio_array = np.mean(audio_array, axis=1)
+
+                 audio_array = audio_array.astype(np.float32)
+                 max_abs = np.max(np.abs(audio_array)) if audio_array.size else 0.0
+                 if max_abs > 1e-6:
+                     audio_array = audio_array / max_abs
+
+                 # Send to API
+                 transcript = stt_client.transcribe_streaming(audio_array)
+
+                 if transcript and transcript.strip():
+                     if not state["transcripts"] or transcript != state["transcripts"][-1]:
+                         state["transcripts"].append(transcript)
+
+             combined = " ".join(state["transcripts"]) if state["transcripts"] else "🎤 Listening..."
+             return combined, state
+
+         except Exception as e:
+             return f"❌ Error: {str(e)}", state
+
+     def check_api_status():
+         """Check API health status"""
+         health = stt_client.check_health()
+         return health["message"]
+
+     # Create interface
+     with gr.Blocks(title="Ringg STT V0", theme=gr.themes.Soft(), css=custom_css) as demo:
+         gr.Markdown("""
+         <div class="main-header">
+         <h1>🎙️ Ringg STT V0</h1>
+         <p>State-of-the-Art Bilingual Speech-to-Text (English & Hindi)</p>
+         </div>
+         """)
+
+         # Performance Comparison Table
+         gr.Markdown("""
+         ## Performance Benchmarks
+
+         Our model achieves **competitive performance** on English-Hindi bilingual speech recognition:
+         """)
+
+         with gr.Row():
+             gr.DataFrame(
+                 value=[
+                     ["AI4Bharat", "18.55%", "63.31%"],
+                     ["IndicWav2Vec (Winner)", "—", "—"],
+                     ["Ringg STT V0", "21.03%", "66.27%"],
+                     ["VakyanSh Wav2Vec2", "24.06%", "66.34%"],
+                     ["Whisper Large-v3", "29.17%", "63.31%"],
+                     ["Whisper Large-v2", "37.50%", "66.27%"],
+                 ],
+                 headers=["Model", "Indic Norm WER ↓", "Whisper Norm WER ↓"],
+                 datatype=["str", "str", "str"],
+                 row_count=6,
+                 col_count=(3, "fixed"),
+                 label="Word Error Rate Comparison (Lower is Better)"
+             )
+
+         gr.Markdown("""
+         **Ringg STT V0** ranks **2nd** on Indic-normalized WER among the models above, ahead of OpenAI Whisper Large-v3 and Whisper Large-v2.
+
+         Lower WER (Word Error Rate) indicates better accuracy. Our model achieves competitive performance while supporting bilingual transcription.
+         """)
+
+         gr.Markdown("""
+         ### ✨ Features
+         - 🌐 **Bilingual Support**: Transcribe English and Hindi speech
+         - ⚡ **Real-time Processing**: Instant transcription as you speak
+         - 🎯 **High Accuracy**: Competitive with leading ASR models
+         - 📁 **File Upload**: Support for various audio formats (WAV, MP3, FLAC, etc.)
+         - 🔒 **Private Infrastructure**: Secure and controlled deployment
+         """)
+
+         # API Status indicator
+         with gr.Row():
+             with gr.Column(scale=4):
+                 api_status = gr.Textbox(
+                     label="🔌 API Status",
+                     value=health_status["message"],
+                     interactive=False
+                 )
+             with gr.Column(scale=1):
+                 check_btn = gr.Button("🔄 Check Status", size="sm")
+                 check_btn.click(check_api_status, outputs=api_status)
+
+         with gr.Tab("🎤 Real-time Streaming"):
+             gr.Markdown("### Live Microphone Transcription")
+             gr.Markdown("Speak into your microphone for real-time transcription in English or Hindi.")
+
+             gr.Markdown("""
+             ⚠️ **Note**: Real-time streaming sends audio chunks to the API endpoint.
+             Make sure your backend service is running and accessible.
+             """)
+
+             mic_input = gr.Audio(
+                 sources=["microphone"],
+                 type="numpy",
+                 streaming=True,
+                 label="🎤 Microphone Input"
+             )
+
+             live_output = gr.Textbox(
+                 label="Live Transcription",
+                 lines=8,
+                 interactive=False,
+                 placeholder="Your transcription will appear here..."
+             )
+
+             session_state = gr.State(lambda: None)
+
+             mic_input.stream(
+                 fn=stream_audio,
+                 inputs=[mic_input, session_state],
+                 outputs=[live_output, session_state],
+                 stream_every=0.5
+             )
+
+         with gr.Tab("📁 File Upload"):
+             gr.Markdown("### Upload Audio File")
+             gr.Markdown("Upload an audio file for transcription (supports WAV, MP3, FLAC, M4A, etc.)")
+
+             audio_input = gr.Audio(
+                 label="📁 Upload Audio File",
+                 type="filepath",
+                 sources=["upload"]
+             )
+
+             transcribe_btn = gr.Button("🔄 Transcribe", variant="primary", size="lg")
+
+             file_output = gr.Textbox(
+                 label="Transcription Result",
+                 lines=8,
+                 interactive=False,
+                 placeholder="Upload a file and click Transcribe..."
+             )
+
+             transcribe_btn.click(
+                 transcribe_audio,
+                 inputs=audio_input,
+                 outputs=file_output
+             )
+
+             gr.Markdown("""
+             ### 💡 Tips for Best Results
+             - Use clear audio with minimal background noise
+             - Speak naturally at a moderate pace
+             - For file upload, ensure audio quality is good (16kHz or higher recommended)
+             - Model handles code-switching between English and Hindi
+             """)
+
+         with gr.Tab("⚙️ Configuration"):
+             gr.Markdown("### API Endpoint Configuration")
+             gr.Markdown(f"""
+             **Current API Endpoint**: `{API_ENDPOINT}`
+
+             The transcription service runs on a private infrastructure and is accessed via a secure API endpoint.
+
+             #### How it Works:
+             1. 🎤 You interact with this Hugging Face Space (frontend)
+             2. 📡 Audio is sent to the private API endpoint
+             3. 🤖 The model processes the audio on secure infrastructure
+             4. 📝 Transcription is returned and displayed
+
+             #### Benefits:
+             - 🔒 **Privacy**: Model and data stay on private infrastructure
+             - ⚡ **Performance**: Dedicated compute resources
+             - 🎯 **Control**: Full control over the model and processing
+             - 💰 **Cost-effective**: Use your own compute resources
+
+             To update the API endpoint, set the `STT_API_ENDPOINT` environment variable in Space Settings.
+             """)
+
+         with gr.Tab("ℹ️ About"):
+             gr.Markdown("""
+             ## About Ringg STT V0
+
+             Ringg STT V0 is a state-of-the-art speech-to-text system for English and Hindi languages.
+
+             ### 🎯 Model Details
+             - **Model**: Ringg STT V0
+             - **Languages**: English (EN) & Hindi (HI)
+             - **Sample Rate**: 16kHz
+             - **Performance**: 2nd place among top bilingual ASR models
+             - **Framework**: PyTorch-based deep learning
+
+             ### 🏗️ Architecture
+
+             This Space uses a **frontend-backend architecture**:
+
+             ```
+             User → HF Space (Frontend) → API Endpoint → Private Server (Model) → Response
+             ```
+
+             - **Frontend**: This Hugging Face Space (Gradio UI)
+             - **Backend**: Private inference server with the actual model
+             - **Connection**: Secure API calls via ngrok/tunnel
+
+             ### 🚀 Key Features
+             - **Bilingual Recognition**: Native support for English and Hindi
+             - **Real-time Streaming**: Low-latency transcription
+             - **High Accuracy**: Optimized for conversational speech
+             - **Flexible Input**: Supports microphone streaming and file upload
+             - **Private Infrastructure**: Model runs on your own infrastructure
+
+             ### 📊 Use Cases
+             - Voice assistants and chatbots
+             - Meeting transcription
+             - Content creation and subtitling
+             - Accessibility applications
+             - Voice search and commands
+
+             ### 🔧 Technical Specifications
+             - **Audio Processing**: 16kHz mono, PCM16
+             - **Latency**: ~2-3 seconds for streaming
+             - **API Protocol**: REST API with base64-encoded audio
+             - **Supported Formats**: WAV, MP3, FLAC, M4A, OGG, OPUS
+
+             ### 📝 Limitations
+             - Requires active backend API endpoint
+             - Works best with clear audio and minimal background noise
+             - Accuracy may vary with accents and dialects
+             - API latency depends on network and backend performance
+
+             ### 🔗 Links
+             - **Organization**: [RinggAI on Hugging Face](https://huggingface.co/RinggAI)
+             - **TTS Space**: [Ringg TTS V0](https://huggingface.co/spaces/RinggAI/Ringg-TTS-v0.0)
+
+             ---
+
+             Made with ❤️ by RinggAI Team
+             """)
+
+     return demo
+
+
+ # Launch the app
+ if __name__ == "__main__":
+     print("🌐 Launching Ringg STT V0 Gradio Interface...")
+     demo = create_interface()
+     demo.launch()
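
`transcribe_streaming` in the diff above posts the raw bytes of a NumPy array together with its `dtype` and `shape`, so whatever serves `/transcribe_stream` has to rebuild the array from those three fields. The backend is not part of this commit, so the following is only a sketch of how that decoding might look: `encode_chunk` mirrors the client-side payload, and `decode_chunk` is a hypothetical server-side helper, not the actual Ringg implementation.

```python
import base64

import numpy as np


def encode_chunk(audio_chunk: np.ndarray) -> dict:
    """Build the same JSON-serializable payload the client sends."""
    return {
        "audio_chunk": base64.b64encode(audio_chunk.tobytes()).decode("utf-8"),
        "dtype": str(audio_chunk.dtype),
        "shape": list(audio_chunk.shape),
    }


def decode_chunk(payload: dict) -> np.ndarray:
    """Hypothetical server-side inverse: rebuild the array from the payload."""
    raw = base64.b64decode(payload["audio_chunk"])
    array = np.frombuffer(raw, dtype=np.dtype(payload["dtype"]))
    # frombuffer returns a read-only view over the bytes; copy so that
    # downstream code can modify the samples in place.
    return array.reshape(payload["shape"]).copy()
```

Note that the streaming payload carries no sample rate, so a backend decoding it this way would have to assume one (the README mentions 16 kHz) or the client would need to send it as an extra field.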