Spaces: poompengcharoen / typhoon-asr-api


poompengcharoen committed on Oct 7, 2025

Commit

0002e81

1 Parent(s): f04f941

Initial commit

Files changed (3)

README.md +42 -5
app.py +69 -0
requirements.txt +2 -0

README.md CHANGED

@@ -1,12 +1,49 @@
 ---
-title: Typhoon Asr Api
-emoji: 👁
-colorFrom: indigo
-colorTo: blue
+title: Typhoon ASR API
+emoji: 🎤
+colorFrom: blue
+colorTo: purple
 sdk: gradio
 sdk_version: 5.49.0
 app_file: app.py
 pinned: false
 ---
-Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+# Typhoon ASR Real-Time API
+This Space provides a free API for Thai speech recognition using the Typhoon ASR Real-Time model.
+## Features
+- 🎯 **Real-time Thai transcription**
+- ⏱️ **Word-level timestamps**
+- 🎤 **Microphone input support**
+- 📁 **File upload support**
+- 🔄 **API endpoint for external calls**
+## Usage
+1. **Upload an audio file** or **record directly**
+2. **Click "Transcribe"** to get Thai transcription
+3. **View results** with word-level timestamps
+## API Endpoint
+This Space provides an API endpoint that can be called from external applications:
+```
+POST https://YOUR_USERNAME-typhoon-asr-api.hf.space/api/predict
+```
+## Supported Audio Formats
+- WAV, MP3, FLAC, OGG, OPUS
+- Any audio format supported by the Typhoon ASR model
+## Model Information
+- **Model:** Typhoon ASR Real-Time
+- **Language:** Thai
+- **Architecture:** FastConformer-Transducer
+- **Performance:** 4097x real-time processing speed
+- **Accuracy:** CER 0.0984
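
A note on the "API Endpoint" section of the README above: the Space pins `sdk_version: 5.49.0`, and recent Gradio releases generally expose named endpoints through the client libraries rather than a bare `/api/predict` route, so the URL as written may not respond. In addition, `app.py` in this commit never passes `api_transcribe` to Gradio; the handler actually wired to the UI is `transcribe_audio`. Below is a minimal sketch of a call through the `gradio_client` package. The Space id is taken from this page, the endpoint name `/transcribe_audio` is an assumption (Gradio's default of naming an endpoint after its handler function), and `space_base_url` is a hypothetical helper that only mirrors the URL pattern shown in the README.

```python
def space_base_url(owner: str, space: str) -> str:
    """Hypothetical helper: mirror the README's https://<owner>-<space>.hf.space pattern."""
    return f"https://{owner}-{space}.hf.space"


def transcribe_remote(
    audio_path: str,
    space_id: str = "poompengcharoen/typhoon-asr-api",  # taken from this page
    api_name: str = "/transcribe_audio",  # ASSUMPTION: default name from the handler
) -> str:
    """Send a local audio file to the Space and return its Markdown result."""
    # Imported lazily so this module loads even without gradio_client installed.
    from gradio_client import Client, handle_file

    client = Client(space_id)
    # handle_file uploads the local file so the Space receives a usable path.
    return client.predict(handle_file(audio_path), api_name=api_name)


if __name__ == "__main__":
    print(space_base_url("poompengcharoen", "typhoon-asr-api"))
    # Uncomment with a real file; the Space must be awake for this to succeed.
    # print(transcribe_remote("sample.wav"))
```

If the endpoint name differs, `Client(space_id).view_api()` lists the names the running Space actually exposes.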

app.py ADDED

@@ -0,0 +1,69 @@

+import gradio as gr
+from typhoon_asr import transcribe
+import tempfile
+import os
+def transcribe_audio(audio_file):
+    """Transcribe audio file using Typhoon ASR"""
+    if audio_file is None:
+        return "Please upload an audio file"
+    try:
+        # Transcribe using Typhoon ASR
+        result = transcribe(audio_file, with_timestamps=True)
+        # Format the result
+        text = result['text']
+        timestamps = result.get('timestamps', [])
+        # Create formatted output
+        output = f"**Transcription:**\n{text}\n\n"
+        if timestamps:
+            output += "**Word-level Timestamps:**\n"
+            for ts in timestamps:
+                output += f"[{ts['start']:.2f}s - {ts['end']:.2f}s] {ts['word']}\n"
+        return output
+    except Exception as e:
+        return f"Error: {str(e)}"
+# Create Gradio interface
+with gr.Blocks(title="Typhoon ASR API") as demo:
+    gr.Markdown("# 🎤 Typhoon ASR Real-Time Transcription")
+    gr.Markdown("Upload an audio file to get Thai speech transcription with word-level timestamps")
+    with gr.Row():
+        with gr.Column():
+            audio_input = gr.Audio(
+                label="Upload Audio File",
+                type="filepath",
+                sources=["upload", "microphone"]
+            )
+            transcribe_btn = gr.Button("🎯 Transcribe", variant="primary", size="lg")
+        with gr.Column():
+            output = gr.Markdown(label="Transcription Result")
+    # Connect the button to the function
+    transcribe_btn.click(
+        fn=transcribe_audio,
+        inputs=[audio_input],
+        outputs=[output]
+    )
+    # Add examples
+    gr.Examples(
+        examples=[],
+        inputs=[audio_input],
+        label="Example audio files (upload your own)"
+    )
+# For API access - this function can be called externally
+def api_transcribe(audio_file_path):
+    """API endpoint for external calls"""
+    return transcribe_audio(audio_file_path)
+if __name__ == "__main__":
+    demo.launch()
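
The formatting step inside `transcribe_audio` can be checked without loading the model. The sketch below isolates it as a pure function, assuming only what the diff itself uses: a result dict with a `text` key and an optional `timestamps` list whose items carry `start`, `end`, and `word`. The name `format_result` and the sample data are hypothetical and are not output from the Typhoon ASR model.

```python
def format_result(result: dict) -> str:
    """Render a transcription result as Markdown, matching the output built in app.py."""
    lines = ["**Transcription:**", result["text"], ""]
    timestamps = result.get("timestamps", [])
    if timestamps:
        lines.append("**Word-level Timestamps:**")
        for ts in timestamps:
            # Two decimal places for start and end, as in the original f-string.
            lines.append(f"[{ts['start']:.2f}s - {ts['end']:.2f}s] {ts['word']}")
    return "\n".join(lines) + "\n"


# Hypothetical sample shaped like the dict app.py expects from transcribe().
sample = {
    "text": "สวัสดี ครับ",
    "timestamps": [
        {"start": 0.0, "end": 0.5, "word": "สวัสดี"},
        {"start": 0.5, "end": 0.92, "word": "ครับ"},
    ],
}

if __name__ == "__main__":
    print(format_result(sample))
```

Building a list and joining once avoids repeated string concatenation in the loop and keeps the function easy to unit test separately from the Gradio wiring.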

requirements.txt ADDED

@@ -0,0 +1,2 @@
+typhoon-asr
+gradio>=4.0.0