Rahuluni committed on
Commit 1095508 · 1 Parent(s): 86fcf07

add eng only

Files changed (2)
  1. README.md +30 -6
  2. app.py +18 -10
README.md CHANGED
@@ -4,14 +4,38 @@ sdk: gradio
 emoji: 🚀
 colorFrom: red
 ---
-# Whisper-Small Speech-to-Text (Gradio)
+---
+license: apache-2.0
+sdk: gradio
+emoji: 🚀
+colorFrom: red
+---
+# Whisper-Small Speech-to-English (Gradio)
 
-Drop these files into a new Hugging Face Space (Gradio template):
-- app.py
-- requirements.txt
+Drop these files into a Hugging Face Space (Gradio template):
+- `app.py`
+- `requirements.txt`
 
-The app uses `openai/whisper-small` via Hugging Face Transformers pipeline for CPU-friendly offline transcription.
+This app uses `openai/whisper-small` in translate mode to convert spoken audio into English text (Whisper's `translate` task). The model runs CPU-only by default and is suitable for small/medium audio files.
 
 ## Usage
 - Click the microphone recorder to record or upload an audio file.
-- Click **Transcribe** to get the text.
+- Click **Transcribe** to get English text output (the app translates input speech into English).
+
+## Debug
+Set `DEBUG = True` in `app.py` to enable logging and save resampled WAVs (written to your system temp directory) for inspection.
+
+## Run locally
+```powershell
+# Windows PowerShell
+python -m venv venv_hf
+venv_hf\Scripts\Activate.ps1
+pip install -r requirements.txt
+python app.py
+```
+
+Open the Gradio URL shown in the console (usually http://0.0.0.0:7860).
+
+## Notes
+- The `openai/whisper-small` model runs on CPU and may take time for longer files.
+- For other target languages or lower latency, consider using the Hugging Face Inference API or a separate text translation pipeline.
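The README's Usage section promises English text from whatever the microphone captures, but the diff never shows how the recorded audio reaches Whisper. Gradio's microphone component typically returns a `(sample_rate, int16_array)` tuple, while the transformers ASR pipeline expects a 16 kHz float32 mono array. The sketch below is hypothetical (`to_whisper_input` is an illustrative name, not a function from this repo) and shows the conversion such an app usually performs:

```python
import numpy as np

TARGET_SR = 16000  # Whisper models are trained on 16 kHz audio


def to_whisper_input(sample_rate: int, samples: np.ndarray) -> np.ndarray:
    """Convert a Gradio (sample_rate, int16 array) microphone tuple into the
    16 kHz float32 mono array a transformers ASR pipeline accepts."""
    audio = samples.astype(np.float32)
    if samples.dtype == np.int16:
        audio /= 32768.0  # scale int16 range into [-1.0, 1.0]
    if audio.ndim == 2:  # stereo -> mono by averaging channels
        audio = audio.mean(axis=1)
    if sample_rate != TARGET_SR:
        # naive linear-interpolation resample; librosa/torchaudio do this better
        duration = audio.shape[0] / sample_rate
        n_out = int(duration * TARGET_SR)
        x_old = np.linspace(0.0, duration, num=audio.shape[0], endpoint=False)
        x_new = np.linspace(0.0, duration, num=n_out, endpoint=False)
        audio = np.interp(x_new, x_old, audio).astype(np.float32)
    return audio
```

The resulting array is what the README's Debug section would dump as a resampled WAV for inspection.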
app.py CHANGED
@@ -14,7 +14,14 @@ from transformers import pipeline
 # The model "openai/whisper-small" is public and works on CPU (smaller memory footprint).
 # Loading may take a few seconds at startup.
 ASR_MODEL = "openai/whisper-small"
-asr = pipeline("automatic-speech-recognition", model=ASR_MODEL, chunk_length_s=30, ignore_warning=True)
+# Use Whisper's translate task so output is English regardless of input language
+asr = pipeline(
+    "automatic-speech-recognition",
+    model=ASR_MODEL,
+    chunk_length_s=30,
+    ignore_warning=True,
+    generate_kwargs={"task": "translate"},
+)
 
 # Debug flag: set True to print audio shapes/dtypes and save resampled temp WAVs
 DEBUG = False
@@ -183,12 +190,13 @@ def transcribe(audio):
 def clear_audio():
     return None, ""
 
-with gr.Blocks(title="Whisper Tiny Speech-to-Text (Free on HF Spaces)") as demo:
+
+with gr.Blocks(title="Whisper-Small Speech-to-English") as demo:
     gr.Markdown(
         """
-        # 🎙️ Whisper-Small Speech-to-Text
-        Record or upload audio and click **Transcribe**.
-        Uses the `openai/whisper-small` model (runs CPU-only).
+        # 🎙️ Whisper-Small Speech-to-English
+        Record or upload audio and click **Transcribe**.
+        This app uses `openai/whisper-small` in translate mode and returns English text.
         """
     )
 
@@ -227,14 +235,14 @@ with gr.Blocks(title="Whisper Tiny Speech-to-Text (Free on HF Spaces)") as demo:
 
     # Copy transcript to clipboard (Gradio has `copy` action for buttons)
     copy_btn.click(
-        fn=lambda txt: txt,
-        inputs=transcript,
-        outputs=None
+        fn=lambda txt: txt,
+        inputs=transcript,
+        outputs=None,
     )
 
     gr.Markdown(
-        "Notes: Small model runs on CPU but will still take a bit of time for longer files. "
-        "If you need translation to English or better latency, consider smaller models or the HF Inference API."
+        "Notes: The app translates spoken audio to English using Whisper (translate task). "
+        "Small model runs on CPU and may take time for longer files. For lower latency or other target languages, consider the HF Inference API or additional translation pipelines."
     )
 
 if __name__ == "__main__":
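The pipeline configuration this commit introduces can be exercised outside the Gradio UI roughly as follows. This is a sketch, not code from the repo: the `tone` helper is purely illustrative, the model (about 1 GB) downloads on first use, so the heavy import and the call sit under a main guard, and the commit's `ignore_warning=True` is omitted here since support for it varies across transformers versions.

```python
import numpy as np


def tone(freq_hz: float, seconds: float, sr: int = 16000) -> np.ndarray:
    """Generate a quiet float32 sine tone -- a stand-in for recorded speech."""
    t = np.arange(int(seconds * sr)) / sr
    return (0.1 * np.sin(2 * np.pi * freq_hz * t)).astype(np.float32)


if __name__ == "__main__":
    from transformers import pipeline  # heavy import: pulls in torch

    # Same configuration as the commit: the translate task forces English output.
    asr = pipeline(
        "automatic-speech-recognition",
        model="openai/whisper-small",
        chunk_length_s=30,
        generate_kwargs={"task": "translate"},
    )
    # The pipeline accepts a dict with a raw waveform and its sampling rate.
    result = asr({"raw": tone(440.0, 2.0), "sampling_rate": 16000})
    print(result["text"])
```

Feeding a pure tone will of course produce empty or junk text; substitute a real 16 kHz recording to see the translation behavior described in the README.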