---
license: apache-2.0
sdk: gradio
emoji: 🚀
colorFrom: red
---
# Whisper-Small Speech-to-English (Gradio)

Drop these files into a Hugging Face Space (Gradio template):
- `app.py`
- `requirements.txt`

This app uses `openai/whisper-small` with Whisper's `translate` task to convert spoken audio into English text. The model runs on CPU by default and is best suited to short and medium-length audio files.
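
As a rough illustration of what translate mode looks like in code, here is a minimal sketch using the `transformers` ASR pipeline. It is not the contents of `app.py`: the helper names `build_translator` and `speech_to_english` and the 30-second chunk length are assumptions made for this example. Decoding a file path through the pipeline also requires `ffmpeg` to be installed.

```python
# Hypothetical helpers for illustration; app.py may be organised differently.
GENERATE_KWARGS = {"task": "translate"}  # ask Whisper for English output


def build_translator(model_id: str = "openai/whisper-small"):
    """Create a CPU-only speech recognition pipeline for the given model."""
    # Imported lazily so this module can be loaded without transformers installed.
    from transformers import pipeline

    return pipeline(
        "automatic-speech-recognition",
        model=model_id,
        device=-1,          # -1 selects the CPU
        chunk_length_s=30,  # assumed value: split long audio into 30 s chunks
    )


def speech_to_english(asr, audio_path: str) -> str:
    """Run the pipeline on one audio file and return the English text."""
    result = asr(audio_path, generate_kwargs=GENERATE_KWARGS)
    return result["text"].strip()
```

Typical use would be `asr = build_translator()` once at startup, then `speech_to_english(asr, "clip.wav")` per request, so the model is loaded only once.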

## Usage
- Record audio with the microphone recorder, or upload an audio file.
- Click **Transcribe** to get the English text (the app translates the input speech into English).
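
The two steps above could be wired up in Gradio roughly as follows. This is a sketch under assumptions, not the actual `app.py`: `make_handler` and `build_demo` are hypothetical names, and the `sources=[...]` argument assumes Gradio 4 or later (Gradio 3 used a single `source` argument instead).

```python
def make_handler(translate_fn):
    """Wrap a path -> text function so an empty input gives a friendly message."""
    def handler(audio_path):
        if not audio_path:
            return "No audio provided. Record or upload a file first."
        return translate_fn(audio_path)
    return handler


def build_demo(translate_fn):
    """Build a small UI: audio input, a Transcribe button, and a text output."""
    # Imported lazily so the handler can be used and tested without Gradio.
    import gradio as gr

    with gr.Blocks() as demo:
        audio = gr.Audio(
            sources=["microphone", "upload"],  # assumes Gradio >= 4
            type="filepath",                   # the handler receives a file path
            label="Audio",
        )
        button = gr.Button("Transcribe")
        output = gr.Textbox(label="English text")
        button.click(fn=make_handler(translate_fn), inputs=audio, outputs=output)
    return demo
```

Calling `build_demo(my_translate_fn).launch()` would then start the app, where `my_translate_fn` is any function that takes an audio file path and returns text.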

## Debug
Set `DEBUG = True` in `app.py` to enable logging and save resampled WAVs (written to your system temp directory) for inspection.
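
To show the general idea of a debug dump, the sketch below resamples mono audio to 16 kHz (the rate Whisper expects) and writes it to the system temp directory as a 16-bit WAV. Everything here is an assumption for illustration: `resample_linear` and `save_debug_wav` are hypothetical names, and the plain linear interpolation (no anti-aliasing filter) stands in for whatever resampler `app.py` really uses.

```python
import logging
import os
import struct
import tempfile
import wave

DEBUG = True
TARGET_RATE = 16_000  # Whisper models expect 16 kHz mono input
log = logging.getLogger(__name__)


def resample_linear(samples, src_rate, dst_rate=TARGET_RATE):
    """Resample mono float samples by linear interpolation (no low-pass filter)."""
    if src_rate == dst_rate or not samples:
        return list(samples)
    n_out = max(1, round(len(samples) * dst_rate / src_rate))
    step = src_rate / dst_rate
    last = len(samples) - 1
    out = []
    for i in range(n_out):
        pos = i * step
        lo = min(int(pos), last)
        hi = min(lo + 1, last)
        frac = pos - lo
        out.append(samples[lo] * (1 - frac) + samples[hi] * frac)
    return out


def save_debug_wav(samples, rate=TARGET_RATE):
    """Write samples in [-1, 1] to a temp WAV file; return its path, or None."""
    if not DEBUG:
        return None
    # Clip to [-1, 1] and convert to signed 16-bit integers.
    pcm = [int(max(-1.0, min(1.0, s)) * 32767) for s in samples]
    fd, path = tempfile.mkstemp(prefix="whisper_debug_", suffix=".wav")
    os.close(fd)
    with wave.open(path, "wb") as wf:
        wf.setnchannels(1)   # mono
        wf.setsampwidth(2)   # 16-bit PCM
        wf.setframerate(rate)
        wf.writeframes(struct.pack(f"<{len(pcm)}h", *pcm))
    log.debug("Saved resampled audio to %s", path)
    return path
```

Opening the saved file in any audio player is a quick way to confirm that the audio reaching the model is not silent, clipped, or played at the wrong speed.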

## Run locally
```powershell
# Windows PowerShell
python -m venv venv_hf
venv_hf\Scripts\Activate.ps1
pip install -r requirements.txt
python app.py
```

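Open the Gradio URL shown in the console (usually http://127.0.0.1:7860). If the console shows http://0.0.0.0:7860, open http://localhost:7860 in your browser instead, since `0.0.0.0` is a bind address, not one you can browse to on Windows.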

## Notes
- Because `openai/whisper-small` runs on CPU here, longer files can take a while to process.
- For target languages other than English, or for lower latency, consider using the Hugging Face Inference API or a separate text translation pipeline.