Claude-Code-Slash-Commands / commands /sysadmin /linux-desktop /ai-setup /setup-speech-to-text.md

danielrosehill


description: Check installed STT apps and suggest installations including local Whisper
tags:
  - ai
  - stt
  - whisper
  - speech-recognition
  - audio
  - project
  - gitignored

You are helping the user set up speech-to-text applications including local Whisper.

Process

Check currently installed STT apps
- System packages: `dpkg -l | grep -E "whisper|speech|voice"`
- Python packages (case-insensitive, since names such as SpeechRecognition are capitalized): `pip list | grep -iE "whisper|speech|vosk"`
- Check `~/programs/ai-ml/` for installed apps
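
The checks above can also be wrapped in a small script so the results are easy to summarize. This is a minimal sketch, not a definitive implementation: the function names `check_cmd` and `check_pip_pkg` are hypothetical, and it assumes `pip` refers to the Python environment you care about.

```shell
#!/bin/bash
# Report whether a command-line tool is on PATH.
check_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: installed"
    else
        echo "$1: missing"
    fi
}

# Report whether a Python package is visible to pip (also "missing" if pip itself is absent).
check_pip_pkg() {
    if command -v pip >/dev/null 2>&1 && pip show "$1" >/dev/null 2>&1; then
        echo "$1 (pip): installed"
    else
        echo "$1 (pip): missing"
    fi
}

for cmd in whisper ffmpeg; do check_cmd "$cmd"; done
for pkg in openai-whisper faster-whisper whisperx vosk; do check_pip_pkg "$pkg"; done
```

Packages installed inside a conda environment only show up when that environment is active.
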
Suggest STT installation candidates

Whisper (OpenAI) - Recommended:
- Best quality, local inference
- Multiple model sizes available
- Multilingual support

Other options:
- Vosk - Lightweight, offline
- Coqui STT - Successor to Mozilla's DeepSpeech (no longer actively maintained)
- Speech Note - Simple GUI
- Subtitle Edit - Video subtitling
- Subtitld - Automatic subtitles

Install Whisper (local)

Method 1: Using pip (simple)

pip install openai-whisper

Method 2: Using conda (recommended)

conda create -n whisper python=3.11 -y
conda activate whisper
pip install openai-whisper

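Install dependencies:

# ffmpeg is required for audio decoding
sudo apt install ffmpeg
# Only needed if pip has to build tiktoken from source (no prebuilt wheel for your platform)
pip install setuptools-rust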

Install faster-whisper (optimized)
```
pip install faster-whisper
```
- Uses CTranslate2 for faster inference
- Lower VRAM usage

Install WhisperX (advanced)
```
pip install whisperx
```
- Adds word-level alignment and speaker diarization
- More accurate timestamps than stock Whisper

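Download Whisper models
- Models are downloaded automatically on first use (cached under ~/.cache/whisper by default)
- Sizes: tiny, base, small, medium, large, turbo (recent releases)
- Suggest based on VRAM:
  - < 4GB: tiny or base
  - 4-8GB: small or medium
  - 8GB+: large (the reference openai-whisper implementation needs roughly 10 GB; faster-whisper typically needs less)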
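
The VRAM tiers above can be turned into a small helper. This is a sketch under stated assumptions: `suggest_model` and `detect_vram_mib` are hypothetical names, the cutoffs (4096 and 8192 MiB) mirror the tiers above, the single model chosen per tier is an arbitrary pick, and only NVIDIA GPUs are detected (via `nvidia-smi`).

```shell
#!/bin/bash
# Map total VRAM (in MiB) to a Whisper model size, following the tiers above.
suggest_model() {
    if [ "$1" -lt 4096 ]; then
        echo "base"
    elif [ "$1" -lt 8192 ]; then
        echo "small"
    else
        echo "large"
    fi
}

# Read the VRAM of the first NVIDIA GPU; print 0 (treat as CPU-only) if unavailable.
detect_vram_mib() {
    raw_vram=""
    if command -v nvidia-smi >/dev/null 2>&1; then
        raw_vram="$(nvidia-smi --query-gpu=memory.total --format=csv,noheader,nounits 2>/dev/null | head -n 1)"
    fi
    # Keep only digits; an empty or malformed value becomes 0.
    raw_vram="$(printf '%s' "$raw_vram" | tr -cd '0-9')"
    echo "${raw_vram:-0}"
}

detected_mib="$(detect_vram_mib)"
echo "Detected VRAM: ${detected_mib} MiB, suggested model: $(suggest_model "$detected_mib")"
```

On a machine without an NVIDIA GPU this reports 0 MiB and suggests `base`, which is a reasonable starting point for CPU inference.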

Test installation

whisper audio.mp3 --model base --language en
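
Before or after the test run, it can help to confirm whether PyTorch (which Whisper runs on) can see a GPU. A minimal sketch, assuming `python3` is the interpreter Whisper was installed into; `gpu_status` is a hypothetical helper name.

```shell
#!/bin/bash
# Print "cuda", "cpu", or "unknown" depending on what PyTorch reports.
gpu_status() {
    if ! command -v python3 >/dev/null 2>&1; then
        echo "unknown"
        return
    fi
    # If torch is not importable, python exits non-zero and we fall back to "unknown".
    python3 -c 'import torch; print("cuda" if torch.cuda.is_available() else "cpu")' 2>/dev/null || echo "unknown"
}

echo "Whisper compute device: $(gpu_status)"
```

If this reports `cpu`, the test command still works but will be slower; the `whisper` CLI also accepts `--device cpu` to make that choice explicit.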

Install GUI options

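Whisper Desktop:
- Check if available as AppImage or Flatpak

Subtitle Editor (GTK; note that this apt package is a different project from Subtitle Edit, which is distributed separately):
```
sudo apt install subtitleeditor
```

Custom GUI:
- Suggest installing Gradio-based Whisper UIs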

Create helper script

Offer to create ~/scripts/transcribe.sh:

#!/bin/bash
# Usage: transcribe.sh <audio-file>
[ -n "$1" ] || { echo "Usage: $0 <audio-file>" >&2; exit 1; }
whisper "$1" --model medium --language en --output_format txt

Suggest workflows

- Real-time transcription
- Batch processing
- Video subtitling
- Meeting transcription
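
As an illustration of the batch-processing workflow, the sketch below loops over audio files in a directory. The function name `batch_transcribe`, the `DRY_RUN` variable, and the list of extensions are assumptions made for this example; the `whisper` flags used (`--model`, `--output_format`, `--output_dir`) are the CLI's own.

```shell
#!/bin/bash
# Transcribe every .mp3, .wav, and .m4a file in a directory.
# Set DRY_RUN=1 to print the commands instead of running them.
batch_transcribe() {
    batch_dir="${1:-.}"
    batch_model="${2:-medium}"
    for audio_file in "$batch_dir"/*.mp3 "$batch_dir"/*.wav "$batch_dir"/*.m4a; do
        [ -e "$audio_file" ] || continue   # skip patterns that matched nothing
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "whisper \"$audio_file\" --model $batch_model --output_format txt --output_dir \"$batch_dir\""
        else
            whisper "$audio_file" --model "$batch_model" --output_format txt --output_dir "$batch_dir"
        fi
    done
}
```

For example, `DRY_RUN=1 batch_transcribe ~/recordings base` lists the commands that would run, and dropping `DRY_RUN=1` performs the transcription and writes one `.txt` file per input next to the audio.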

Output

Provide a summary showing:

- Currently installed STT applications
- Whisper installation status and model sizes
- GPU acceleration status
- Suggested models based on hardware
- Example commands for transcription
- GUI options available
- Helper scripts created