danielrosehill/Claude-Code-Slash-Commands-Linux-Desktop: commands/ai-tools/setup-speech-to-text.md (commit 279efce)

	---
	description: Check installed STT apps and suggest installations including local Whisper
	tags: [ai, stt, whisper, speech-recognition, audio, project, gitignored]
	---

	You are helping the user set up speech-to-text applications including local Whisper.

	## Process

	1. Check currently installed STT apps
	- System packages: `dpkg -l | grep -E "whisper|speech|voice"`
	- Python packages: `pip list | grep -E "whisper|speech|vosk"`
	- Check `~/programs/ai-ml/` for installed apps
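
The three checks in step 1 could be wrapped in one small script. This is a minimal sketch rather than part of the original command: the function names `filter_stt` and `list_stt_candidates` are hypothetical, and each check is skipped when the corresponding tool or directory is absent.

```bash
#!/bin/bash
# Sketch: list installed packages that look STT-related.
# filter_stt and list_stt_candidates are illustrative names.

# Keep only lines mentioning common STT keywords (case-insensitive).
# "|| true" keeps the exit status clean when nothing matches.
filter_stt() {
    grep -Ei "whisper|speech|voice|vosk" || true
}

list_stt_candidates() {
    # System packages (skipped when dpkg is unavailable)
    if command -v dpkg >/dev/null 2>&1; then
        dpkg -l 2>/dev/null | filter_stt
    fi
    # Python packages (skipped when pip is unavailable)
    if command -v pip >/dev/null 2>&1; then
        pip list 2>/dev/null | filter_stt
    fi
    # Locally installed apps, if the directory exists
    if [ -d "$HOME/programs/ai-ml" ]; then
        ls "$HOME/programs/ai-ml"
    fi
}
```

Calling `list_stt_candidates` prints whatever matches; empty output suggests nothing STT-related is installed yet.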

	2. Suggest STT installation candidates

	Whisper (OpenAI) - Recommended:
	- Best quality, local inference
	- Multiple model sizes available
	- Multilingual support

	Other options:
	- Vosk - Lightweight, offline
	- Coqui STT - Community fork of Mozilla DeepSpeech (no longer actively maintained)
	- SpeechNote - Simple GUI
	- Subtitle Edit - Video subtitling
	- Subtitld - Automatic subtitles

	3. Install Whisper (local)

	Method 1: Using pip (simple)
	```bash
	pip install openai-whisper
	```

	Method 2: Using conda (recommended)
	```bash
	conda create -n whisper python=3.11 -y
	conda activate whisper
	pip install openai-whisper
	```

	Install dependencies:
	```bash
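	# ffmpeg is required for audio decoding
	sudo apt install ffmpeg
	# Only needed if the Whisper install fails with "No module named 'setuptools_rust'"
	pip install setuptools-rust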
	```
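
Before installing, it may help to confirm which prerequisites are already on the PATH. The helper below is a sketch under that assumption; `missing_prereqs` is a hypothetical name, not something Whisper provides.

```bash
#!/bin/bash
# Sketch: print each required command that is missing from PATH.
# missing_prereqs is an illustrative helper name.
missing_prereqs() {
    local cmd
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || echo "$cmd"
    done
}
```

For example, `missing_prereqs ffmpeg python3 pip` prints nothing when all three are available, and otherwise lists the ones to install first.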

	4. Install faster-whisper (optimized)
	```bash
	pip install faster-whisper
	```
	- Uses CTranslate2 for faster inference
	- Lower VRAM usage

	5. Install WhisperX (advanced)
	```bash
	pip install whisperx
	```
	- Includes alignment and diarization
	- Better timestamps
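
With three Whisper variants available, a quick check of which ones the active Python environment can see may be useful. The sketch below assumes `python3` is on the PATH and uses the import names `whisper`, `faster_whisper`, and `whisperx`; the function names are hypothetical. It looks modules up without importing them, so it stays fast even when PyTorch is installed.

```bash
#!/bin/bash
# Sketch: check which Whisper variants are visible to the current Python.
# py_has_module and whisper_variants are illustrative names.

# Succeeds when the named top-level module can be located (without importing it).
py_has_module() {
    python3 -c "import importlib.util, sys; sys.exit(0 if importlib.util.find_spec(sys.argv[1]) else 1)" "$1" 2>/dev/null
}

whisper_variants() {
    local mod
    # Import names for openai-whisper, faster-whisper, and whisperx
    for mod in whisper faster_whisper whisperx; do
        if py_has_module "$mod"; then
            echo "$mod: installed"
        else
            echo "$mod: not found"
        fi
    done
}
```

Run `whisper_variants` inside the conda environment from step 3 to check that environment rather than the system Python.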

	6. Download Whisper models
	- Models are downloaded automatically on first use
	- Sizes: tiny, base, small, medium, large
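	- Suggest a model based on available VRAM (the Whisper README lists roughly 1GB for tiny and base, 2GB for small, 5GB for medium, and 10GB for large):
	  - < 4GB: tiny or base
	  - 4-8GB: small or medium
	  - 8GB+: large (tight on an 8GB card with openai-whisper; faster-whisper needs less)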
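
The VRAM tiers above can be turned into a small lookup. This is a sketch under stated assumptions: `suggest_model` and `detect_vram_mib` are hypothetical names, the tier boundaries are taken as 4096 and 8192 MiB, and only the first NVIDIA GPU is considered.

```bash
#!/bin/bash
# Sketch: map available VRAM (in MiB) to the model tiers listed above.
# suggest_model and detect_vram_mib are illustrative names.

suggest_model() {
    local vram_mib="$1"
    if [ "$vram_mib" -lt 4096 ]; then
        echo "tiny or base"
    elif [ "$vram_mib" -lt 8192 ]; then
        echo "small or medium"
    else
        echo "large"
    fi
}

# Total memory of the first NVIDIA GPU in MiB.
# Prints 0 when nvidia-smi is absent or returns nothing.
detect_vram_mib() {
    local v=""
    if command -v nvidia-smi >/dev/null 2>&1; then
        v="$(nvidia-smi --query-gpu=memory.total --format=csv,noheader,nounits 2>/dev/null | head -n 1)"
    fi
    echo "${v:-0}"
}
```

Usage: `suggest_model "$(detect_vram_mib)"`. On a machine without an NVIDIA GPU this falls back to the smallest tier, which is also a reasonable choice for CPU-only inference.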

	7. Test installation
	```bash
	whisper audio.mp3 --model base --language en
	```

	8. Install GUI options

	Whisper Desktop:
	- Check if available as AppImage or Flatpak

	Subtitle Editor (the apt package `subtitleeditor` is a GTK app, a separate project from Subtitle Edit):
	```bash
	sudo apt install subtitleeditor
	```

	Custom GUI:
	- Suggest installing a Gradio-based Whisper UI

	9. Create helper script
	- Offer to create `~/scripts/transcribe.sh`:
	```bash
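	#!/bin/bash
	# Usage: transcribe.sh <audio-file>
	[ -n "${1:-}" ] || { echo "Usage: $0 <audio-file>" >&2; exit 1; }
	whisper "$1" --model medium --language en --output_format txt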
	```

	10. Suggest workflows
	- Real-time transcription
	- Batch processing
	- Video subtitling
	- Meeting transcription
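
As one illustration of the batch-processing workflow, the sketch below loops over the audio files in a directory and sends each to the `whisper` CLI. The function name `transcribe_dir` and the `WHISPER_CMD` override are hypothetical, and the file extensions and model choice are assumptions to adjust.

```bash
#!/bin/bash
# Sketch: batch-transcribe the audio files in one directory.
# transcribe_dir and WHISPER_CMD are illustrative names.

# Command to run per file; override it (e.g. WHISPER_CMD=true) for a dry run.
WHISPER_CMD="${WHISPER_CMD:-whisper}"

transcribe_dir() {
    local dir="$1" out="${2:-$1/transcripts}" f count=0
    mkdir -p "$out"
    for f in "$dir"/*.mp3 "$dir"/*.wav "$dir"/*.m4a; do
        [ -e "$f" ] || continue   # skip patterns that matched nothing
        if "$WHISPER_CMD" "$f" --model medium --language en \
                --output_format txt --output_dir "$out"; then
            count=$((count + 1))
        else
            echo "Failed: $f" >&2
        fi
    done
    echo "Processed $count file(s)"
}
```

Usage: `transcribe_dir ~/recordings` writes text files to `~/recordings/transcripts`. For the video-subtitling workflow, swapping `--output_format txt` for `--output_format srt` produces subtitle files instead.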

	## Output

	Provide a summary showing:
	- Currently installed STT applications
	- Whisper installation status and model sizes
	- GPU acceleration status
	- Suggested models based on hardware
	- Example commands for transcription
	- GUI options available
	- Helper scripts created
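
For the GPU acceleration line of the summary, one possible check is to ask PyTorch, which openai-whisper runs on, whether it can see a CUDA device. This is a sketch: `gpu_status` is a hypothetical name, and it only covers NVIDIA/CUDA setups.

```bash
#!/bin/bash
# Sketch: report whether PyTorch can see a CUDA GPU.
# gpu_status is an illustrative name.
gpu_status() {
    python3 -c 'import torch; print("cuda" if torch.cuda.is_available() else "cpu")' 2>/dev/null \
        || echo "unknown (PyTorch not importable)"
}
```

`gpu_status` prints `cuda`, `cpu`, or the fallback message when PyTorch is not installed in the active environment.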