	# Quick Start Guide

	## Running the Demo Locally

	1. Install dependencies (skip this step if they are already installed):
	```bash
	cd parakeet-web-demo
	npm install
	```

	2. Start development server:
	```bash
	npm run dev
	```

	3. Open browser:
	- Navigate to: http://localhost:3000
	- Use a WebGPU-compatible browser (Chrome 113+ or Edge 113+)

	4. Use the demo:
	- Click "Load Model" (downloads a ~2 GB ONNX model; one-time only)
	- Wait for the model to load (30 s to 2 min, depending on your connection)
	- Click "Start Recording" and grant microphone permissions
	- Speak and watch the transcription update progressively in real time
	- Click "Stop Recording" when done
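
For a rough sense of how long the one-time model download takes, the arithmetic can be sketched as below. The helper name `estimateDownloadSeconds` and the sample link speeds are illustrative assumptions, not part of the demo. The size is the ~2 GB figure quoted above, taken here as 2 × 10^9 bytes.

```javascript
// Hypothetical helper, not part of the demo's source.
// Estimates how long a download of `sizeBytes` takes at a given link speed.
function estimateDownloadSeconds(sizeBytes, megabitsPerSecond) {
  if (megabitsPerSecond <= 0) {
    throw new RangeError("megabitsPerSecond must be positive");
  }
  const bits = sizeBytes * 8;
  return bits / (megabitsPerSecond * 1e6);
}

const MODEL_BYTES = 2e9; // assumption: "~2 GB" means roughly 2 x 10^9 bytes

for (const mbps of [50, 200, 500]) {
  const seconds = estimateDownloadSeconds(MODEL_BYTES, mbps);
  console.log(`${mbps} Mbit/s: about ${Math.round(seconds)} s`);
}
```

Under these assumptions, a 30 s to 2 min download corresponds to a link of roughly 130 to 530 Mbit/s. Slower connections will take longer, and the estimate ignores the time needed to initialize the model after the download finishes.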

	## What You'll See

	### Color-Coded Transcription
	- Yellow text: Fixed sentences (completed, locked, won't change)
	- Cyan text: Active transcription (in-progress, updating in real-time)

	### Performance Metrics
	- Latency: Time taken to process each audio chunk
	- RTF (Real-time Factor): Processing time divided by audio duration
	  - <1.0 = faster than real-time ✓
	  - >1.0 = slower than real-time ⚠️
	- Window State:
	  - "growing" (0-15s): Accumulating audio for accuracy
	  - "sliding" (>15s): Smart sentence-aware windowing
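
The two metrics above can be illustrated with a minimal sketch. The function names are assumptions made for this example and are not taken from the demo's source. The 15-second threshold is the one stated in the list above, and RTF is computed with the usual definition: processing time divided by audio duration.

```javascript
// Illustrative sketch only; these names are assumptions, not the demo's API.
const GROWING_LIMIT_SECONDS = 15; // threshold taken from the guide above

// RTF = time spent processing / duration of the audio that was processed.
function realTimeFactor(processingMs, audioMs) {
  if (audioMs <= 0) throw new RangeError("audioMs must be positive");
  return processingMs / audioMs;
}

// "growing" while up to 15 s of audio has accumulated, "sliding" afterwards.
function windowState(elapsedSeconds) {
  return elapsedSeconds <= GROWING_LIMIT_SECONDS ? "growing" : "sliding";
}

// Example: 400 ms spent transcribing a 2000 ms chunk.
const rtf = realTimeFactor(400, 2000);
console.log(rtf.toFixed(2), rtf < 1 ? "faster than real-time" : "slower than real-time");
// prints: 0.20 faster than real-time
console.log(windowState(8), windowState(20));
// prints: growing sliding
```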

	## Browser Requirements

	### ✅ Full Support (WebGPU)
	- Chrome 113+
	- Edge 113+

	### ⚠️ CPU Fallback
	- Firefox (WebGPU availability varies by version and platform)
	- Safari (limited support)

	Check your browser: https://caniuse.com/webgpu
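
To check support from code rather than from a compatibility table, a feature test along these lines is one option. It is a sketch under stated assumptions: `pickBackend` and the `"webgpu"` / `"cpu"` labels are hypothetical names, and the demo may detect support differently. Only `navigator.gpu` and `requestAdapter()` come from the WebGPU API itself.

```javascript
// Sketch: choose a backend label from a navigator-like object.
// `pickBackend` and the returned labels are hypothetical, not the demo's API.
async function pickBackend(nav) {
  // navigator.gpu is only defined in browsers that expose WebGPU.
  if (!nav || !nav.gpu) return "cpu";
  try {
    // requestAdapter() resolves to null when no suitable GPU adapter exists.
    const adapter = await nav.gpu.requestAdapter();
    return adapter ? "webgpu" : "cpu";
  } catch {
    // Treat any failure while probing the GPU as "fall back to CPU".
    return "cpu";
  }
}

// In a browser (not run here):
// pickBackend(navigator).then((backend) => console.log("backend:", backend));
```

Passing the navigator object in as a parameter keeps the function testable outside a browser, for example with a stub object in Node.js.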

	## Troubleshooting

	### Model won't load
	- Check your internet connection (~2 GB download)
	- Try refreshing the page
	- Check browser console for errors

	### No microphone access
	- Grant microphone permissions when prompted
	- Check browser settings (Settings → Privacy → Microphone)

	### Slow performance
	- Use Chrome or Edge with WebGPU support
	- Close other tabs to free memory
	- Check the performance metrics: RTF should be below 1.0

	### "Failed to start recording"
	- Ensure microphone is connected
	- Try using headphones with built-in mic
	- Check if another app is using the microphone
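
When diagnosing microphone problems from the browser console, the error name reported by `getUserMedia()` usually points at the cause. The sketch below maps the standard `DOMException` names to the hints listed above. `explainMicError` is a hypothetical helper written for this guide, not a function from the demo.

```javascript
// Sketch: map getUserMedia() failures to the troubleshooting hints above.
// `explainMicError` is a hypothetical helper; the error names are the standard
// DOMException names that getUserMedia() rejects with.
function explainMicError(err) {
  switch (err && err.name) {
    case "NotAllowedError":
      return "Microphone permission was denied. Allow it in the browser's site settings.";
    case "NotFoundError":
      return "No microphone was found. Check that one is connected.";
    case "NotReadableError":
      return "The microphone could not be read. Another app may be using it.";
    default:
      return "Failed to start recording. Check the browser console for details.";
  }
}

// Browser usage (not run here):
// navigator.mediaDevices.getUserMedia({ audio: true })
//   .catch((err) => console.error(explainMicError(err)));

console.log(explainMicError({ name: "NotFoundError" }));
// prints: No microphone was found. Check that one is connected.
```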

	## Building for Production

	```bash
	npm run build
	npm run preview
	```

	The build output is written to the `dist/` folder.

	## Next Steps

	- Read the full [README.md](README.md) for technical details
	- Check the implementation plan: [../../../.claude/plans/validated-hugging-book.md](../../../.claude/plans/validated-hugging-book.md)
	- Compare with Python implementation: [../STT/smart_progressive_streaming.py](../STT/smart_progressive_streaming.py)

	## Key Files

	- `src/App.jsx` - Main application component
	- `src/worker.js` - Web Worker for model inference
	- `src/utils/progressive-streaming.js` - Smart streaming algorithm (ported from Python)
	- `src/utils/audio.js` - Microphone capture and audio processing
	- `src/components/TranscriptionDisplay.jsx` - Live transcription UI
	- `src/components/PerformanceMetrics.jsx` - Developer metrics dashboard

	Enjoy the demo! 🎤