# Quick Start Guide
## Running the Demo Locally
1. **Install dependencies** (skip if already installed):
```bash
cd parakeet-web-demo
npm install
```
2. **Start development server**:
```bash
npm run dev
```
3. **Open browser**:
   - Navigate to: http://localhost:3000
   - Use a WebGPU-compatible browser (Chrome 113+ or Edge 113+)
4. **Use the demo**:
   - Click "Load Model" (downloads ~2GB ONNX model, one-time only)
   - Wait for model to load (30s-2min depending on connection)
   - Click "Start Recording" and grant microphone permissions
   - Speak and watch real-time progressive transcriptions!
   - Click "Stop Recording" when done
## What You'll See
### Color-Coded Transcription
- **Yellow text**: Fixed sentences (completed, locked, won't change)
- **Cyan text**: Active transcription (in-progress, updating in real-time)
### Performance Metrics
- **Latency**: Time taken to process one audio chunk
- **RTF (Real-time Factor)**: Processing time divided by audio duration
  - `<1.0` = faster than real-time ✓
  - `>1.0` = slower than real-time ⚠️
- **Window State**:
  - "growing" (0-15s): Accumulating audio for accuracy
  - "sliding" (>15s): Smart sentence-aware windowing
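
As a rough illustration of how these metrics relate, here is a minimal sketch. The function names and the constant are hypothetical and are not taken from the demo's source; only the 15-second boundary and the meaning of RTF come from the descriptions above.

```javascript
// Sketch only: names are illustrative, not the demo's actual API.
const GROWING_WINDOW_SECONDS = 15;

// RTF = time spent processing / duration of the audio that was processed.
// Values below 1.0 mean the model keeps up with live speech.
function realTimeFactor(processingMs, audioMs) {
  if (audioMs <= 0) throw new RangeError("audioMs must be positive");
  return processingMs / audioMs;
}

// "growing" while up to 15 s of audio is buffered, "sliding" after that.
function windowState(bufferedSeconds) {
  return bufferedSeconds <= GROWING_WINDOW_SECONDS ? "growing" : "sliding";
}

console.log(realTimeFactor(500, 2000)); // 0.25 (faster than real-time)
console.log(windowState(20)); // sliding
```

For example, a chunk of 2 seconds of audio processed in 500 ms gives an RTF of 0.25, well within real-time.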
## Browser Requirements
### ✅ Full Support (WebGPU)
- Chrome 113+
- Edge 113+
### ⚠️ CPU Fallback
- Firefox (no WebGPU yet)
- Safari (limited support)
Check your browser: https://caniuse.com/webgpu
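
You can also check for WebGPU from the browser console. The sketch below assumes a helper named `pickBackend` and the labels `"webgpu"` and `"wasm"`, all of which are hypothetical; the demo may detect and name its backends differently. It relies only on the standard `navigator.gpu.requestAdapter()` call, which resolves to `null` when no suitable adapter is available.

```javascript
// Sketch: choose a backend from a navigator-like object.
// pickBackend and the returned labels are illustrative assumptions.
async function pickBackend(nav) {
  // No navigator.gpu means the browser does not expose WebGPU at all.
  if (!nav || !nav.gpu) return "wasm";
  try {
    // requestAdapter() resolves to null if no usable GPU adapter exists.
    const adapter = await nav.gpu.requestAdapter();
    return adapter ? "webgpu" : "wasm";
  } catch {
    return "wasm";
  }
}

// In a browser console:
// pickBackend(navigator).then((backend) => console.log(backend));
```

Checking for an adapter, rather than only for `navigator.gpu`, matters because some browsers expose the API object on hardware where no adapter can be obtained.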
## Troubleshooting
### Model won't load
- Check your internet connection (the model is a ~2GB download)
- Try refreshing the page
- Check browser console for errors
### No microphone access
- Grant microphone permissions when prompted
- Check browser settings (Settings → Privacy → Microphone)
### Slow performance
- Use Chrome or Edge with WebGPU support
- Close other tabs to free memory
- Check the performance metrics: RTF should be below 1.0
### "Failed to start recording"
- Ensure microphone is connected
- Try using headphones with built-in mic
- Check if another app is using the microphone
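
When debugging microphone problems, the error name reported by `getUserMedia` usually points to which of the checks above applies. The following is a minimal sketch, assuming a hypothetical helper `explainMicError` that is not part of the demo; the error names are the standard ones browsers report for this call.

```javascript
// Sketch: map getUserMedia failures to the troubleshooting steps above.
// explainMicError is a hypothetical helper, not part of the demo.
function explainMicError(err) {
  switch (err && err.name) {
    case "NotAllowedError":
      return "Microphone permission was denied; allow it in the browser settings.";
    case "NotFoundError":
      return "No microphone was found; connect one and try again.";
    case "NotReadableError":
      return "The microphone is unavailable; another app may be using it.";
    default:
      return "Failed to start recording; check the browser console for details.";
  }
}

// Browser usage:
// navigator.mediaDevices.getUserMedia({ audio: true })
//   .catch((err) => console.warn(explainMicError(err)));
```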
## Building for Production
```bash
npm run build
npm run preview
```
The build output is written to the `dist/` folder.
## Next Steps
- Read the full [README.md](README.md) for technical details
- Check the implementation plan: [../../../.claude/plans/validated-hugging-book.md](../../../.claude/plans/validated-hugging-book.md)
- Compare with Python implementation: [../STT/smart_progressive_streaming.py](../STT/smart_progressive_streaming.py)
## Key Files
- `src/App.jsx` - Main application component
- `src/worker.js` - Web Worker for model inference
- `src/utils/progressive-streaming.js` - Smart streaming algorithm (ported from Python)
- `src/utils/audio.js` - Microphone capture and audio processing
- `src/components/TranscriptionDisplay.jsx` - Live transcription UI
- `src/components/PerformanceMetrics.jsx` - Developer metrics dashboard
Enjoy the demo! 🎤