| # Quick Start Guide | |
| ## Running the Demo Locally | |
| 1. **Install dependencies** (skip if already installed): | |
| ```bash | |
| cd parakeet-web-demo | |
| npm install | |
| ``` | |
| 2. **Start development server**: | |
| ```bash | |
| npm run dev | |
| ``` | |
| 3. **Open browser**: | |
| - Navigate to: http://localhost:3000 | |
| - Use a WebGPU-compatible browser (Chrome 113+ or Edge 113+) | |
| 4. **Use the demo**: | |
| - Click "Load Model" (downloads a ~2 GB ONNX model; one-time download) | |
| - Wait for the model to load (30 s to 2 min, depending on your connection) | |
| - Click "Start Recording" and grant microphone permissions | |
| - Speak and watch real-time progressive transcriptions! | |
| - Click "Stop Recording" when done | |
| ## What You'll See | |
| ### Color-Coded Transcription | |
| - **Yellow text**: Fixed sentences (completed, locked, won't change) | |
| - **Cyan text**: Active transcription (in-progress, updating in real-time) | |
| ### Performance Metrics | |
| - **Latency**: Time taken to process one audio chunk | |
| - **RTF (Real-Time Factor)**: Processing time divided by the duration of the audio processed | |
| - <1.0 = faster than real-time ✓ | |
| - >1.0 = slower than real-time ⚠️ | |
| - **Window State**: | |
| - "growing" (0-15s): Accumulating audio for accuracy | |
| - "sliding" (>15s): Smart sentence-aware windowing | |
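As a rough illustration of how these metrics relate, the sketch below computes RTF from a processing time and an audio duration, and derives the window state from the 15-second boundary listed above. The function names and the constant are hypothetical and chosen for this example; they are not taken from the demo's source.

```javascript
// Assumption: the 15 s boundary is taken from the "Window State" list above.
// All names here are illustrative, not the demo's actual API.
const GROWING_WINDOW_LIMIT_S = 15;

// RTF = time spent processing / duration of the audio that was processed.
function realTimeFactor(processingMs, audioMs) {
  if (audioMs <= 0) {
    throw new RangeError("audioMs must be positive");
  }
  return processingMs / audioMs;
}

// "growing" while up to 15 s of audio has accumulated, "sliding" beyond that.
function windowState(audioSeconds) {
  return audioSeconds <= GROWING_WINDOW_LIMIT_S ? "growing" : "sliding";
}

// Example: a 2 s chunk that took 400 ms to process.
const rtf = realTimeFactor(400, 2000);
console.log(rtf.toFixed(2), rtf < 1 ? "faster than real-time" : "slower than real-time");
console.log(windowState(8), windowState(22));
```

With these inputs the RTF is 0.20, so the chunk was processed five times faster than it was spoken.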
| ## Browser Requirements | |
| ### ✅ Full Support (WebGPU) | |
| - Chrome 113+ | |
| - Edge 113+ | |
| ### ⚠️ CPU Fallback | |
| - Firefox (no WebGPU yet) | |
| - Safari (limited support) | |
| Check your browser: https://caniuse.com/webgpu | |
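To check support from code rather than from a compatibility table, a page can probe for `navigator.gpu` and request an adapter. The following is a minimal sketch of that feature detection; the function name `detectBackend` and the returned labels `"webgpu"` and `"cpu"` are assumptions made for this example and may not match how the demo selects its backend.

```javascript
// Returns "webgpu" when a GPU adapter can be obtained, otherwise "cpu".
// `nav` defaults to the global navigator; passing it in keeps the function
// usable outside a browser (hypothetical helper, not part of the demo).
async function detectBackend(nav = globalThis.navigator) {
  if (!nav || !nav.gpu) {
    return "cpu"; // WebGPU API is not exposed at all
  }
  try {
    // requestAdapter() resolves to null when no suitable adapter exists.
    const adapter = await nav.gpu.requestAdapter();
    return adapter ? "webgpu" : "cpu";
  } catch {
    return "cpu"; // treat any failure as "no WebGPU"
  }
}

detectBackend().then((backend) => console.log(`Inference backend: ${backend}`));
```

Requesting an adapter matters because a browser can expose `navigator.gpu` and still have no usable GPU, for example when hardware acceleration is disabled.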
| ## Troubleshooting | |
| ### Model won't load | |
| - Check your internet connection (the model is a ~2 GB download) | |
| - Try refreshing the page | |
| - Check browser console for errors | |
| ### No microphone access | |
| - Grant microphone permissions when prompted | |
| - Check browser settings (Settings → Privacy → Microphone) | |
| ### Slow performance | |
| - Use Chrome or Edge with WebGPU support | |
| - Close other tabs to free memory | |
| - Check the performance metrics: RTF should stay below 1.0 | |
| ### "Failed to start recording" | |
| - Ensure microphone is connected | |
| - Try using headphones with built-in mic | |
| - Check if another app is using the microphone | |
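The microphone problems above usually surface as a rejected `getUserMedia` call, and the error's `name` indicates which case applies. The sketch below maps the standard error names to the hints in this section; `describeMicError` and `openMicrophone` are hypothetical helpers written for this example, not functions from `src/utils/audio.js`.

```javascript
// Maps a getUserMedia rejection to a troubleshooting hint.
// The error names are the standard DOMException names for getUserMedia.
function describeMicError(err) {
  switch (err && err.name) {
    case "NotAllowedError":
      return "Microphone permission was denied. Allow access in the browser's site settings.";
    case "NotFoundError":
      return "No microphone was found. Check that one is connected.";
    case "NotReadableError":
      return "The microphone is unavailable. Another app may be using it.";
    default:
      return `Failed to start recording: ${err && err.message ? err.message : "unknown error"}`;
  }
}

// Requests an audio-only stream and rethrows failures with a readable message.
// `mediaDevices` is injectable so the function can be exercised without a browser.
async function openMicrophone(mediaDevices = globalThis.navigator?.mediaDevices) {
  if (!mediaDevices?.getUserMedia) {
    throw new Error("getUserMedia is not available in this environment");
  }
  try {
    return await mediaDevices.getUserMedia({ audio: true });
  } catch (err) {
    throw new Error(describeMicError(err), { cause: err });
  }
}
```

In a browser, `openMicrophone()` would be called from the "Start Recording" click handler, since the permission prompt is tied to a user gesture in most browsers.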
| ## Building for Production | |
| ```bash | |
| npm run build | |
| npm run preview | |
| ``` | |
| The build output is written to the `dist/` folder. | |
| ## Next Steps | |
| - Read the full [README.md](README.md) for technical details | |
| - Check the implementation plan: [../../../.claude/plans/validated-hugging-book.md](../../../.claude/plans/validated-hugging-book.md) | |
| - Compare with Python implementation: [../STT/smart_progressive_streaming.py](../STT/smart_progressive_streaming.py) | |
| ## Key Files | |
| - `src/App.jsx` - Main application component | |
| - `src/worker.js` - Web Worker for model inference | |
| - `src/utils/progressive-streaming.js` - Smart streaming algorithm (ported from Python) | |
| - `src/utils/audio.js` - Microphone capture and audio processing | |
| - `src/components/TranscriptionDisplay.jsx` - Live transcription UI | |
| - `src/components/PerformanceMetrics.jsx` - Developer metrics dashboard | |
| Enjoy the demo! 🎤 | |