Quick Start Guide
Running the Demo Locally
Install dependencies (already done):
cd parakeet-web-demo
npm install
Start development server:
npm run dev
Open browser:
- Navigate to: http://localhost:3000
- Use a WebGPU-compatible browser (Chrome 113+ or Edge 113+)
Use the demo:
- Click "Load Model" (downloads ~2GB ONNX model, one-time only)
- Wait for model to load (30s-2min depending on connection)
- Click "Start Recording" and grant microphone permissions
- Speak and watch real-time progressive transcriptions!
- Click "Stop Recording" when done
What You'll See
Color-Coded Transcription
- Yellow text: Fixed sentences (completed, locked, won't change)
- Cyan text: Active transcription (in-progress, updating in real-time)
Performance Metrics
- Latency: Time to process audio chunk
- RTF (Real-time Factor): Processing speed vs audio duration
- <1.0 = faster than real-time ✓
- >1.0 = slower than real-time ⚠️
- Window State:
- "growing" (0-15s): Accumulating audio for accuracy
- "sliding" (>15s): Smart sentence-aware windowing
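The metrics above can be illustrated with a minimal sketch. It assumes RTF is computed as processing time divided by audio duration (the usual definition, consistent with "<1.0 = faster than real-time") and that the window switches from "growing" to "sliding" once more than 15 seconds of audio have accumulated. The function names and the constant are hypothetical and are not taken from the demo's source.

```javascript
// Hypothetical helpers: names and the 15 s limit mirror the metric
// descriptions in this guide, not the demo's actual implementation.
const GROWING_WINDOW_LIMIT_S = 15;

// RTF = time spent processing / duration of the audio that was processed.
function realTimeFactor(processingMs, audioMs) {
  if (audioMs <= 0) {
    throw new RangeError("audioMs must be positive");
  }
  return processingMs / audioMs;
}

// "growing" while audio is still accumulating, "sliding" after the limit.
function windowState(audioSeconds) {
  return audioSeconds <= GROWING_WINDOW_LIMIT_S ? "growing" : "sliding";
}

// Example: 400 ms spent transcribing a 2000 ms chunk.
const rtf = realTimeFactor(400, 2000);
console.log(rtf < 1.0 ? "faster than real-time" : "slower than real-time");
console.log(windowState(8), windowState(22));
```

An RTF that stays well below 1.0 means the model finishes each chunk before the next one arrives, so the transcription does not fall behind the speaker.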
Browser Requirements
✅ Full Support (WebGPU)
- Chrome 113+
- Edge 113+
⚠️ CPU Fallback
- Firefox (no WebGPU yet)
- Safari (limited support)
Check your browser: https://caniuse.com/webgpu
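To check WebGPU availability from code rather than from a compatibility table, a sketch along these lines may help. It relies on the standard `navigator.gpu.requestAdapter()` call; the `detectBackend` name and the "webgpu"/"cpu" labels are illustrative assumptions, and the demo may select its backend differently.

```javascript
// Sketch: decide between WebGPU and the CPU fallback.
// `nav` defaults to the browser's navigator; passing it in keeps the
// function testable outside a browser. Labels are illustrative only.
async function detectBackend(nav = globalThis.navigator) {
  // Browsers without WebGPU do not expose navigator.gpu at all.
  if (!nav || !nav.gpu) {
    return "cpu";
  }
  try {
    // requestAdapter() resolves to null when no suitable GPU is available.
    const adapter = await nav.gpu.requestAdapter();
    return adapter ? "webgpu" : "cpu";
  } catch {
    return "cpu";
  }
}

detectBackend().then((backend) => console.log(`Using backend: ${backend}`));
```

Checking for an adapter, and not only for `navigator.gpu`, matters because some browsers expose the API while the underlying hardware or driver is unsupported.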
Troubleshooting
Model won't load
- Check internet connection (2GB download)
- Try refreshing the page
- Check browser console for errors
No microphone access
- Grant microphone permissions when prompted
- Check browser settings (Settings → Privacy → Microphone)
Slow performance
- Use Chrome or Edge with WebGPU support
- Close other tabs to free memory
- Check performance metrics - RTF should be <1.0
"Failed to start recording"
- Ensure microphone is connected
- Try using headphones with built-in mic
- Check if another app is using the microphone
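The microphone issues above usually surface as specific error names from the standard `getUserMedia` call. The following sketch maps those names to the hints in this section. The `microphoneHint` and `requestMicrophone` helpers are hypothetical and are not part of the demo's src/utils/audio.js.

```javascript
// Hypothetical helper: translate standard getUserMedia error names
// into the troubleshooting hints listed in this guide.
function microphoneHint(error) {
  switch (error && error.name) {
    case "NotAllowedError":
      return "Grant microphone permission in the prompt or in browser settings.";
    case "NotFoundError":
      return "No microphone found. Ensure a microphone is connected.";
    case "NotReadableError":
      // Often raised when the device is busy or the OS blocks access.
      return "The microphone may be in use by another app.";
    default:
      return "Check the browser console for errors.";
  }
}

// `mediaDevices` is passed in so the function can run outside a browser;
// in the browser, call requestMicrophone(navigator.mediaDevices).
async function requestMicrophone(mediaDevices) {
  try {
    const stream = await mediaDevices.getUserMedia({ audio: true });
    return { stream, hint: null };
  } catch (error) {
    return { stream: null, hint: microphoneHint(error) };
  }
}
```

Showing the returned hint next to the "Start Recording" button would tell the user which of the checks above applies, instead of a generic failure message.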
Building for Production
npm run build
npm run preview
The build output will be in the dist/ folder.
Next Steps
- Read the full README.md for technical details
- Check the implementation plan: ../../../.claude/plans/validated-hugging-book.md
- Compare with Python implementation: ../STT/smart_progressive_streaming.py
Key Files
- src/App.jsx - Main application component
- src/worker.js - Web Worker for model inference
- src/utils/progressive-streaming.js - Smart streaming algorithm (ported from Python)
- src/utils/audio.js - Microphone capture and audio processing
- src/components/TranscriptionDisplay.jsx - Live transcription UI
- src/components/PerformanceMetrics.jsx - Developer metrics dashboard
Enjoy the demo! 🎤