
Quick Start Guide

Running the Demo Locally

  1. Install dependencies (skip if already installed):

    cd parakeet-web-demo
    npm install
    
  2. Start development server:

    npm run dev
    
  3. Open the local URL printed by the dev server in your browser.

  4. Use the demo:

    • Click "Load Model" (downloads ~2GB ONNX model, one-time only)
    • Wait for model to load (30s-2min depending on connection)
    • Click "Start Recording" and grant microphone permissions
    • Speak and watch real-time progressive transcriptions!
    • Click "Stop Recording" when done

What You'll See

Color-Coded Transcription

  • Yellow text: Fixed sentences (completed, locked, won't change)
  • Cyan text: Active transcription (in-progress, updating in real-time)

Performance Metrics

  • Latency: Time taken to process one audio chunk
  • RTF (Real-time Factor): Processing time divided by audio duration
    • <1.0 = faster than real-time ✓
    • >1.0 = slower than real-time ⚠️

  • Window State:
    • "growing" (0-15s): Accumulating audio for accuracy
    • "sliding" (>15s): Smart sentence-aware windowing
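
As a rough illustration of the metrics above, the sketch below computes RTF and classifies the window state. The function names and the constant are hypothetical and not taken from the demo's source; only the 15-second boundary and the <1.0 threshold come from this guide.

```javascript
// Hypothetical helpers for illustration; not the demo's actual code.
// The 15 s boundary between "growing" and "sliding" is taken from this guide.
const GROWING_WINDOW_SECONDS = 15;

// RTF = processing time / audio duration (both in seconds).
function realTimeFactor(processingSeconds, audioSeconds) {
  if (audioSeconds <= 0) {
    throw new RangeError("audioSeconds must be positive");
  }
  return processingSeconds / audioSeconds;
}

// Classify the window by how much audio has been recorded so far.
function windowState(elapsedSeconds) {
  return elapsedSeconds <= GROWING_WINDOW_SECONDS ? "growing" : "sliding";
}

// Example: a 2 s chunk processed in 0.5 s, 20 s into the recording.
const rtf = realTimeFactor(0.5, 2.0);
console.log(rtf < 1.0 ? "faster than real-time" : "not faster than real-time");
console.log(windowState(20));
```

An RTF that stays below 1.0 means each chunk finishes processing before the next one of equal length has been recorded, so the transcription can keep up with speech.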

Browser Requirements

✅ Full Support (WebGPU)

  • Chrome 113+
  • Edge 113+

⚠️ CPU Fallback

  • Firefox (no WebGPU yet)
  • Safari (limited support)

Check your browser: https://caniuse.com/webgpu
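
To check WebGPU availability from the browser console instead, a minimal sketch follows. It uses the standard `navigator.gpu.requestAdapter()` call; the function names and the `"webgpu"` / `"cpu"` labels are assumptions for illustration and may not match how the demo selects its backend.

```javascript
// Hypothetical helpers for illustration; the demo may detect WebGPU differently.

// Synchronous check: does this navigator-like object expose the WebGPU entry point?
function hasWebGPUApi(nav) {
  return Boolean(nav && nav.gpu);
}

// Asynchronous check: the API can exist and still fail to provide an adapter
// (for example on unsupported hardware), so fall back to CPU in that case.
async function pickBackend(nav = globalThis.navigator) {
  if (!hasWebGPUApi(nav)) return "cpu";
  try {
    const adapter = await nav.gpu.requestAdapter();
    return adapter ? "webgpu" : "cpu";
  } catch {
    return "cpu";
  }
}
```

In a browser console, `pickBackend().then(console.log)` prints which of the two labels applies.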

Troubleshooting

Model won't load

  • Check internet connection (2GB download)
  • Try refreshing the page
  • Check browser console for errors

No microphone access

  • Grant microphone permissions when prompted
  • Check browser settings (Settings → Privacy → Microphone)

Slow performance

  • Use Chrome or Edge with WebGPU support
  • Close other tabs to free memory
  • Check performance metrics - RTF should be <1.0

"Failed to start recording"

  • Ensure microphone is connected
  • Try using headphones with built-in mic
  • Check if another app is using the microphone
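
The microphone issues above usually surface as a rejected `navigator.mediaDevices.getUserMedia()` call. The sketch below maps the standard error names to the hints in this section; `describeMicError` and `startMicrophone` are hypothetical names, and the demo's own handling in src/utils/audio.js may differ.

```javascript
// Hypothetical helpers for illustration; not the demo's actual error handling.

// Translate a getUserMedia rejection into a troubleshooting hint.
function describeMicError(err) {
  switch (err && err.name) {
    case "NotAllowedError":
      return "Microphone permission was denied. Allow access in the browser settings.";
    case "NotFoundError":
      return "No microphone was found. Ensure one is connected.";
    case "NotReadableError":
      return "The microphone could not be read. Another app may be using it.";
    default:
      return "Failed to start recording. Check the browser console for details.";
  }
}

// Request an audio stream; resolves to a MediaStream, or null on failure.
async function startMicrophone() {
  try {
    return await navigator.mediaDevices.getUserMedia({ audio: true });
  } catch (err) {
    console.error(describeMicError(err));
    return null;
  }
}
```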

Building for Production

npm run build
npm run preview

The build output is written to the dist/ folder.

Next Steps

Key Files

  • src/App.jsx - Main application component
  • src/worker.js - Web Worker for model inference
  • src/utils/progressive-streaming.js - Smart streaming algorithm (ported from Python)
  • src/utils/audio.js - Microphone capture and audio processing
  • src/components/TranscriptionDisplay.jsx - Live transcription UI
  • src/components/PerformanceMetrics.jsx - Developer metrics dashboard

Enjoy the demo! 🎤