# Quick Start Guide

## Running the Demo Locally

1. **Install dependencies** (skip if already installed):
   ```bash
   cd parakeet-web-demo
   npm install
   ```

2. **Start development server**:
   ```bash
   npm run dev
   ```

3. **Open browser**:
   - Navigate to: http://localhost:3000
   - Use a WebGPU-compatible browser (Chrome 113+ or Edge 113+)

4. **Use the demo**:
   - Click "Load Model" (downloads a ~2 GB ONNX model on first use only)
   - Wait for the model to load (30 s to 2 min, depending on your connection)
   - Click "Start Recording" and grant microphone permissions
   - Speak and watch real-time progressive transcriptions!
   - Click "Stop Recording" when done

## What You'll See

### Color-Coded Transcription
- **Yellow text**: Fixed sentences (completed, locked, won't change)
- **Cyan text**: Active transcription (in-progress, updating in real-time)

### Performance Metrics
- **Latency**: Time taken to process one audio chunk
- **RTF (Real-time Factor)**: Processing time divided by audio duration
  - <1.0 = faster than real-time ✓
  - >1.0 = slower than real-time ⚠️
- **Window State**:
  - "growing" (0-15s): Accumulating audio for accuracy
  - "sliding" (>15s): Smart sentence-aware windowing
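
The two metrics above can be sketched in a few lines. This is a minimal illustration, not the demo's actual code: the names `realTimeFactor`, `windowState`, and `WINDOW_LIMIT_SECONDS` are hypothetical, and the sketch assumes RTF is processing time divided by audio duration and that exactly 15 s still counts as "growing".

```javascript
// Hypothetical constant: the growing/sliding boundary described above.
const WINDOW_LIMIT_SECONDS = 15;

// RTF = time spent processing / duration of the audio processed.
// Values below 1.0 mean the model keeps up with real time.
function realTimeFactor(processingSeconds, audioSeconds) {
  if (audioSeconds <= 0) {
    throw new RangeError("audioSeconds must be positive");
  }
  return processingSeconds / audioSeconds;
}

// Assumption: a buffer of exactly 15 s is still treated as "growing".
function windowState(bufferedSeconds) {
  return bufferedSeconds <= WINDOW_LIMIT_SECONDS ? "growing" : "sliding";
}

console.log(realTimeFactor(0.5, 2)); // 0.25, i.e. four times faster than real time
console.log(windowState(20)); // "sliding"
```

For example, a 2 s chunk processed in 0.5 s gives an RTF of 0.25, comfortably inside the faster-than-real-time range.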

## Browser Requirements

### ✅ Full Support (WebGPU)
- Chrome 113+
- Edge 113+

### ⚠️ CPU Fallback
- Firefox (no WebGPU yet)
- Safari (limited support)

Check your browser: https://caniuse.com/webgpu
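
To check support from code rather than from the compatibility table, a page can probe `navigator.gpu` and request an adapter. The sketch below is an assumption about how such a check might look, not the demo's implementation; `pickBackend` and the `"webgpu"` / `"cpu"` labels are hypothetical names.

```javascript
// Sketch: choose between WebGPU and a CPU fallback.
// `pickBackend` and its return labels are hypothetical, not the demo's API.
async function pickBackend(nav = globalThis.navigator) {
  // navigator.gpu is only defined in browsers that expose WebGPU.
  if (!nav || !nav.gpu) {
    return "cpu";
  }
  try {
    // requestAdapter() resolves to null when no suitable GPU is available.
    const adapter = await nav.gpu.requestAdapter();
    return adapter ? "webgpu" : "cpu";
  } catch {
    // Treat any failure while probing as "no WebGPU".
    return "cpu";
  }
}

// Usage: log which backend this environment would get.
pickBackend().then((backend) => console.log(`Backend: ${backend}`));
```

Passing the navigator object as a parameter keeps the function testable outside a browser.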

## Troubleshooting

### Model won't load
- Check your internet connection (the model is a ~2 GB download)
- Try refreshing the page
- Check browser console for errors

### No microphone access
- Grant microphone permissions when prompted
- Check browser settings (Settings → Privacy → Microphone)

### Slow performance
- Use Chrome or Edge with WebGPU support
- Close other tabs to free memory
- Check the performance metrics: RTF should stay below 1.0

### "Failed to start recording"
- Ensure microphone is connected
- Try using headphones with a built-in mic
- Check if another app is using the microphone
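
The microphone problems above usually surface as a rejected `navigator.mediaDevices.getUserMedia()` promise, and the error's `name` indicates the cause. The helper below is a hedged sketch for mapping those names to the hints in this section; `micErrorHint` is a hypothetical function, not part of the demo.

```javascript
// Hypothetical helper: map a getUserMedia() error to a troubleshooting hint.
function micErrorHint(error) {
  switch (error && error.name) {
    case "NotAllowedError": // permission denied by the user or the browser
      return "Grant microphone permission in the browser settings.";
    case "NotFoundError": // no audio input device present
      return "Connect a microphone and try again.";
    case "NotReadableError": // device exists but cannot be read
      return "Close other apps that may be using the microphone.";
    default:
      return "Check the browser console for details.";
  }
}

// Browser usage (commented out so the sketch also runs outside a browser):
// navigator.mediaDevices
//   .getUserMedia({ audio: true })
//   .catch((err) => console.warn(micErrorHint(err)));
```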

## Building for Production

```bash
npm run build
npm run preview
```

The build output is written to the `dist/` folder.

## Next Steps

- Read the full [README.md](README.md) for technical details
- Check the implementation plan: [../../../.claude/plans/validated-hugging-book.md](../../../.claude/plans/validated-hugging-book.md)
- Compare with Python implementation: [../STT/smart_progressive_streaming.py](../STT/smart_progressive_streaming.py)

## Key Files

- `src/App.jsx` - Main application component
- `src/worker.js` - Web Worker for model inference
- `src/utils/progressive-streaming.js` - Smart streaming algorithm (ported from Python)
- `src/utils/audio.js` - Microphone capture and audio processing
- `src/components/TranscriptionDisplay.jsx` - Live transcription UI
- `src/components/PerformanceMetrics.jsx` - Developer metrics dashboard

Enjoy the demo! 🎤