# STT WebSocket Service v1.0.0

A standalone, WebSocket-only speech-to-text (STT) service for VoiceCal integration.

## Features

- ✅ WebSocket-only STT interface (`/ws/stt`)
- ✅ ZeroGPU Whisper integration
- ✅ FastAPI-based architecture
- ✅ No Gradio dependencies
- ✅ No MCP dependencies
- ✅ Standalone deployment ready
- ✅ Real-time audio transcription
- ✅ Base64 audio transmission
- ✅ Multiple Whisper model sizes

## Quick Start

### Using the WebSocket Server

```bash
# Install dependencies
pip install -r requirements-websocket.txt

# Run standalone WebSocket server
python3 websocket_stt_server.py
```

### Docker Deployment

```bash
# Build WebSocket-only image
docker build -f Dockerfile-websocket -t stt-websocket-service .

# Run container
docker run -p 7860:7860 stt-websocket-service
```

## API Endpoints

### WebSocket: `/ws/stt`

**Connection Confirmation:**
```json
{
  "type": "stt_connection_confirmed",
  "client_id": "uuid",
  "service": "STT WebSocket Service",
  "version": "1.0.0",
  "model": "whisper-base",
  "device": "cuda",
  "message": "STT WebSocket connected and ready"
}
```

**Send Audio for Transcription:**
```json
{
  "type": "stt_audio_chunk",
  "audio_data": "base64_encoded_webm_audio",
  "language": "auto",
  "model_size": "base"
}
```
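
As a rough illustration of how a client might assemble this message, the sketch below base64-encodes raw audio bytes and wraps them in the JSON shape shown above. The function name `build_audio_chunk_message` is hypothetical and not part of the service; only the field names come from this README.

```python
import base64
import json


def build_audio_chunk_message(
    audio_bytes: bytes,
    language: str = "auto",
    model_size: str = "base",
) -> str:
    """Serialize raw audio bytes into an stt_audio_chunk JSON message.

    Hypothetical helper: the field names mirror the documented message,
    but the service itself does not ship this function.
    """
    return json.dumps({
        "type": "stt_audio_chunk",
        # JSON cannot carry raw bytes, so the audio travels as base64 text.
        "audio_data": base64.b64encode(audio_bytes).decode("ascii"),
        "language": language,
        "model_size": model_size,
    })
```

Calling `build_audio_chunk_message(webm_bytes)` with the bytes of a WebM recording returns a string that can be sent over the socket as-is.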

**Transcription Result:**
```json
{
  "type": "stt_transcription_complete",
  "client_id": "uuid",
  "transcription": "Hello world",
  "timing": {
    "processing_time": 1.23,
    "model_size": "base",
    "device": "cuda"
  },
  "status": "success"
}
```
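
A client receives both the connection confirmation and transcription results on the same socket, so it needs to branch on `type`. The sketch below shows one possible way to pull the transcript out of an incoming message. `extract_transcription` is a hypothetical helper, and because this README does not document an error message shape, anything other than a successful result is simply treated as "no transcript".

```python
import json
from typing import Optional


def extract_transcription(raw_message: str) -> Optional[str]:
    """Return the transcript from a successful result message, else None.

    Hypothetical helper; only the documented fields are inspected.
    """
    message = json.loads(raw_message)
    if message.get("type") != "stt_transcription_complete":
        # For example, the stt_connection_confirmed greeting.
        return None
    if message.get("status") != "success":
        # Assumption: any status other than "success" carries no usable text.
        return None
    return message.get("transcription")
```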

### HTTP: `/health`

```json
{
  "service": "STT WebSocket Service",
  "version": "1.0.0",
  "status": "healthy",
  "model_loaded": true,
  "active_connections": 2,
  "device": "cuda"
}
```
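
For deployment scripts or readiness probes, the health payload can be polled with the Python standard library alone. The following is a minimal sketch: `fetch_health` and `is_ready` are hypothetical names, and treating the service as ready only when `status` is `"healthy"` and `model_loaded` is `true` is an assumption about how the fields should be interpreted.

```python
import json
import urllib.request


def is_ready(health: dict) -> bool:
    """Interpret a /health payload (assumed readiness rule, see lead-in)."""
    return health.get("status") == "healthy" and health.get("model_loaded") is True


def fetch_health(base_url: str = "http://localhost:7860", timeout: float = 5.0) -> dict:
    """GET {base_url}/health and decode the JSON body."""
    with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as response:
        return json.load(response)
```

With the server running locally, `is_ready(fetch_health())` reports whether the service can accept audio.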

## Port Configuration

- **Default Port**: `7860`
- **WebSocket Endpoint**: `ws://localhost:7860/ws/stt`
- **Health Check**: `http://localhost:7860/health`

## Architecture

This service drops components it does not need and replaces them with a leaner stack:
- **Removed**: Gradio web interface
- **Removed**: MCP support
- **Removed**: Complex routing
- **Added**: Direct FastAPI WebSocket endpoints
- **Added**: Simplified audio processing
- **Added**: ZeroGPU-optimized transcription

## Integration

Connect from the VoiceCal WebRTC interface:

```javascript
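const ws = new WebSocket('ws://localhost:7860/ws/stt');

// Wait for the connection to open; calling send() earlier throws InvalidStateError
ws.addEventListener('open', () => {
  ws.send(JSON.stringify({
    type: "stt_audio_chunk",
    audio_data: base64AudioData, // base64-encoded WebM audio
    language: "auto",
    model_size: "base"
  }));
});

// Log the transcript when the result arrives
ws.addEventListener('message', (event) => {
  const msg = JSON.parse(event.data);
  if (msg.type === "stt_transcription_complete") console.log(msg.transcription);
});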
```
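
For testing the endpoint outside a browser, a Python client can follow the same flow. The sketch below is one possible approach, not part of this repository: it assumes the third-party `websockets` package is installed (`pip install websockets`), and `transcribe_file` along with its parameters is a hypothetical name. It skips any message that is not a transcription result, so it does not depend on when the connection confirmation arrives.

```python
import asyncio
import base64
import json


async def transcribe_file(
    path: str,
    url: str = "ws://localhost:7860/ws/stt",
    timeout: float = 60.0,
) -> dict:
    """Send one audio file and return the stt_transcription_complete message."""
    # Assumption: the third-party `websockets` package is available.
    import websockets

    with open(path, "rb") as audio_file:
        audio_b64 = base64.b64encode(audio_file.read()).decode("ascii")

    async with websockets.connect(url) as ws:
        await ws.send(json.dumps({
            "type": "stt_audio_chunk",
            "audio_data": audio_b64,
            "language": "auto",
            "model_size": "base",
        }))
        while True:
            # Give up if no message arrives within `timeout` seconds.
            raw = await asyncio.wait_for(ws.recv(), timeout=timeout)
            message = json.loads(raw)
            if message.get("type") == "stt_transcription_complete":
                return message
```

A script could run it with `asyncio.run(transcribe_file("sample.webm"))`, where `sample.webm` stands in for any local WebM recording.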