# Voice-to-Voice Translator API Documentation
## 📋 Table of Contents
- [Overview](#overview)
- [Base URL](#base-url)
- [REST API Endpoints](#rest-api-endpoints)
- [Authentication](#authentication)
- [WebSocket Connection](#websocket-connection)
- [Message Protocol](#message-protocol)
- [Message Types](#message-types)
- [Error Handling](#error-handling)
- [Rate Limits](#rate-limits)
- [Code Examples](#code-examples)
---
## 🎯 Overview
The Voice-to-Voice Translator API provides real-time audio translation capabilities through WebSocket connections. Users can join translation rooms and receive live translations of audio streams.
**Key Features:**
- Real-time bidirectional audio translation
- Multi-room support
- Multiple language pairs
- Low-latency streaming
- JWT authentication (optional)
- Rate limiting and connection management
---
## 🌐 Base URL
### Development
```
ws://localhost:8000/ws
```
### Production
```
wss://your-domain.com/ws
```
---
## REST API Endpoints
The API provides several REST endpoints for management and information retrieval.
### Base URL for REST API
**Development:** `http://localhost:8000`
**Production:** `https://your-domain.com`
---
### 1. Health Check
Get server health status.
**Endpoint:** `GET /health`
**Authentication:** None required
**Response:**
```json
{
"status": "healthy",
"version": "1.0.0",
"uptime": 3600,
"connections": 15,
"rooms": 3,
"timestamp": "2025-12-17T10:30:00Z"
}
```
**Status Codes:**
- `200 OK` - Server is healthy
- `503 Service Unavailable` - Server is unhealthy
**cURL Example:**
```bash
curl http://localhost:8000/health
```
---
### 2. Create Authentication Token
Generate a JWT token for WebSocket authentication.
**Endpoint:** `POST /auth/token`
**Authentication:** API Key (optional)
**Headers:**
```
Content-Type: application/json
X-API-Key: your-api-key (optional)
```
**Request Body:**
```json
{
"user_id": "user123",
"name": "John Doe",
"metadata": {
"email": "john@example.com"
}
}
```
**Parameters:**
| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `user_id` | string | Yes | Unique user identifier |
| `name` | string | Yes | User display name |
| `metadata` | object | No | Additional user metadata |
**Response:**
```json
{
"access_token": "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9...",
"token_type": "bearer",
"expires_in": 3600,
"user_id": "user123"
}
```
**Status Codes:**
- `200 OK` - Token created successfully
- `400 Bad Request` - Invalid request body
- `401 Unauthorized` - Invalid API key
- `429 Too Many Requests` - Rate limit exceeded
**cURL Example:**
```bash
curl -X POST http://localhost:8000/auth/token \
-H "Content-Type: application/json" \
-H "X-API-Key: your-api-key" \
-d '{
"user_id": "user123",
"name": "John Doe"
}'
```
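The same request can be issued from Python. The following is a minimal sketch using only the standard library; the helper names `build_token_request` and `fetch_token` are illustrative and not part of the API, and the 10-second timeout is an arbitrary choice.

```python
import json
import urllib.request


def build_token_request(base_url, user_id, name, api_key=None, metadata=None):
    """Build (but do not send) the POST /auth/token request."""
    body = {"user_id": user_id, "name": name}
    if metadata is not None:
        body["metadata"] = metadata
    headers = {"Content-Type": "application/json"}
    if api_key:
        headers["X-API-Key"] = api_key  # optional, per the Headers section
    return urllib.request.Request(
        f"{base_url}/auth/token",
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="POST",
    )


def fetch_token(base_url, user_id, name, api_key=None):
    """Send the request and return the `access_token` string."""
    request = build_token_request(base_url, user_id, name, api_key)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)["access_token"]
```

Splitting request construction from sending keeps the body and headers easy to inspect without a running server.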
---
### 3. Verify Token
Verify a JWT token's validity.
**Endpoint:** `POST /auth/verify`
**Authentication:** None required
**Headers:**
```
Content-Type: application/json
```
**Request Body:**
```json
{
"token": "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9..."
}
```
**Response:**
```json
{
"valid": true,
"user_id": "user123",
"expires_at": "2025-12-17T11:30:00Z"
}
```
**Status Codes:**
- `200 OK` - Token is valid
- `401 Unauthorized` - Token is invalid or expired
**cURL Example:**
```bash
curl -X POST http://localhost:8000/auth/verify \
-H "Content-Type: application/json" \
-d '{
"token": "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9..."
}'
```
---
### 4. Get Supported Languages
Retrieve list of supported languages.
**Endpoint:** `GET /languages`
**Authentication:** None required
**Response:**
```json
{
"languages": [
{
"code": "en",
"name": "English",
"stt_available": true,
"translation_available": true,
"tts_available": true
},
{
"code": "es",
"name": "Spanish",
"stt_available": true,
"translation_available": true,
"tts_available": true
},
{
"code": "fr",
"name": "French",
"stt_available": true,
"translation_available": true,
"tts_available": true
}
],
"total": 9
}
```
**Status Codes:**
- `200 OK` - Languages retrieved successfully
**cURL Example:**
```bash
curl http://localhost:8000/languages
```
---
### 5. Get Available Translation Pairs
Get list of available language translation pairs.
**Endpoint:** `GET /languages/pairs`
**Authentication:** None required
**Query Parameters:**
| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `source` | string | No | Filter by source language |
| `target` | string | No | Filter by target language |
**Response:**
```json
{
"pairs": [
{
"source": "en",
"target": "es",
"available": true
},
{
"source": "en",
"target": "fr",
"available": true
},
{
"source": "es",
"target": "en",
"available": true
}
],
"total": 72
}
```
**Status Codes:**
- `200 OK` - Pairs retrieved successfully
**cURL Example:**
```bash
curl "http://localhost:8000/languages/pairs?source=en"
```
---
### 6. Create Room
Create a new translation room.
**Endpoint:** `POST /rooms`
**Authentication:** JWT Token or API Key
**Headers:**
```
Content-Type: application/json
Authorization: Bearer <token>
```
**Request Body:**
```json
{
"room_id": "meeting-room-123",
"name": "Team Meeting",
"max_users": 10,
"languages": ["en", "es", "fr"],
"settings": {
"auto_translate": true,
"record_session": false
}
}
```
**Parameters:**
| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `room_id` | string | No | Custom room ID (auto-generated if not provided) |
| `name` | string | Yes | Room display name |
| `max_users` | integer | No | Maximum users (default: 10) |
| `languages` | array | No | Allowed languages (all if not specified) |
| `settings` | object | No | Room configuration |
**Response:**
```json
{
"room_id": "meeting-room-123",
"name": "Team Meeting",
"created_at": "2025-12-17T10:30:00Z",
"max_users": 10,
"current_users": 0,
"websocket_url": "ws://localhost:8000/ws"
}
```
**Status Codes:**
- `201 Created` - Room created successfully
- `400 Bad Request` - Invalid request body
- `401 Unauthorized` - Authentication required
- `409 Conflict` - Room ID already exists
**cURL Example:**
```bash
curl -X POST http://localhost:8000/rooms \
-H "Content-Type: application/json" \
-H "Authorization: Bearer <token>" \
-d '{
"name": "Team Meeting",
"max_users": 10,
"languages": ["en", "es"]
}'
```
---
### 7. Get Room Information
Get details about a specific room.
**Endpoint:** `GET /rooms/{room_id}`
**Authentication:** JWT Token or API Key
**Path Parameters:**
| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `room_id` | string | Yes | Room identifier |
**Response:**
```json
{
"room_id": "meeting-room-123",
"name": "Team Meeting",
"created_at": "2025-12-17T10:30:00Z",
"max_users": 10,
"current_users": 3,
"users": [
{
"user_id": "user_abc123",
"name": "Alice",
"language": "en",
"connected_at": "2025-12-17T10:31:00Z"
},
{
"user_id": "user_def456",
"name": "Bob",
"language": "es",
"connected_at": "2025-12-17T10:32:00Z"
}
],
"active": true
}
```
**Status Codes:**
- `200 OK` - Room found
- `401 Unauthorized` - Authentication required
- `404 Not Found` - Room does not exist
**cURL Example:**
```bash
curl http://localhost:8000/rooms/meeting-room-123 \
-H "Authorization: Bearer <token>"
```
---
### 8. List All Rooms
Get list of all active rooms.
**Endpoint:** `GET /rooms`
**Authentication:** JWT Token or API Key
**Query Parameters:**
| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `page` | integer | No | Page number (default: 1) |
| `limit` | integer | No | Items per page (default: 20, max: 100) |
| `active` | boolean | No | Filter by active status |
**Response:**
```json
{
"rooms": [
{
"room_id": "meeting-room-123",
"name": "Team Meeting",
"current_users": 3,
"max_users": 10,
"active": true,
"created_at": "2025-12-17T10:30:00Z"
},
{
"room_id": "conference-456",
"name": "Conference Call",
"current_users": 5,
"max_users": 20,
"active": true,
"created_at": "2025-12-17T09:15:00Z"
}
],
"total": 15,
"page": 1,
"limit": 20,
"pages": 1
}
```
**Status Codes:**
- `200 OK` - Rooms retrieved successfully
- `401 Unauthorized` - Authentication required
**cURL Example:**
```bash
curl "http://localhost:8000/rooms?page=1&limit=20" \
-H "Authorization: Bearer <token>"
```
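A client that needs every room can follow the `page` and `pages` fields of the response. The sketch below assumes a caller-supplied `fetch_page(page, limit)` function (hypothetical, not part of the API) that performs the authenticated `GET /rooms` request and returns the decoded JSON body.

```python
def iter_rooms(fetch_page, limit=20):
    """Yield every room by walking the paginated GET /rooms response.

    `fetch_page(page, limit)` is a hypothetical caller-supplied function
    that returns the decoded JSON body for one page.
    """
    page = 1
    while True:
        body = fetch_page(page, limit)
        yield from body["rooms"]
        # `pages` is the total page count reported by the server
        if page >= body["pages"]:
            break
        page += 1
```

Because `iter_rooms` is a generator, pages are only requested as the caller consumes them.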
---
### 9. Delete Room
Delete a room and disconnect all users.
**Endpoint:** `DELETE /rooms/{room_id}`
**Authentication:** JWT Token or API Key (Admin)
**Path Parameters:**
| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `room_id` | string | Yes | Room identifier |
**Response:**
```json
{
"success": true,
"room_id": "meeting-room-123",
"message": "Room deleted successfully",
"disconnected_users": 3
}
```
**Status Codes:**
- `200 OK` - Room deleted successfully
- `401 Unauthorized` - Authentication required
- `403 Forbidden` - Insufficient permissions
- `404 Not Found` - Room does not exist
**cURL Example:**
```bash
curl -X DELETE http://localhost:8000/rooms/meeting-room-123 \
-H "Authorization: Bearer <token>"
```
---
### 10. Get Server Statistics
Get server statistics and metrics.
**Endpoint:** `GET /stats`
**Authentication:** JWT Token or API Key
**Response:**
```json
{
"server": {
"uptime": 86400,
"version": "1.0.0",
"environment": "production"
},
"connections": {
"total": 150,
"active": 142,
"idle": 8
},
"rooms": {
"total": 25,
"active": 20,
"empty": 5
},
"workers": {
"translation": {
"total": 4,
"busy": 2,
"queue_size": 5
},
"tts": {
"total": 2,
"busy": 1,
"queue_size": 3
}
},
"processing": {
"total_translations": 5420,
"total_audio_processed_mb": 2850,
"avg_latency_ms": 245
},
"timestamp": "2025-12-17T10:30:00Z"
}
```
**Status Codes:**
- `200 OK` - Statistics retrieved successfully
- `401 Unauthorized` - Authentication required
**cURL Example:**
```bash
curl http://localhost:8000/stats \
-H "Authorization: Bearer <token>"
```
---
### 11. Text-Only Translation
Translate text without audio processing.
**Endpoint:** `POST /translate`
**Authentication:** JWT Token or API Key
**Headers:**
```
Content-Type: application/json
Authorization: Bearer <token>
```
**Request Body:**
```json
{
"text": "Hello, how are you?",
"source_language": "en",
"target_language": "es"
}
```
**Parameters:**
| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `text` | string | Yes | Text to translate |
| `source_language` | string | Yes | Source language code |
| `target_language` | string | Yes | Target language code |
**Response:**
```json
{
"original_text": "Hello, how are you?",
"translated_text": "Hola, ¿cómo estás?",
"source_language": "en",
"target_language": "es",
"processing_time_ms": 45
}
```
**Status Codes:**
- `200 OK` - Translation successful
- `400 Bad Request` - Invalid request body
- `401 Unauthorized` - Authentication required
- `422 Unprocessable Entity` - Unsupported language pair
**cURL Example:**
```bash
curl -X POST http://localhost:8000/translate \
-H "Content-Type: application/json" \
-H "Authorization: Bearer <token>" \
-d '{
"text": "Hello, how are you?",
"source_language": "en",
"target_language": "es"
}'
```
---
### 12. Batch Translation
Translate multiple texts in one request.
**Endpoint:** `POST /translate/batch`
**Authentication:** JWT Token or API Key
**Headers:**
```
Content-Type: application/json
Authorization: Bearer <token>
```
**Request Body:**
```json
{
"texts": [
"Hello, how are you?",
"What time is it?",
"Thank you very much"
],
"source_language": "en",
"target_language": "es"
}
```
**Parameters:**
| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `texts` | array | Yes | Array of texts to translate (max 100) |
| `source_language` | string | Yes | Source language code |
| `target_language` | string | Yes | Target language code |
**Response:**
```json
{
"translations": [
{
"original": "Hello, how are you?",
"translated": "Hola, ¿cómo estás?",
"index": 0
},
{
"original": "What time is it?",
"translated": "¿Qué hora es?",
"index": 1
},
{
"original": "Thank you very much",
"translated": "Muchas gracias",
"index": 2
}
],
"total": 3,
"source_language": "en",
"target_language": "es",
"processing_time_ms": 120
}
```
**Status Codes:**
- `200 OK` - Translations successful
- `400 Bad Request` - Invalid request body or too many texts
- `401 Unauthorized` - Authentication required
- `422 Unprocessable Entity` - Unsupported language pair
**cURL Example:**
```bash
curl -X POST http://localhost:8000/translate/batch \
-H "Content-Type: application/json" \
-H "Authorization: Bearer <token>" \
-d '{
"texts": ["Hello", "Goodbye", "Thank you"],
"source_language": "en",
"target_language": "es"
}'
```
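Inputs longer than the 100-text limit have to be split across several requests. The sketch below shows one way to do that in Python; the function names are illustrative, and it assumes that `index` in each response refers to the position within that request's `texts` array, as in the example response above.

```python
MAX_BATCH = 100  # documented limit for the `texts` array


def batch_requests(texts, source_language, target_language, max_batch=MAX_BATCH):
    """Split `texts` into request bodies that respect the batch limit."""
    for start in range(0, len(texts), max_batch):
        yield {
            "texts": texts[start:start + max_batch],
            "source_language": source_language,
            "target_language": target_language,
        }


def merge_translations(responses):
    """Flatten batch responses into one list ordered like the input texts.

    `responses` must be in the same order as the requests were generated.
    """
    merged = []
    for response in responses:
        # Assumption: `index` is the position within that request's batch
        ordered = sorted(response["translations"], key=lambda t: t["index"])
        merged.extend(t["translated"] for t in ordered)
    return merged
```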
---
### 13. Download TTS Audio
Generate and download TTS audio for text.
**Endpoint:** `POST /tts/generate`
**Authentication:** JWT Token or API Key
**Headers:**
```
Content-Type: application/json
Authorization: Bearer <token>
```
**Request Body:**
```json
{
"text": "Hello, this is a test message",
"language": "en",
"format": "wav"
}
```
**Parameters:**
| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `text` | string | Yes | Text to synthesize |
| `language` | string | Yes | Language code |
| `format` | string | No | Audio format: "wav", "mp3" (default: "wav") |
**Response:**
- Content-Type: `audio/wav` or `audio/mpeg`
- Body: Binary audio data
**Status Codes:**
- `200 OK` - Audio generated successfully
- `400 Bad Request` - Invalid request body
- `401 Unauthorized` - Authentication required
- `422 Unprocessable Entity` - Unsupported language
**cURL Example:**
```bash
curl -X POST http://localhost:8000/tts/generate \
-H "Content-Type: application/json" \
-H "Authorization: Bearer <token>" \
-d '{
"text": "Hello world",
"language": "en"
}' \
--output output.wav
```
---
### 14. System Configuration
Get current system configuration (Admin only).
**Endpoint:** `GET /config`
**Authentication:** JWT Token (Admin)
**Response:**
```json
{
"audio": {
"sample_rate": 16000,
"channels": 1,
"chunk_size": 4096,
"format": "PCM16"
},
"limits": {
"max_connections": 100,
"max_connections_per_ip": 10,
"max_users_per_room": 10,
"max_message_size": 10485760
},
"rate_limits": {
"messages_per_second": 10,
"requests_per_minute": 100
},
"workers": {
"translation_workers": 4,
"tts_workers": 2
},
"features": {
"authentication_enabled": false,
"rate_limiting_enabled": true,
"metrics_enabled": true
}
}
```
**Status Codes:**
- `200 OK` - Configuration retrieved
- `401 Unauthorized` - Authentication required
- `403 Forbidden` - Admin access required
**cURL Example:**
```bash
curl http://localhost:8000/config \
-H "Authorization: Bearer <admin-token>"
```
---
### REST API Response Format
All REST API responses follow this format:
**Success Response:**
```json
{
// Response data
}
```
**Error Response:**
```json
{
"error": {
"code": "ERROR_CODE",
"message": "Human readable error message",
"details": {
// Additional error details
}
}
}
```
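On the client side, the error envelope can be turned into an exception. This is a minimal sketch; `ApiError` and `parse_response` are illustrative names, and the fallback values used when a field is missing are assumptions.

```python
import json


class ApiError(Exception):
    """Raised when a REST response carries the documented error envelope."""

    def __init__(self, status, code, message, details=None):
        super().__init__(f"{status} {code}: {message}")
        self.status = status
        self.code = code
        self.message = message
        self.details = details or {}


def parse_response(status, body_text):
    """Return the decoded body, or raise ApiError for an error status."""
    body = json.loads(body_text) if body_text else {}
    if status >= 400:
        error = body.get("error", {}) if isinstance(body, dict) else {}
        raise ApiError(
            status,
            error.get("code", "UNKNOWN"),  # fallback is an assumption
            error.get("message", "No error message provided"),
            error.get("details"),
        )
    return body
```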
---
## 🔐 Authentication
### Optional JWT Authentication
If authentication is enabled (`ENABLE_AUTH=true`), include the JWT token in the WebSocket connection URL:
```
ws://localhost:8000/ws?token=YOUR_JWT_TOKEN
```
### Obtaining a Token
**Endpoint:** `POST /auth/token`
**Request Body:**
```json
{
"user_id": "user123",
"name": "John Doe"
}
```
**Response:**
```json
{
"access_token": "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9...",
"token_type": "bearer",
"expires_in": 3600
}
```
### API Key Authentication
Alternatively, use an API key in the query parameter:
```
ws://localhost:8000/ws?api_key=YOUR_API_KEY
```
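Credentials should be URL-encoded when they are appended to the connection URL. A small sketch follows; `websocket_url` is an illustrative helper, and rejecting the case where both credentials are supplied is a choice made here, not documented server behavior.

```python
from urllib.parse import urlencode


def websocket_url(base_url, token=None, api_key=None):
    """Append one credential as a query parameter, as described above."""
    if token and api_key:
        # Assumption: the two methods are alternatives, so send only one
        raise ValueError("pass either token or api_key, not both")
    params = {}
    if token:
        params["token"] = token
    elif api_key:
        params["api_key"] = api_key
    return f"{base_url}?{urlencode(params)}" if params else base_url
```

Since the credential travels in the URL, it can end up in proxy and server logs; using `wss://` in production at least keeps it encrypted in transit.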
---
## 🔌 WebSocket Connection
### Connecting
**JavaScript Example:**
```javascript
const ws = new WebSocket('ws://localhost:8000/ws');
ws.onopen = () => {
console.log('Connected to translation server');
};
ws.onmessage = (event) => {
if (typeof event.data === 'string') {
// Text message (JSON)
const message = JSON.parse(event.data);
handleMessage(message);
} else {
// Binary message (audio data)
handleAudioData(event.data);
}
};
ws.onerror = (error) => {
console.error('WebSocket error:', error);
};
ws.onclose = () => {
console.log('Disconnected from server');
};
```
**Python Example:**
```python
import asyncio
import json

import websockets


async def connect():
    uri = "ws://localhost:8000/ws"
    async with websockets.connect(uri) as websocket:
        # Send message
        message = {
            "type": "join_room",
            "payload": {
                "room_id": "room123",
                "user_name": "Alice",
                "language": "en"
            }
        }
        await websocket.send(json.dumps(message))

        # Receive messages
        async for message in websocket:
            if isinstance(message, str):
                data = json.loads(message)
                print(f"Received: {data}")
            else:
                print(f"Received audio: {len(message)} bytes")


asyncio.run(connect())
```
### Connection Limits
- **Max connections per IP:** 10 (configurable)
- **Max concurrent connections:** 100 (configurable)
- **Connection timeout:** 300 seconds (idle)
---
## 📨 Message Protocol
### Message Structure
All text messages are JSON with the following structure:
```json
{
"type": "MESSAGE_TYPE",
"payload": {
// Type-specific data
},
"timestamp": "2025-12-17T10:30:00Z"
}
```
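A small pair of helpers can build and decode this envelope. This is a sketch under assumptions: the helper names are illustrative, and the client examples in this document omit `timestamp`, so whether the server requires it on client messages is not confirmed here.

```python
import json
from datetime import datetime, timezone


def make_message(msg_type, payload=None, now=None):
    """Serialize a message in the envelope format shown above."""
    now = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    return json.dumps({
        "type": msg_type,
        "payload": payload or {},
        # ISO 8601 in UTC with a trailing "Z", matching the examples
        "timestamp": now.strftime("%Y-%m-%dT%H:%M:%SZ"),
    })


def parse_message(raw):
    """Decode a text frame and return (type, payload)."""
    message = json.loads(raw)
    if "type" not in message:
        raise ValueError("message has no 'type' field")
    return message["type"], message.get("payload", {})
```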
### Message Flow
```
Client → Server: Text Messages (JSON)
Server → Client: Text Messages (JSON)
Client → Server: Binary Messages (Audio Data)
Server → Client: Binary Messages (Audio Data)
```
---
## 📝 Message Types
### 1. JOIN_ROOM
Join a translation room.
**Direction:** Client → Server
**Parameters:**
| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `room_id` | string | Yes | Room identifier |
| `user_name` | string | Yes | User display name |
| `language` | string | Yes | User's language code (e.g., "en", "es", "fr") |
**Example:**
```json
{
"type": "join_room",
"payload": {
"room_id": "room123",
"user_name": "Alice",
"language": "en"
}
}
```
**Response:**
```json
{
"type": "room_joined",
"payload": {
"room_id": "room123",
"user_id": "user_abc123",
"users": [
{
"user_id": "user_abc123",
"name": "Alice",
"language": "en"
},
{
"user_id": "user_def456",
"name": "Bob",
"language": "es"
}
]
},
"timestamp": "2025-12-17T10:30:00Z"
}
```
---
### 2. LEAVE_ROOM
Leave the current room.
**Direction:** Client → Server
**Parameters:**
| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `room_id` | string | Yes | Room identifier to leave |
**Example:**
```json
{
"type": "leave_room",
"payload": {
"room_id": "room123"
}
}
```
**Response:**
```json
{
"type": "room_left",
"payload": {
"room_id": "room123",
"user_id": "user_abc123"
},
"timestamp": "2025-12-17T10:35:00Z"
}
```
---
### 3. AUDIO_START
Notify that audio streaming will begin.
**Direction:** Client → Server
**Parameters:**
| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `room_id` | string | Yes | Room identifier |
| `audio_config` | object | No | Audio configuration |
| `audio_config.sample_rate` | integer | No | Sample rate in Hz (default: 16000) |
| `audio_config.channels` | integer | No | Number of channels (default: 1) |
| `audio_config.format` | string | No | Audio format (default: "PCM16") |
**Example:**
```json
{
"type": "audio_start",
"payload": {
"room_id": "room123",
"audio_config": {
"sample_rate": 16000,
"channels": 1,
"format": "PCM16"
}
}
}
```
**Response:**
```json
{
"type": "audio_started",
"payload": {
"room_id": "room123",
"user_id": "user_abc123",
"status": "ready"
},
"timestamp": "2025-12-17T10:31:00Z"
}
```
---
### 4. AUDIO_DATA (Binary)
Send audio data for translation.
**Direction:** Client → Server (Binary)
**Format:** Raw PCM16 audio bytes
**Requirements:**
- Format: PCM16 (16-bit signed integer)
- Sample Rate: 16000 Hz (configurable)
- Channels: 1 (mono)
- Chunk Size: 4096 bytes (recommended)
**JavaScript Example:**
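```javascript
// Capture audio from the microphone and stream it as raw PCM16.
// MediaRecorder emits compressed audio (e.g. WebM/Opus), so the raw
// samples are read through the Web Audio API instead.
navigator.mediaDevices.getUserMedia({ audio: true })
  .then(stream => {
    const audioContext = new AudioContext({ sampleRate: 16000 });
    const source = audioContext.createMediaStreamSource(stream);
    // 4096 samples per callback (8192 bytes once converted to PCM16)
    const processor = audioContext.createScriptProcessor(4096, 1, 1);
    processor.onaudioprocess = (e) => {
      // Convert Float32 samples in [-1, 1] to PCM16 and send
      const float32 = e.inputBuffer.getChannelData(0);
      const pcm16 = new Int16Array(float32.length);
      for (let i = 0; i < float32.length; i++) {
        const s = Math.max(-1, Math.min(1, float32[i]));
        pcm16[i] = s < 0 ? s * 0x8000 : s * 0x7FFF;
      }
      ws.send(pcm16.buffer);
    };
    source.connect(processor);
    processor.connect(audioContext.destination);
  });
```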
**Python Example:**
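```python
import asyncio

import pyaudio

# Audio configuration
CHUNK = 4096  # frames per read (8192 bytes of mono PCM16)
FORMAT = pyaudio.paInt16
CHANNELS = 1
RATE = 16000


async def stream_microphone(websocket):
    """Read microphone audio and send it over an open WebSocket connection."""
    audio = pyaudio.PyAudio()
    stream = audio.open(
        format=FORMAT,
        channels=CHANNELS,
        rate=RATE,
        input=True,
        frames_per_buffer=CHUNK
    )
    try:
        # Send audio chunks
        while True:
            # stream.read() blocks, so run it in a worker thread
            audio_data = await asyncio.to_thread(
                stream.read, CHUNK, exception_on_overflow=False
            )
            await websocket.send(audio_data)
    finally:
        stream.stop_stream()
        stream.close()
        audio.terminate()
```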
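When audio comes from a source other than a microphone, the samples have to be converted and chunked by hand. The sketch below uses only the standard library; the function names are illustrative, and little-endian byte order is an assumption, since this document does not state the expected endianness.

```python
import struct


def float_to_pcm16(samples):
    """Convert floats in [-1.0, 1.0] to 16-bit PCM bytes.

    Assumption: little-endian byte order ("<" in the struct format).
    """
    clipped = (max(-1.0, min(1.0, s)) for s in samples)
    ints = [int(s * 32767) for s in clipped]
    return struct.pack(f"<{len(ints)}h", *ints)


def chunk_bytes(data, chunk_size=4096):
    """Split PCM bytes into chunks of at most `chunk_size` bytes."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def chunk_duration_ms(chunk_size=4096, sample_rate=16000, channels=1):
    """Playback time of one chunk; PCM16 uses 2 bytes per sample."""
    return chunk_size * 1000 / (2 * channels * sample_rate)
```

With the recommended settings, a 4096-byte chunk holds 2048 samples, which is 128 ms of audio at 16000 Hz.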
---
### 5. AUDIO_STOP
Notify that audio streaming has stopped.
**Direction:** Client → Server
**Parameters:**
| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `room_id` | string | Yes | Room identifier |
**Example:**
```json
{
"type": "audio_stop",
"payload": {
"room_id": "room123"
}
}
```
**Response:**
```json
{
"type": "audio_stopped",
"payload": {
"room_id": "room123",
"user_id": "user_abc123"
},
"timestamp": "2025-12-17T10:32:00Z"
}
```
---
### 6. TRANSLATION_RESULT
Receive translated text.
**Direction:** Server → Client
**Parameters:**
| Parameter | Type | Description |
|-----------|------|-------------|
| `original_text` | string | Original recognized text |
| `translated_text` | string | Translated text |
| `source_language` | string | Source language code |
| `target_language` | string | Target language code |
| `source_user_id` | string | User who spoke |
**Example:**
```json
{
"type": "translation_result",
"payload": {
"original_text": "Hello, how are you?",
"translated_text": "Hola, ¿cómo estás?",
"source_language": "en",
"target_language": "es",
"source_user_id": "user_abc123"
},
"timestamp": "2025-12-17T10:31:15Z"
}
```
---
### 7. TRANSLATED_AUDIO (Binary)
Receive translated audio.
**Direction:** Server → Client (Binary)
**Format:** Raw PCM16 audio bytes ready for playback
**JavaScript Example:**
```javascript
ws.onmessage = (event) => {
if (event.data instanceof Blob) {
// Binary audio data
playAudio(event.data);
}
};
function playAudio(audioBlob) {
const audioContext = new AudioContext();
const reader = new FileReader();
reader.onload = (e) => {
audioContext.decodeAudioData(e.target.result, (buffer) => {
const source = audioContext.createBufferSource();
source.buffer = buffer;
source.connect(audioContext.destination);
source.start();
});
};
reader.readAsArrayBuffer(audioBlob);
}
```
---
### 8. USER_JOINED
Notification when a user joins the room.
**Direction:** Server → Client
**Parameters:**
| Parameter | Type | Description |
|-----------|------|-------------|
| `room_id` | string | Room identifier |
| `user_id` | string | New user's ID |
| `user_name` | string | New user's name |
| `language` | string | New user's language |
**Example:**
```json
{
"type": "user_joined",
"payload": {
"room_id": "room123",
"user_id": "user_def456",
"user_name": "Bob",
"language": "es"
},
"timestamp": "2025-12-17T10:30:30Z"
}
```
---
### 9. USER_LEFT
Notification when a user leaves the room.
**Direction:** Server → Client
**Parameters:**
| Parameter | Type | Description |
|-----------|------|-------------|
| `room_id` | string | Room identifier |
| `user_id` | string | User who left |
**Example:**
```json
{
"type": "user_left",
"payload": {
"room_id": "room123",
"user_id": "user_def456"
},
"timestamp": "2025-12-17T10:35:00Z"
}
```
---
### 10. PING / PONG
Heartbeat messages to keep connection alive.
**Direction:** Bidirectional
**PING (Server → Client):**
```json
{
"type": "ping",
"payload": {},
"timestamp": "2025-12-17T10:31:00Z"
}
```
**PONG (Client → Server):**
```json
{
"type": "pong",
"payload": {},
"timestamp": "2025-12-17T10:31:00Z"
}
```
**Configuration:**
- Ping interval: 30 seconds (default)
- Ping timeout: 10 seconds (default)
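
A client can use the same two values to notice a dead connection. The sketch below is one possible approach, not documented server behavior: it assumes the server pings at the configured interval and treats the connection as stale when no ping arrives within interval plus timeout. `HeartbeatMonitor` is an illustrative name.

```python
import time


class HeartbeatMonitor:
    """Track server pings and report when the connection looks dead."""

    def __init__(self, ping_interval=30.0, ping_timeout=10.0, clock=time.monotonic):
        # Assumption: a healthy server pings at least every `ping_interval`
        self.deadline = ping_interval + ping_timeout
        self.clock = clock
        self.last_ping = clock()

    def on_ping(self):
        """Call whenever a `ping` message arrives (and reply with `pong`)."""
        self.last_ping = self.clock()

    def is_stale(self):
        """True if no ping has arrived within interval + timeout."""
        return self.clock() - self.last_ping > self.deadline
```

The injectable `clock` makes the monitor easy to exercise in tests without waiting in real time.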
---
### 11. ERROR
Error message from server.
**Direction:** Server → Client
**Parameters:**
| Parameter | Type | Description |
|-----------|------|-------------|
| `error_code` | string | Error code identifier |
| `message` | string | Human-readable error message |
| `details` | object | Additional error details (optional) |
**Example:**
```json
{
"type": "error",
"payload": {
"error_code": "ROOM_FULL",
"message": "Room has reached maximum capacity",
"details": {
"room_id": "room123",
"max_users": 10,
"current_users": 10
}
},
"timestamp": "2025-12-17T10:30:00Z"
}
```
**Common Error Codes:**
- `AUTH_FAILED`: Authentication failed
- `ROOM_NOT_FOUND`: Room does not exist
- `ROOM_FULL`: Room at maximum capacity
- `INVALID_MESSAGE`: Malformed message
- `RATE_LIMIT_EXCEEDED`: Too many requests
- `UNSUPPORTED_LANGUAGE`: Language not supported
- `AUDIO_PROCESSING_ERROR`: Audio processing failed
---
## ⚠️ Error Handling
### Client-Side Error Handling
```javascript
ws.onerror = (error) => {
  // In browsers an error event is followed by a close event,
  // so reconnection is handled in onclose only.
  console.error('WebSocket error:', error);
};
ws.onclose = (event) => {
  if (event.code === 1008) {
    console.error('Connection closed: Rate limit exceeded');
  } else if (event.code === 1000) {
    console.log('Connection closed normally');
  } else {
    console.log('Connection closed unexpectedly:', event.code);
    // Attempt reconnection; reconnect() is application-defined
    setTimeout(() => reconnect(), 5000);
  }
};
```
### Close Codes
| Code | Description |
|------|-------------|
| 1000 | Normal closure |
| 1001 | Going away |
| 1008 | Policy violation (rate limit) |
| 1011 | Internal server error |
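
The close code can drive the reconnection decision. The sketch below swaps the fixed 5-second retry for exponential backoff with jitter; which codes to retry is a client-side policy assumed here, and the function names are illustrative.

```python
import random

# Assumption: retrying after a normal closure or a rate-limit
# policy violation is not useful, so these codes are excluded.
NO_RETRY_CODES = {1000, 1008}


def should_reconnect(close_code):
    """Retry after unexpected closures, but not after 1000 or 1008."""
    return close_code not in NO_RETRY_CODES


def backoff_delay(attempt, base=1.0, cap=60.0, jitter=random.random):
    """Exponential backoff with full jitter.

    Returns a delay between 0 and min(cap, base * 2**attempt) seconds.
    """
    return min(cap, base * (2 ** attempt)) * jitter()
```

Jitter spreads reconnection attempts out so that many clients dropped at the same moment do not all return at once.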
---
## 🚦 Rate Limits
### Connection Limits
| Limit Type | Default Value | Configurable |
|------------|---------------|--------------|
| Max connections per IP | 10 | Yes |
| Max total connections | 100 | Yes |
| Connection timeout | 300 seconds | Yes |
### Message Limits
| Limit Type | Default Value | Configurable |
|------------|---------------|--------------|
| Messages per second | 10 per connection | Yes |
| Requests per minute | 100 per user | Yes |
| Max message size (audio chunk) | 10 MB (10485760 bytes) | Yes |
### Rate Limit Headers
Rate limit information is included in error responses:
```json
{
"type": "error",
"payload": {
"error_code": "RATE_LIMIT_EXCEEDED",
"message": "Too many requests",
"details": {
"limit": 100,
"remaining": 0,
"reset_at": "2025-12-17T10:31:00Z"
}
}
}
```
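Clients can avoid hitting the per-connection limit by throttling themselves. The sketch below is a token bucket sized to the default of 10 messages per second. The server's own algorithm is not described in this document, and whether binary audio frames count toward the limit is not stated, so treat this as a conservative client-side guard.

```python
import time


class TokenBucket:
    """Client-side limiter: at most `rate` messages per second on average."""

    def __init__(self, rate=10.0, capacity=10.0, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity
        self.clock = clock
        self.updated = clock()

    def try_acquire(self):
        """Return True and consume a token if a message may be sent now."""
        now = self.clock()
        # Refill in proportion to the time elapsed since the last call
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

Call `try_acquire()` before each send and delay or drop the message when it returns `False`.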
---
## 💻 Code Examples
### Complete Client Example (JavaScript)
```javascript
class VoiceTranslatorClient {
  constructor(url, options = {}) {
    this.url = url;
    this.ws = null;
    this.roomId = null;
    this.userId = null;
    this.audioContext = null;
    this.mediaStream = null;
    this.shouldReconnect = true;
    this.options = {
      language: options.language || 'en',
      userName: options.userName || 'Anonymous',
      ...options
    };
  }

  connect() {
    return new Promise((resolve, reject) => {
      this.ws = new WebSocket(this.url);
      // Deliver binary frames as ArrayBuffer (the default is Blob)
      this.ws.binaryType = 'arraybuffer';
      this.ws.onopen = () => {
        console.log('Connected to translation server');
        resolve();
      };
      this.ws.onerror = (error) => {
        console.error('WebSocket error:', error);
        reject(error);
      };
      this.ws.onmessage = (event) => {
        this.handleMessage(event);
      };
      this.ws.onclose = () => {
        console.log('Disconnected from server');
        // Do not reconnect after a deliberate disconnect()
        if (this.shouldReconnect) {
          this.reconnect();
        }
      };
    });
  }

  handleMessage(event) {
    if (typeof event.data === 'string') {
      const message = JSON.parse(event.data);
      switch (message.type) {
        case 'room_joined':
          this.userId = message.payload.user_id;
          this.onRoomJoined(message.payload);
          break;
        case 'translation_result':
          this.onTranslation(message.payload);
          break;
        case 'user_joined':
          this.onUserJoined(message.payload);
          break;
        case 'user_left':
          this.onUserLeft(message.payload);
          break;
        case 'error':
          this.onError(message.payload);
          break;
        case 'ping':
          this.sendPong();
          break;
      }
    } else {
      // Binary audio data
      this.onAudioReceived(event.data);
    }
  }

  async joinRoom(roomId) {
    this.roomId = roomId;
    const message = {
      type: 'join_room',
      payload: {
        room_id: roomId,
        user_name: this.options.userName,
        language: this.options.language
      }
    };
    this.send(message);
  }

  async startAudio() {
    const message = {
      type: 'audio_start',
      payload: {
        room_id: this.roomId,
        audio_config: {
          sample_rate: 16000,
          channels: 1,
          format: 'PCM16'
        }
      }
    };
    this.send(message);
    // Start capturing audio
    const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
    this.startAudioCapture(stream);
  }

  startAudioCapture(stream) {
    this.mediaStream = stream;
    this.audioContext = new AudioContext({ sampleRate: 16000 });
    const source = this.audioContext.createMediaStreamSource(stream);
    // ScriptProcessorNode is deprecated but still widely supported
    const processor = this.audioContext.createScriptProcessor(4096, 1, 1);
    processor.onaudioprocess = (e) => {
      const inputData = e.inputBuffer.getChannelData(0);
      const pcm16 = this.convertToPCM16(inputData);
      if (this.ws && this.ws.readyState === WebSocket.OPEN) {
        this.ws.send(pcm16);
      }
    };
    source.connect(processor);
    processor.connect(this.audioContext.destination);
  }

  convertToPCM16(float32Array) {
    const int16Array = new Int16Array(float32Array.length);
    for (let i = 0; i < float32Array.length; i++) {
      const s = Math.max(-1, Math.min(1, float32Array[i]));
      int16Array[i] = s < 0 ? s * 0x8000 : s * 0x7FFF;
    }
    return int16Array.buffer;
  }

  stopAudio() {
    // Stop capturing before telling the server the stream has ended
    if (this.audioContext) {
      this.audioContext.close();
      this.audioContext = null;
    }
    if (this.mediaStream) {
      this.mediaStream.getTracks().forEach((track) => track.stop());
      this.mediaStream = null;
    }
    const message = {
      type: 'audio_stop',
      payload: {
        room_id: this.roomId
      }
    };
    this.send(message);
  }

  leaveRoom() {
    const message = {
      type: 'leave_room',
      payload: {
        room_id: this.roomId
      }
    };
    this.send(message);
  }

  send(message) {
    if (this.ws && this.ws.readyState === WebSocket.OPEN) {
      this.ws.send(JSON.stringify(message));
    }
  }

  sendPong() {
    this.send({ type: 'pong', payload: {} });
  }

  disconnect() {
    this.shouldReconnect = false;
    if (this.ws) {
      this.ws.close();
    }
  }

  reconnect() {
    setTimeout(() => {
      console.log('Attempting to reconnect...');
      // A failed attempt triggers onclose, which schedules the next one
      this.connect().catch(() => {});
    }, 5000);
  }

  // Event handlers (override these)
  onRoomJoined(data) {
    console.log('Joined room:', data);
  }

  onTranslation(data) {
    console.log('Translation:', data.translated_text);
  }

  onAudioReceived(audioData) {
    console.log('Received audio:', audioData.byteLength, 'bytes');
    // Play the audio
  }

  onUserJoined(data) {
    console.log('User joined:', data.user_name);
  }

  onUserLeft(data) {
    console.log('User left:', data.user_id);
  }

  onError(error) {
    console.error('Error:', error.message);
  }
}

// Usage (top-level await requires an ES module or an async function)
const client = new VoiceTranslatorClient('ws://localhost:8000/ws', {
  language: 'en',
  userName: 'Alice'
});
await client.connect();
await client.joinRoom('room123');
await client.startAudio();
```
---
### Complete Client Example (Python)
```python
import asyncio
import json

import pyaudio
import websockets


class VoiceTranslatorClient:
    def __init__(self, url, language='en', user_name='Anonymous'):
        self.url = url
        self.language = language
        self.user_name = user_name
        self.ws = None
        self.room_id = None
        self.user_id = None
        self.running = False
        # Keep task references so they are not garbage-collected
        self.handler_task = None
        self.capture_task = None

    async def connect(self):
        self.ws = await websockets.connect(self.url)
        print('Connected to translation server')
        # Start message handler
        self.handler_task = asyncio.create_task(self.message_handler())

    async def message_handler(self):
        async for message in self.ws:
            if isinstance(message, str):
                data = json.loads(message)
                await self.handle_message(data)
            else:
                await self.handle_audio(message)

    async def handle_message(self, message):
        msg_type = message.get('type')
        payload = message.get('payload', {})
        if msg_type == 'room_joined':
            self.user_id = payload.get('user_id')
            print(f"Joined room: {payload.get('room_id')}")
        elif msg_type == 'translation_result':
            print(f"Translation: {payload.get('translated_text')}")
        elif msg_type == 'user_joined':
            print(f"User joined: {payload.get('user_name')}")
        elif msg_type == 'user_left':
            print(f"User left: {payload.get('user_id')}")
        elif msg_type == 'error':
            print(f"Error: {payload.get('message')}")
        elif msg_type == 'ping':
            await self.send_pong()

    async def handle_audio(self, audio_data):
        print(f"Received audio: {len(audio_data)} bytes")
        # Play audio here

    async def join_room(self, room_id):
        self.room_id = room_id
        message = {
            'type': 'join_room',
            'payload': {
                'room_id': room_id,
                'user_name': self.user_name,
                'language': self.language
            }
        }
        await self.send(message)

    async def start_audio(self):
        message = {
            'type': 'audio_start',
            'payload': {
                'room_id': self.room_id,
                'audio_config': {
                    'sample_rate': 16000,
                    'channels': 1,
                    'format': 'PCM16'
                }
            }
        }
        await self.send(message)
        # Start audio capture
        self.capture_task = asyncio.create_task(self.capture_audio())

    async def capture_audio(self):
        CHUNK = 4096  # frames per read (8192 bytes of mono PCM16)
        FORMAT = pyaudio.paInt16
        CHANNELS = 1
        RATE = 16000
        audio = pyaudio.PyAudio()
        stream = audio.open(
            format=FORMAT,
            channels=CHANNELS,
            rate=RATE,
            input=True,
            frames_per_buffer=CHUNK
        )
        self.running = True
        try:
            while self.running:
                # stream.read() blocks, so run it in a worker thread
                audio_data = await asyncio.to_thread(
                    stream.read, CHUNK, exception_on_overflow=False
                )
                await self.ws.send(audio_data)
        finally:
            stream.stop_stream()
            stream.close()
            audio.terminate()

    async def stop_audio(self):
        self.running = False
        # Wait for the last chunk to be sent before announcing the stop
        if self.capture_task is not None:
            await self.capture_task
            self.capture_task = None
        message = {
            'type': 'audio_stop',
            'payload': {
                'room_id': self.room_id
            }
        }
        await self.send(message)

    async def leave_room(self):
        message = {
            'type': 'leave_room',
            'payload': {
                'room_id': self.room_id
            }
        }
        await self.send(message)

    async def send(self, message):
        await self.ws.send(json.dumps(message))

    async def send_pong(self):
        await self.send({'type': 'pong', 'payload': {}})

    async def disconnect(self):
        await self.ws.close()


# Usage
async def main():
    client = VoiceTranslatorClient(
        'ws://localhost:8000/ws',
        language='en',
        user_name='Alice'
    )
    await client.connect()
    await client.join_room('room123')
    await client.start_audio()
    # Keep running for 60 seconds
    await asyncio.sleep(60)
    await client.stop_audio()
    await client.leave_room()
    await client.disconnect()


asyncio.run(main())
```
---
## 🌍 Supported Languages
| Language Code | Language Name |
|---------------|---------------|
| `en` | English |
| `hi` | Hindi |
| `te` | Telugu |
| `ta` | Tamil |
| `kn` | Kannada |
| `ml` | Malayalam |
| `gu` | Gujarati |
| `mr` | Marathi |
| `bn` | Bengali |
| `es` | Spanish |
| `fr` | French |
| `de` | German |
| `it` | Italian |
| `pt` | Portuguese |
| `ru` | Russian |
| `zh` | Chinese |
| `ja` | Japanese |
**Primary Focus:** Indian languages (Hindi, Telugu, Tamil, Kannada, Malayalam, Gujarati, Marathi, Bengali)
**Note:** Language support depends on installed models. Check available languages with the `/languages` endpoint.
---
## 📊 Health Check
**Endpoint:** `GET /health`
**Response:**
```json
{
"status": "healthy",
"version": "1.0.0",
"uptime": 3600,
"connections": 15,
"rooms": 3
}
```
---
## 🔧 Configuration
Environment variables to customize API behavior:
```bash
# Server
HOST=0.0.0.0
PORT=8000
# Audio
AUDIO_SAMPLE_RATE=16000
AUDIO_CHANNELS=1
AUDIO_CHUNK_SIZE=4096
# Security
ENABLE_AUTH=false
JWT_SECRET_KEY=your-secret-key
API_KEYS=key1,key2,key3
# Rate Limiting
MAX_CONNECTIONS_PER_IP=10
MAX_MESSAGES_PER_SECOND=10
MAX_REQUESTS_PER_MINUTE=100
# Workers
TRANSLATION_WORKERS=4
TTS_WORKERS=2
# Models
VOSK_MODEL_PATH_EN=models/vosk-en
ARGOS_MODEL_PATH=models/argos
COQUI_MODEL_PATH=models/coqui
```
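For illustration, a subset of these variables could be read into a typed settings object as follows. This is a sketch, not the server's actual loader: the `Settings` class, the `load_settings` function, and the rule that only the string `true` (case-insensitive) enables a flag are all assumptions.

```python
import os
from dataclasses import dataclass, field


@dataclass
class Settings:
    host: str = "0.0.0.0"
    port: int = 8000
    enable_auth: bool = False
    api_keys: list = field(default_factory=list)
    max_connections_per_ip: int = 10
    max_messages_per_second: int = 10


def load_settings(env=os.environ):
    """Read a subset of the variables above, using the listed defaults."""
    raw_keys = env.get("API_KEYS", "")
    return Settings(
        host=env.get("HOST", "0.0.0.0"),
        port=int(env.get("PORT", "8000")),
        # Assumption: only "true" (any case) turns the flag on
        enable_auth=env.get("ENABLE_AUTH", "false").strip().lower() == "true",
        # API_KEYS is a comma-separated list; blank entries are dropped
        api_keys=[k.strip() for k in raw_keys.split(",") if k.strip()],
        max_connections_per_ip=int(env.get("MAX_CONNECTIONS_PER_IP", "10")),
        max_messages_per_second=int(env.get("MAX_MESSAGES_PER_SECOND", "10")),
    )
```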
---
## 🐛 Troubleshooting
### Connection Issues
**Problem:** Cannot connect to WebSocket
**Solutions:**
- Verify the server is running
- Check firewall settings
- Ensure correct URL (ws:// for HTTP, wss:// for HTTPS)
- Verify authentication token if required
### Audio Issues
**Problem:** No audio being received
**Solutions:**
- Check the audio format (must be PCM16, 16 kHz, mono)
- Verify microphone permissions
- Ensure audio chunks are the correct size (4096 bytes recommended)
- Check that rate limits have not been exceeded
### Translation Issues
**Problem:** Translations not working
**Solutions:**
- Verify language models are installed
- Check language codes are supported
- Ensure room has users with different languages
- Check server logs for errors
---
## 📞 Support
For issues and questions:
- GitHub Issues: [your-repo/issues]
- Email: support@your-domain.com
- Documentation: [your-docs-url]
---
## 📄 License
This API documentation is part of the Voice-to-Voice Translator project.
**Version:** 1.0.0
**Last Updated:** December 17, 2025