Voice-to-Voice Translator API Documentation
Table of Contents
- Overview
- Base URL
- REST API Endpoints
- Authentication
- WebSocket Connection
- Message Protocol
- Message Types
- Error Handling
- Rate Limits
- Code Examples
Overview
The Voice-to-Voice Translator API provides real-time audio translation capabilities through WebSocket connections. Users can join translation rooms and receive live translations of audio streams.
Key Features:
- Real-time bidirectional audio translation
- Multi-room support
- Multiple language pairs
- Low-latency streaming
- JWT authentication (optional)
- Rate limiting and connection management
Base URL
Development
ws://localhost:8000/ws
Production
wss://your-domain.com/ws
REST API Endpoints
The API provides several REST endpoints for management and information retrieval.
Base URL for REST API
Development: http://localhost:8000
Production: https://your-domain.com
1. Health Check
Get server health status.
Endpoint: GET /health
Authentication: None required
Response:
{
"status": "healthy",
"version": "1.0.0",
"uptime": 3600,
"connections": 15,
"rooms": 3,
"timestamp": "2025-12-17T10:30:00Z"
}
Status Codes:
- 200 OK - Server is healthy
- 503 Service Unavailable - Server is unhealthy
cURL Example:
curl http://localhost:8000/health
2. Create Authentication Token
Generate a JWT token for WebSocket authentication.
Endpoint: POST /auth/token
Authentication: API Key (optional)
Headers:
Content-Type: application/json
X-API-Key: your-api-key (optional)
Request Body:
{
"user_id": "user123",
"name": "John Doe",
"metadata": {
"email": "john@example.com"
}
}
Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| user_id | string | Yes | Unique user identifier |
| name | string | Yes | User display name |
| metadata | object | No | Additional user metadata |
Response:
{
"access_token": "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9...",
"token_type": "bearer",
"expires_in": 3600,
"user_id": "user123"
}
Status Codes:
- 200 OK - Token created successfully
- 400 Bad Request - Invalid request body
- 401 Unauthorized - Invalid API key
- 429 Too Many Requests - Rate limit exceeded
cURL Example:
curl -X POST http://localhost:8000/auth/token \
-H "Content-Type: application/json" \
-H "X-API-Key: your-api-key" \
-d '{
"user_id": "user123",
"name": "John Doe"
}'
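The same request can be prepared from Python. The sketch below only builds the request with the standard library and does not send it; `build_token_request` is a hypothetical helper name, and the commented-out send assumes a server is actually listening at the base URL.

```python
import json
import urllib.request


def build_token_request(base_url, user_id, name, metadata=None, api_key=None):
    """Build (but do not send) a POST /auth/token request."""
    body = {"user_id": user_id, "name": name}
    if metadata is not None:
        body["metadata"] = metadata  # optional per the parameter table
    headers = {"Content-Type": "application/json"}
    if api_key is not None:
        headers["X-API-Key"] = api_key  # optional header
    return urllib.request.Request(
        f"{base_url}/auth/token",
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="POST",
    )


req = build_token_request("http://localhost:8000", "user123", "John Doe")
print(req.get_method(), req.full_url)

# To send it against a running server (assumption: server is reachable):
# with urllib.request.urlopen(req) as resp:
#     token = json.load(resp)["access_token"]
```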
3. Verify Token
Verify a JWT token's validity.
Endpoint: POST /auth/verify
Authentication: None required
Headers:
Content-Type: application/json
Request Body:
{
"token": "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9..."
}
Response:
{
"valid": true,
"user_id": "user123",
"expires_at": "2025-12-17T11:30:00Z"
}
Status Codes:
- 200 OK - Token is valid
- 401 Unauthorized - Token is invalid or expired
cURL Example:
curl -X POST http://localhost:8000/auth/verify \
-H "Content-Type: application/json" \
-d '{
"token": "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9..."
}'
4. Get Supported Languages
Retrieve list of supported languages.
Endpoint: GET /languages
Authentication: None required
Response:
{
"languages": [
{
"code": "en",
"name": "English",
"stt_available": true,
"translation_available": true,
"tts_available": true
},
{
"code": "es",
"name": "Spanish",
"stt_available": true,
"translation_available": true,
"tts_available": true
},
{
"code": "fr",
"name": "French",
"stt_available": true,
"translation_available": true,
"tts_available": true
}
],
"total": 9
}
Status Codes:
- 200 OK - Languages retrieved successfully
cURL Example:
curl http://localhost:8000/languages
5. Get Available Translation Pairs
Get list of available language translation pairs.
Endpoint: GET /languages/pairs
Authentication: None required
Query Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| source | string | No | Filter by source language |
| target | string | No | Filter by target language |
Response:
{
"pairs": [
{
"source": "en",
"target": "es",
"available": true
},
{
"source": "en",
"target": "fr",
"available": true
},
{
"source": "es",
"target": "en",
"available": true
}
],
"total": 72
}
Status Codes:
- 200 OK - Pairs retrieved successfully
cURL Example:
curl "http://localhost:8000/languages/pairs?source=en"
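Since both filters are optional, the query string has to be assembled conditionally. A minimal sketch, assuming a hypothetical helper named `pairs_url`:

```python
from urllib.parse import urlencode


def pairs_url(base_url, source=None, target=None):
    """Return the /languages/pairs URL, adding only the filters that are set."""
    params = {}
    if source:
        params["source"] = source
    if target:
        params["target"] = target
    query = f"?{urlencode(params)}" if params else ""
    return f"{base_url}/languages/pairs{query}"


print(pairs_url("http://localhost:8000", source="en"))
print(pairs_url("http://localhost:8000", source="en", target="es"))
```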
6. Create Room
Create a new translation room.
Endpoint: POST /rooms
Authentication: JWT Token or API Key
Headers:
Content-Type: application/json
Authorization: Bearer <token>
Request Body:
{
"room_id": "meeting-room-123",
"name": "Team Meeting",
"max_users": 10,
"languages": ["en", "es", "fr"],
"settings": {
"auto_translate": true,
"record_session": false
}
}
Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| room_id | string | No | Custom room ID (auto-generated if not provided) |
| name | string | Yes | Room display name |
| max_users | integer | No | Maximum users (default: 10) |
| languages | array | No | Allowed languages (all if not specified) |
| settings | object | No | Room configuration |
Response:
{
"room_id": "meeting-room-123",
"name": "Team Meeting",
"created_at": "2025-12-17T10:30:00Z",
"max_users": 10,
"current_users": 0,
"websocket_url": "ws://localhost:8000/ws"
}
Status Codes:
- 201 Created - Room created successfully
- 400 Bad Request - Invalid request body
- 401 Unauthorized - Authentication required
- 409 Conflict - Room ID already exists
cURL Example:
curl -X POST http://localhost:8000/rooms \
-H "Content-Type: application/json" \
-H "Authorization: Bearer <token>" \
-d '{
"name": "Team Meeting",
"max_users": 10,
"languages": ["en", "es"]
}'
7. Get Room Information
Get details about a specific room.
Endpoint: GET /rooms/{room_id}
Authentication: JWT Token or API Key
Path Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| room_id | string | Yes | Room identifier |
Response:
{
"room_id": "meeting-room-123",
"name": "Team Meeting",
"created_at": "2025-12-17T10:30:00Z",
"max_users": 10,
"current_users": 3,
"users": [
{
"user_id": "user_abc123",
"name": "Alice",
"language": "en",
"connected_at": "2025-12-17T10:31:00Z"
},
{
"user_id": "user_def456",
"name": "Bob",
"language": "es",
"connected_at": "2025-12-17T10:32:00Z"
}
],
"active": true
}
Status Codes:
- 200 OK - Room found
- 401 Unauthorized - Authentication required
- 404 Not Found - Room does not exist
cURL Example:
curl http://localhost:8000/rooms/meeting-room-123 \
-H "Authorization: Bearer <token>"
8. List All Rooms
Get list of all active rooms.
Endpoint: GET /rooms
Authentication: JWT Token or API Key
Query Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| page | integer | No | Page number (default: 1) |
| limit | integer | No | Items per page (default: 20, max: 100) |
| active | boolean | No | Filter by active status |
Response:
{
"rooms": [
{
"room_id": "meeting-room-123",
"name": "Team Meeting",
"current_users": 3,
"max_users": 10,
"active": true,
"created_at": "2025-12-17T10:30:00Z"
},
{
"room_id": "conference-456",
"name": "Conference Call",
"current_users": 5,
"max_users": 20,
"active": true,
"created_at": "2025-12-17T09:15:00Z"
}
],
"total": 15,
"page": 1,
"limit": 20,
"pages": 1
}
Status Codes:
- 200 OK - Rooms retrieved successfully
- 401 Unauthorized - Authentication required
cURL Example:
curl "http://localhost:8000/rooms?page=1&limit=20" \
-H "Authorization: Bearer <token>"
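Because the response reports `page` and `pages`, a client can walk the listing until the last page. The sketch below is one way to do that; `fetch_page` is a hypothetical caller-supplied function that performs the `GET /rooms` call, and a stand-in is used here so the example runs without a server.

```python
def iter_rooms(fetch_page, limit=20):
    """Yield every room by walking pages until the reported page count is reached.

    fetch_page(page, limit) is a caller-supplied (hypothetical) function that
    performs GET /rooms and returns the decoded JSON body.
    """
    page = 1
    while True:
        body = fetch_page(page, limit)
        yield from body["rooms"]
        if page >= body["pages"]:
            break
        page += 1


def fake_fetch(page, limit):
    """Stand-in for the HTTP call: five rooms served `limit` per page."""
    rooms = [{"room_id": f"room-{i}"} for i in range(5)]
    start = (page - 1) * limit
    pages = -(-len(rooms) // limit)  # ceiling division
    return {"rooms": rooms[start:start + limit], "total": len(rooms),
            "page": page, "limit": limit, "pages": pages}


print([room["room_id"] for room in iter_rooms(fake_fetch, limit=2)])
```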
9. Delete Room
Delete a room and disconnect all users.
Endpoint: DELETE /rooms/{room_id}
Authentication: JWT Token or API Key (Admin)
Path Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| room_id | string | Yes | Room identifier |
Response:
{
"success": true,
"room_id": "meeting-room-123",
"message": "Room deleted successfully",
"disconnected_users": 3
}
Status Codes:
- 200 OK - Room deleted successfully
- 401 Unauthorized - Authentication required
- 403 Forbidden - Insufficient permissions
- 404 Not Found - Room does not exist
cURL Example:
curl -X DELETE http://localhost:8000/rooms/meeting-room-123 \
-H "Authorization: Bearer <token>"
10. Get Server Statistics
Get server statistics and metrics.
Endpoint: GET /stats
Authentication: JWT Token or API Key
Response:
{
"server": {
"uptime": 86400,
"version": "1.0.0",
"environment": "production"
},
"connections": {
"total": 150,
"active": 142,
"idle": 8
},
"rooms": {
"total": 25,
"active": 20,
"empty": 5
},
"workers": {
"translation": {
"total": 4,
"busy": 2,
"queue_size": 5
},
"tts": {
"total": 2,
"busy": 1,
"queue_size": 3
}
},
"processing": {
"total_translations": 5420,
"total_audio_processed_mb": 2850,
"avg_latency_ms": 245
},
"timestamp": "2025-12-17T10:30:00Z"
}
Status Codes:
- 200 OK - Statistics retrieved successfully
- 401 Unauthorized - Authentication required
cURL Example:
curl http://localhost:8000/stats \
-H "Authorization: Bearer <token>"
11. Text-Only Translation
Translate text without audio processing.
Endpoint: POST /translate
Authentication: JWT Token or API Key
Headers:
Content-Type: application/json
Authorization: Bearer <token>
Request Body:
{
"text": "Hello, how are you?",
"source_language": "en",
"target_language": "es"
}
Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| text | string | Yes | Text to translate |
| source_language | string | Yes | Source language code |
| target_language | string | Yes | Target language code |
Response:
{
"original_text": "Hello, how are you?",
"translated_text": "Hola, ¿cómo estás?",
"source_language": "en",
"target_language": "es",
"processing_time_ms": 45
}
Status Codes:
- 200 OK - Translation successful
- 400 Bad Request - Invalid request body
- 401 Unauthorized - Authentication required
- 422 Unprocessable Entity - Unsupported language pair
cURL Example:
curl -X POST http://localhost:8000/translate \
-H "Content-Type: application/json" \
-H "Authorization: Bearer <token>" \
-d '{
"text": "Hello, how are you?",
"source_language": "en",
"target_language": "es"
}'
12. Batch Translation
Translate multiple texts in one request.
Endpoint: POST /translate/batch
Authentication: JWT Token or API Key
Headers:
Content-Type: application/json
Authorization: Bearer <token>
Request Body:
{
"texts": [
"Hello, how are you?",
"What time is it?",
"Thank you very much"
],
"source_language": "en",
"target_language": "es"
}
Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| texts | array | Yes | Array of texts to translate (max 100) |
| source_language | string | Yes | Source language code |
| target_language | string | Yes | Target language code |
Response:
{
"translations": [
{
"original": "Hello, how are you?",
"translated": "Hola, ¿cómo estás?",
"index": 0
},
{
"original": "What time is it?",
"translated": "¿Qué hora es?",
"index": 1
},
{
"original": "Thank you very much",
"translated": "Muchas gracias",
"index": 2
}
],
"total": 3,
"source_language": "en",
"target_language": "es",
"processing_time_ms": 120
}
Status Codes:
- 200 OK - Translations successful
- 400 Bad Request - Invalid request body or too many texts
- 401 Unauthorized - Authentication required
- 422 Unprocessable Entity - Unsupported language pair
cURL Example:
curl -X POST http://localhost:8000/translate/batch \
-H "Content-Type: application/json" \
-H "Authorization: Bearer <token>" \
-d '{
"texts": ["Hello", "Goodbye", "Thank you"],
"source_language": "en",
"target_language": "es"
}'
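Since a single request accepts at most 100 texts, longer lists need to be split on the client. A minimal sketch of that splitting, with `batch_payloads` as a hypothetical helper name:

```python
def batch_payloads(texts, source_language, target_language, max_texts=100):
    """Split texts into request bodies that respect the documented 100-text cap."""
    for start in range(0, len(texts), max_texts):
        yield {
            "texts": texts[start:start + max_texts],
            "source_language": source_language,
            "target_language": target_language,
        }


texts = [f"sentence {i}" for i in range(250)]
payloads = list(batch_payloads(texts, "en", "es"))
print([len(p["texts"]) for p in payloads])  # [100, 100, 50]
```

If the `index` field in each response is relative to its own request (an assumption; the example response only shows a single batch), add the batch's starting offset when merging results back into the original order.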
13. Download TTS Audio
Generate and download TTS audio for text.
Endpoint: POST /tts/generate
Authentication: JWT Token or API Key
Headers:
Content-Type: application/json
Authorization: Bearer <token>
Request Body:
{
"text": "Hello, this is a test message",
"language": "en",
"format": "wav"
}
Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| text | string | Yes | Text to synthesize |
| language | string | Yes | Language code |
| format | string | No | Audio format: "wav", "mp3" (default: "wav") |
Response:
- Content-Type: audio/wav or audio/mpeg
- Body: Binary audio data
Status Codes:
- 200 OK - Audio generated successfully
- 400 Bad Request - Invalid request body
- 401 Unauthorized - Authentication required
- 422 Unprocessable Entity - Unsupported language
cURL Example:
curl -X POST http://localhost:8000/tts/generate \
-H "Content-Type: application/json" \
-H "Authorization: Bearer <token>" \
-d '{
"text": "Hello world",
"language": "en"
}' \
--output output.wav
14. System Configuration
Get current system configuration (Admin only).
Endpoint: GET /config
Authentication: JWT Token (Admin)
Response:
{
"audio": {
"sample_rate": 16000,
"channels": 1,
"chunk_size": 4096,
"format": "PCM16"
},
"limits": {
"max_connections": 100,
"max_connections_per_ip": 10,
"max_users_per_room": 10,
"max_message_size": 10485760
},
"rate_limits": {
"messages_per_second": 10,
"requests_per_minute": 100
},
"workers": {
"translation_workers": 4,
"tts_workers": 2
},
"features": {
"authentication_enabled": false,
"rate_limiting_enabled": true,
"metrics_enabled": true
}
}
Status Codes:
- 200 OK - Configuration retrieved
- 401 Unauthorized - Authentication required
- 403 Forbidden - Admin access required
cURL Example:
curl http://localhost:8000/config \
-H "Authorization: Bearer <admin-token>"
REST API Response Format
All REST API responses follow this format:
Success Response:
{
// Response data
}
Error Response:
{
"error": {
"code": "ERROR_CODE",
"message": "Human readable error message",
"details": {
// Additional error details
}
}
}
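A client can turn the error envelope into an exception so callers do not have to inspect every body by hand. This is a sketch under assumptions: `ApiError` and `raise_for_error` are hypothetical names, and the sample code value is borrowed from the WebSocket error list purely for illustration.

```python
import json


class ApiError(Exception):
    """Carries the fields of the documented REST error envelope."""

    def __init__(self, code, message, details=None):
        super().__init__(f"{code}: {message}")
        self.code = code
        self.message = message
        self.details = details or {}


def raise_for_error(body_text):
    """Return the decoded body, or raise ApiError if it is an error envelope."""
    body = json.loads(body_text)
    if isinstance(body, dict) and "error" in body:
        err = body["error"]
        raise ApiError(err.get("code"), err.get("message"), err.get("details"))
    return body


# Illustrative payload only; the real codes returned by the server may differ.
sample = json.dumps({"error": {"code": "ROOM_NOT_FOUND",
                               "message": "Room does not exist",
                               "details": {"room_id": "room123"}}})
try:
    raise_for_error(sample)
except ApiError as exc:
    print(exc.code, exc.details)
```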
Authentication
Optional JWT Authentication
If authentication is enabled (ENABLE_AUTH=true), include the JWT token in the WebSocket connection URL:
ws://localhost:8000/ws?token=YOUR_JWT_TOKEN
Obtaining a Token
Endpoint: POST /auth/token
Request Body:
{
"user_id": "user123",
"name": "John Doe"
}
Response:
{
"access_token": "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9...",
"token_type": "bearer",
"expires_in": 3600
}
API Key Authentication
Alternatively, use an API key in the query parameter:
ws://localhost:8000/ws?api_key=YOUR_API_KEY
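Building the connection URL in code avoids quoting mistakes when a credential contains reserved characters. A minimal sketch, where `ws_url` is a hypothetical helper; rejecting both credentials at once is an assumption of this sketch, not documented server behavior.

```python
from urllib.parse import urlencode


def ws_url(base_url, token=None, api_key=None):
    """Append the credential as a query parameter, matching the URLs above."""
    if token and api_key:
        # Assumption: the two mechanisms are alternatives, so send only one.
        raise ValueError("pass either token or api_key, not both")
    params = {}
    if token:
        params["token"] = token
    elif api_key:
        params["api_key"] = api_key
    return f"{base_url}?{urlencode(params)}" if params else base_url


print(ws_url("ws://localhost:8000/ws", token="YOUR_JWT_TOKEN"))
print(ws_url("ws://localhost:8000/ws", api_key="YOUR_API_KEY"))
```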
WebSocket Connection
Connecting
JavaScript Example:
const ws = new WebSocket('ws://localhost:8000/ws');

ws.onopen = () => {
  console.log('Connected to translation server');
};

ws.onmessage = (event) => {
  if (typeof event.data === 'string') {
    // Text message (JSON)
    const message = JSON.parse(event.data);
    handleMessage(message);
  } else {
    // Binary message (audio data)
    handleAudioData(event.data);
  }
};

ws.onerror = (error) => {
  console.error('WebSocket error:', error);
};

ws.onclose = () => {
  console.log('Disconnected from server');
};
Python Example:
import asyncio
import websockets
import json

async def connect():
    uri = "ws://localhost:8000/ws"
    async with websockets.connect(uri) as websocket:
        # Send message
        message = {
            "type": "join_room",
            "payload": {
                "room_id": "room123",
                "user_name": "Alice",
                "language": "en"
            }
        }
        await websocket.send(json.dumps(message))

        # Receive messages
        async for message in websocket:
            if isinstance(message, str):
                data = json.loads(message)
                print(f"Received: {data}")
            else:
                print(f"Received audio: {len(message)} bytes")

asyncio.run(connect())
Connection Limits
- Max connections per IP: 10 (configurable)
- Max concurrent connections: 100 (configurable)
- Connection timeout: 300 seconds (idle)
Message Protocol
Message Structure
All text messages are JSON with the following structure:
{
"type": "MESSAGE_TYPE",
"payload": {
// Type-specific data
},
"timestamp": "2025-12-17T10:30:00Z"
}
Message Flow
Client → Server: Text Messages (JSON)
Server → Client: Text Messages (JSON)
Client → Server: Binary Messages (Audio Data)
Server → Client: Binary Messages (Audio Data)
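The envelope and the text/binary split can be sketched in a few lines. `make_message` and `classify_frame` are hypothetical helper names; whether the server requires `timestamp` on client-sent messages is not stated, so including it here is an assumption.

```python
import json
from datetime import datetime, timezone


def make_message(msg_type, payload=None, now=None):
    """Serialize a JSON envelope with type, payload, and a UTC timestamp.

    `now` must be a timezone-aware datetime when supplied.
    """
    now = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    return json.dumps({
        "type": msg_type,
        "payload": payload or {},
        "timestamp": now.strftime("%Y-%m-%dT%H:%M:%SZ"),
    })


def classify_frame(frame):
    """Text frames are JSON envelopes; binary frames are raw audio."""
    if isinstance(frame, str):
        msg = json.loads(frame)
        return ("json", msg["type"], msg.get("payload", {}))
    return ("audio", None, frame)


text = make_message("join_room", {"room_id": "room123",
                                  "user_name": "Alice",
                                  "language": "en"})
print(classify_frame(text)[:2])
print(classify_frame(b"\x00\x01")[0])
```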
Message Types
1. JOIN_ROOM
Join a translation room.
Direction: Client → Server
Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| room_id | string | Yes | Room identifier |
| user_name | string | Yes | User display name |
| language | string | Yes | User's language code (e.g., "en", "es", "fr") |
Example:
{
"type": "join_room",
"payload": {
"room_id": "room123",
"user_name": "Alice",
"language": "en"
}
}
Response:
{
"type": "room_joined",
"payload": {
"room_id": "room123",
"user_id": "user_abc123",
"users": [
{
"user_id": "user_abc123",
"name": "Alice",
"language": "en"
},
{
"user_id": "user_def456",
"name": "Bob",
"language": "es"
}
]
},
"timestamp": "2025-12-17T10:30:00Z"
}
2. LEAVE_ROOM
Leave the current room.
Direction: Client → Server
Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| room_id | string | Yes | Room identifier to leave |
Example:
{
"type": "leave_room",
"payload": {
"room_id": "room123"
}
}
Response:
{
"type": "room_left",
"payload": {
"room_id": "room123",
"user_id": "user_abc123"
},
"timestamp": "2025-12-17T10:35:00Z"
}
3. AUDIO_START
Notify that audio streaming will begin.
Direction: Client → Server
Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| room_id | string | Yes | Room identifier |
| audio_config | object | No | Audio configuration |
| audio_config.sample_rate | integer | No | Sample rate in Hz (default: 16000) |
| audio_config.channels | integer | No | Number of channels (default: 1) |
| audio_config.format | string | No | Audio format (default: "PCM16") |
Example:
{
"type": "audio_start",
"payload": {
"room_id": "room123",
"audio_config": {
"sample_rate": 16000,
"channels": 1,
"format": "PCM16"
}
}
}
Response:
{
"type": "audio_started",
"payload": {
"room_id": "room123",
"user_id": "user_abc123",
"status": "ready"
},
"timestamp": "2025-12-17T10:31:00Z"
}
4. AUDIO_DATA (Binary)
Send audio data for translation.
Direction: Client → Server (Binary)
Format: Raw PCM16 audio bytes
Requirements:
- Format: PCM16 (16-bit signed integer)
- Sample Rate: 16000 Hz (configurable)
- Channels: 1 (mono)
- Chunk Size: 4096 bytes (recommended)
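The requirements above imply a fixed relationship between chunk size and audio duration. The sketch below splits a PCM16 byte string into 4096-byte chunks and computes how much audio each chunk carries; the helper names are hypothetical.

```python
SAMPLE_RATE = 16000      # Hz
BYTES_PER_SAMPLE = 2     # PCM16 = 16-bit samples
CHANNELS = 1             # mono
CHUNK_BYTES = 4096       # recommended chunk size


def chunk_pcm16(data, chunk_bytes=CHUNK_BYTES):
    """Split raw PCM16 bytes into fixed-size chunks (the last may be shorter)."""
    return [data[i:i + chunk_bytes] for i in range(0, len(data), chunk_bytes)]


def chunk_duration(chunk_bytes=CHUNK_BYTES):
    """Seconds of audio carried by one chunk."""
    return chunk_bytes / (SAMPLE_RATE * BYTES_PER_SAMPLE * CHANNELS)


one_second = bytes(SAMPLE_RATE * BYTES_PER_SAMPLE * CHANNELS)  # one second of silence
chunks = chunk_pcm16(one_second)
print(len(chunks), len(chunks[-1]), chunk_duration())  # 8 3328 0.128
```

At 0.128 seconds per chunk, a real-time stream sends about 7.8 binary messages per second, which stays under the default limit of 10 messages per second per connection.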
JavaScript Example:
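// Capture audio from the microphone and stream it as raw PCM16.
// MediaRecorder emits encoded audio (e.g. WebM/Opus), not raw samples,
// so the Web Audio API is used here to read Float32 samples directly.
navigator.mediaDevices.getUserMedia({ audio: true })
  .then(stream => {
    const audioContext = new AudioContext({ sampleRate: 16000 });
    const source = audioContext.createMediaStreamSource(stream);
    const processor = audioContext.createScriptProcessor(4096, 1, 1);
    processor.onaudioprocess = (e) => {
      // Convert Float32 samples in [-1, 1] to 16-bit signed integers
      const input = e.inputBuffer.getChannelData(0);
      const pcm16 = new Int16Array(input.length);
      for (let i = 0; i < input.length; i++) {
        const s = Math.max(-1, Math.min(1, input[i]));
        pcm16[i] = s < 0 ? s * 0x8000 : s * 0x7FFF;
      }
      ws.send(pcm16.buffer);
    };
    source.connect(processor);
    processor.connect(audioContext.destination);
  });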
Python Example:
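import asyncio
import pyaudio

# Audio configuration
CHUNK = 4096  # frames per read
FORMAT = pyaudio.paInt16
CHANNELS = 1
RATE = 16000

async def stream_microphone(websocket):
    """Send microphone audio over an open connection (e.g. from websockets.connect)."""
    audio = pyaudio.PyAudio()
    stream = audio.open(
        format=FORMAT,
        channels=CHANNELS,
        rate=RATE,
        input=True,
        frames_per_buffer=CHUNK
    )
    loop = asyncio.get_running_loop()
    try:
        # Send audio chunks; the blocking read runs in a worker thread
        while True:
            audio_data = await loop.run_in_executor(None, stream.read, CHUNK)
            await websocket.send(audio_data)
    finally:
        stream.stop_stream()
        stream.close()
        audio.terminate()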
5. AUDIO_STOP
Notify that audio streaming has stopped.
Direction: Client → Server
Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| room_id | string | Yes | Room identifier |
Example:
{
"type": "audio_stop",
"payload": {
"room_id": "room123"
}
}
Response:
{
"type": "audio_stopped",
"payload": {
"room_id": "room123",
"user_id": "user_abc123"
},
"timestamp": "2025-12-17T10:32:00Z"
}
6. TRANSLATION_RESULT
Receive translated text.
Direction: Server → Client
Parameters:
| Parameter | Type | Description |
|---|---|---|
| original_text | string | Original recognized text |
| translated_text | string | Translated text |
| source_language | string | Source language code |
| target_language | string | Target language code |
| source_user_id | string | User who spoke |
Example:
{
"type": "translation_result",
"payload": {
"original_text": "Hello, how are you?",
"translated_text": "Hola, ¿cómo estás?",
"source_language": "en",
"target_language": "es",
"source_user_id": "user_abc123"
},
"timestamp": "2025-12-17T10:31:15Z"
}
7. TRANSLATED_AUDIO (Binary)
Receive translated audio.
Direction: Server → Client (Binary)
Format: Raw PCM16 audio bytes ready for playback
JavaScript Example:
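// Receive binary frames as ArrayBuffer instead of Blob
ws.binaryType = 'arraybuffer';
const audioContext = new AudioContext();

ws.onmessage = (event) => {
  if (event.data instanceof ArrayBuffer) {
    // Binary audio data
    playAudio(event.data);
  }
};

function playAudio(arrayBuffer) {
  // Raw PCM16 has no container header, so decodeAudioData() cannot parse it.
  // Build the AudioBuffer by hand (assumes 16 kHz mono, the documented default).
  const pcm16 = new Int16Array(arrayBuffer);
  const buffer = audioContext.createBuffer(1, pcm16.length, 16000);
  const channel = buffer.getChannelData(0);
  for (let i = 0; i < pcm16.length; i++) {
    channel[i] = pcm16[i] / 0x8000; // scale to [-1, 1)
  }
  const source = audioContext.createBufferSource();
  source.buffer = buffer;
  source.connect(audioContext.destination);
  source.start();
}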
8. USER_JOINED
Notification when a user joins the room.
Direction: Server → Client
Parameters:
| Parameter | Type | Description |
|---|---|---|
| room_id | string | Room identifier |
| user_id | string | New user's ID |
| user_name | string | New user's name |
| language | string | New user's language |
Example:
{
"type": "user_joined",
"payload": {
"room_id": "room123",
"user_id": "user_def456",
"user_name": "Bob",
"language": "es"
},
"timestamp": "2025-12-17T10:30:30Z"
}
9. USER_LEFT
Notification when a user leaves the room.
Direction: Server → Client
Parameters:
| Parameter | Type | Description |
|---|---|---|
| room_id | string | Room identifier |
| user_id | string | User who left |
Example:
{
"type": "user_left",
"payload": {
"room_id": "room123",
"user_id": "user_def456"
},
"timestamp": "2025-12-17T10:35:00Z"
}
10. PING / PONG
Heartbeat messages to keep connection alive.
Direction: Bidirectional
PING (Server → Client):
{
"type": "ping",
"payload": {},
"timestamp": "2025-12-17T10:31:00Z"
}
PONG (Client → Server):
{
"type": "pong",
"payload": {},
"timestamp": "2025-12-17T10:31:00Z"
}
Configuration:
- Ping interval: 30 seconds (default)
- Ping timeout: 10 seconds (default)
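A client must answer each ping within the timeout to keep the connection open. The reply logic can be isolated as a small pure function; `heartbeat_reply` is a hypothetical name, and omitting `timestamp` from the pong is an assumption of this sketch.

```python
import json


def heartbeat_reply(raw_text):
    """Return the pong frame for a ping, or None for any other message."""
    msg = json.loads(raw_text)
    if msg.get("type") == "ping":
        return json.dumps({"type": "pong", "payload": {}})
    return None


ping = '{"type": "ping", "payload": {}, "timestamp": "2025-12-17T10:31:00Z"}'
print(heartbeat_reply(ping))
print(heartbeat_reply('{"type": "user_left", "payload": {}}'))
```

In a receive loop, send the returned string back over the socket whenever it is not `None`.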
11. ERROR
Error message from server.
Direction: Server → Client
Parameters:
| Parameter | Type | Description |
|---|---|---|
| error_code | string | Error code identifier |
| message | string | Human-readable error message |
| details | object | Additional error details (optional) |
Example:
{
"type": "error",
"payload": {
"error_code": "ROOM_FULL",
"message": "Room has reached maximum capacity",
"details": {
"room_id": "room123",
"max_users": 10,
"current_users": 10
}
},
"timestamp": "2025-12-17T10:30:00Z"
}
Common Error Codes:
- AUTH_FAILED: Authentication failed
- ROOM_NOT_FOUND: Room does not exist
- ROOM_FULL: Room at maximum capacity
- INVALID_MESSAGE: Malformed message
- RATE_LIMIT_EXCEEDED: Too many requests
- UNSUPPORTED_LANGUAGE: Language not supported
- AUDIO_PROCESSING_ERROR: Audio processing failed
Error Handling
Client-Side Error Handling
ws.onerror = (error) => {
  // In browsers an error event is followed by a close event,
  // so reconnection is handled in onclose only (avoids double reconnects).
  console.error('WebSocket error:', error);
};

ws.onclose = (event) => {
  if (event.code === 1008) {
    console.error('Connection closed: Rate limit exceeded');
  } else if (event.code === 1000) {
    console.log('Connection closed normally');
  } else {
    console.log('Connection closed unexpectedly:', event.code);
    // Attempt reconnection
    setTimeout(() => reconnect(), 5000);
  }
};
Close Codes
| Code | Description |
|---|---|
| 1000 | Normal closure |
| 1001 | Going away |
| 1008 | Policy violation (rate limit) |
| 1011 | Internal server error |
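The reconnect decision from the client-side handler can be expressed as a small function over these codes. This is a sketch that mirrors that handler (no retry after a normal closure or a rate-limit closure); treating every other code as retryable is an assumption, and `should_reconnect` is a hypothetical name.

```python
NORMAL_CLOSURE = 1000
GOING_AWAY = 1001
POLICY_VIOLATION = 1008  # used here for rate limiting
INTERNAL_ERROR = 1011


def should_reconnect(close_code):
    """Retry unless the close was normal or caused by a policy violation."""
    return close_code not in (NORMAL_CLOSURE, POLICY_VIOLATION)


for code in (NORMAL_CLOSURE, GOING_AWAY, POLICY_VIOLATION, INTERNAL_ERROR):
    print(code, should_reconnect(code))
```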
Rate Limits
Connection Limits
| Limit Type | Default Value | Configurable |
|---|---|---|
| Max connections per IP | 10 | Yes |
| Max total connections | 100 | Yes |
| Connection timeout | 300 seconds | Yes |
Message Limits
| Limit Type | Default Value | Configurable |
|---|---|---|
| Messages per second | 10 per connection | Yes |
| Requests per minute | 100 per user | Yes |
| Max message size (audio chunk) | 10 MB | Yes |
Rate Limit Headers
Rate limit information is included in the details field of error responses:
{
"type": "error",
"payload": {
"error_code": "RATE_LIMIT_EXCEEDED",
"message": "Too many requests",
"details": {
"limit": 100,
"remaining": 0,
"reset_at": "2025-12-17T10:31:00Z"
}
}
}
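A client can use `reset_at` to decide how long to pause before retrying. A minimal sketch, assuming the timestamp always uses the UTC `Z` format shown above; `seconds_until_reset` is a hypothetical helper name.

```python
from datetime import datetime, timezone


def seconds_until_reset(details, now=None):
    """Seconds to wait before retrying, derived from the error's reset_at field."""
    now = now or datetime.now(timezone.utc)
    reset_at = datetime.strptime(details["reset_at"], "%Y-%m-%dT%H:%M:%SZ")
    reset_at = reset_at.replace(tzinfo=timezone.utc)
    # Never return a negative wait if the reset time has already passed
    return max(0.0, (reset_at - now).total_seconds())


details = {"limit": 100, "remaining": 0, "reset_at": "2025-12-17T10:31:00Z"}
now = datetime(2025, 12, 17, 10, 30, 30, tzinfo=timezone.utc)
print(seconds_until_reset(details, now))  # 30.0
```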
Code Examples
Complete Client Example (JavaScript)
class VoiceTranslatorClient {
  constructor(url, options = {}) {
    this.url = url;
    this.ws = null;
    this.roomId = null;
    this.userId = null;
    this.shouldReconnect = true;
    this.stream = null;
    this.audioContext = null;
    this.processor = null;
    this.options = {
      language: options.language || 'en',
      userName: options.userName || 'Anonymous',
      ...options
    };
  }

  connect() {
    return new Promise((resolve, reject) => {
      this.ws = new WebSocket(this.url);
      // Deliver binary frames as ArrayBuffer so byteLength is available
      this.ws.binaryType = 'arraybuffer';

      this.ws.onopen = () => {
        console.log('Connected to translation server');
        resolve();
      };

      this.ws.onerror = (error) => {
        console.error('WebSocket error:', error);
        reject(error);
      };

      this.ws.onmessage = (event) => {
        this.handleMessage(event);
      };

      this.ws.onclose = () => {
        console.log('Disconnected from server');
        // Reconnect only if the close was not requested via disconnect()
        if (this.shouldReconnect) {
          this.reconnect();
        }
      };
    });
  }

  handleMessage(event) {
    if (typeof event.data === 'string') {
      const message = JSON.parse(event.data);
      switch (message.type) {
        case 'room_joined':
          this.userId = message.payload.user_id;
          this.onRoomJoined(message.payload);
          break;
        case 'translation_result':
          this.onTranslation(message.payload);
          break;
        case 'user_joined':
          this.onUserJoined(message.payload);
          break;
        case 'user_left':
          this.onUserLeft(message.payload);
          break;
        case 'error':
          this.onError(message.payload);
          break;
        case 'ping':
          this.sendPong();
          break;
      }
    } else {
      // Binary audio data
      this.onAudioReceived(event.data);
    }
  }

  async joinRoom(roomId) {
    this.roomId = roomId;
    const message = {
      type: 'join_room',
      payload: {
        room_id: roomId,
        user_name: this.options.userName,
        language: this.options.language
      }
    };
    this.send(message);
  }

  async startAudio() {
    const message = {
      type: 'audio_start',
      payload: {
        room_id: this.roomId,
        audio_config: {
          sample_rate: 16000,
          channels: 1,
          format: 'PCM16'
        }
      }
    };
    this.send(message);

    // Start capturing audio
    this.stream = await navigator.mediaDevices.getUserMedia({ audio: true });
    this.startAudioCapture(this.stream);
  }

  startAudioCapture(stream) {
    this.audioContext = new AudioContext({ sampleRate: 16000 });
    const source = this.audioContext.createMediaStreamSource(stream);
    this.processor = this.audioContext.createScriptProcessor(4096, 1, 1);

    this.processor.onaudioprocess = (e) => {
      // Skip sending while the socket is closed or reconnecting
      if (!this.ws || this.ws.readyState !== WebSocket.OPEN) return;
      const inputData = e.inputBuffer.getChannelData(0);
      const pcm16 = this.convertToPCM16(inputData);
      this.ws.send(pcm16);
    };

    source.connect(this.processor);
    this.processor.connect(this.audioContext.destination);
  }

  convertToPCM16(float32Array) {
    const int16Array = new Int16Array(float32Array.length);
    for (let i = 0; i < float32Array.length; i++) {
      const s = Math.max(-1, Math.min(1, float32Array[i]));
      int16Array[i] = s < 0 ? s * 0x8000 : s * 0x7FFF;
    }
    return int16Array.buffer;
  }

  stopAudio() {
    // Stop capturing first so no audio is sent after audio_stop
    if (this.processor) {
      this.processor.disconnect();
      this.processor = null;
    }
    if (this.stream) {
      this.stream.getTracks().forEach((track) => track.stop());
      this.stream = null;
    }
    if (this.audioContext) {
      this.audioContext.close();
      this.audioContext = null;
    }

    const message = {
      type: 'audio_stop',
      payload: {
        room_id: this.roomId
      }
    };
    this.send(message);
  }

  leaveRoom() {
    const message = {
      type: 'leave_room',
      payload: {
        room_id: this.roomId
      }
    };
    this.send(message);
  }

  send(message) {
    if (this.ws && this.ws.readyState === WebSocket.OPEN) {
      this.ws.send(JSON.stringify(message));
    }
  }

  sendPong() {
    this.send({ type: 'pong', payload: {} });
  }

  disconnect() {
    // Mark the close as intentional so onclose does not reconnect
    this.shouldReconnect = false;
    if (this.ws) {
      this.ws.close();
    }
  }

  reconnect() {
    setTimeout(() => {
      console.log('Attempting to reconnect...');
      // A failed attempt fires onclose, which schedules the next retry
      this.connect().catch(() => {});
    }, 5000);
  }

  // Event handlers (override these)
  onRoomJoined(data) {
    console.log('Joined room:', data);
  }

  onTranslation(data) {
    console.log('Translation:', data.translated_text);
  }

  onAudioReceived(audioData) {
    console.log('Received audio:', audioData.byteLength, 'bytes');
    // Play the audio
  }

  onUserJoined(data) {
    console.log('User joined:', data.user_name);
  }

  onUserLeft(data) {
    console.log('User left:', data.user_id);
  }

  onError(error) {
    console.error('Error:', error.message);
  }
}

// Usage (top-level await requires an ES module or an async function)
const client = new VoiceTranslatorClient('ws://localhost:8000/ws', {
  language: 'en',
  userName: 'Alice'
});
await client.connect();
await client.joinRoom('room123');
await client.startAudio();
Complete Client Example (Python)
import asyncio
import websockets
import json
import pyaudio


class VoiceTranslatorClient:
    def __init__(self, url, language='en', user_name='Anonymous'):
        self.url = url
        self.language = language
        self.user_name = user_name
        self.ws = None
        self.room_id = None
        self.user_id = None
        self.running = False
        self.tasks = []  # keep references so background tasks are not garbage-collected

    async def connect(self):
        self.ws = await websockets.connect(self.url)
        print('Connected to translation server')
        # Start message handler
        self.tasks.append(asyncio.create_task(self.message_handler()))

    async def message_handler(self):
        async for message in self.ws:
            if isinstance(message, str):
                data = json.loads(message)
                await self.handle_message(data)
            else:
                await self.handle_audio(message)

    async def handle_message(self, message):
        msg_type = message.get('type')
        payload = message.get('payload', {})
        if msg_type == 'room_joined':
            self.user_id = payload.get('user_id')
            print(f"Joined room: {payload.get('room_id')}")
        elif msg_type == 'translation_result':
            print(f"Translation: {payload.get('translated_text')}")
        elif msg_type == 'user_joined':
            print(f"User joined: {payload.get('user_name')}")
        elif msg_type == 'user_left':
            print(f"User left: {payload.get('user_id')}")
        elif msg_type == 'error':
            print(f"Error: {payload.get('message')}")
        elif msg_type == 'ping':
            await self.send_pong()

    async def handle_audio(self, audio_data):
        print(f"Received audio: {len(audio_data)} bytes")
        # Play audio here

    async def join_room(self, room_id):
        self.room_id = room_id
        message = {
            'type': 'join_room',
            'payload': {
                'room_id': room_id,
                'user_name': self.user_name,
                'language': self.language
            }
        }
        await self.send(message)

    async def start_audio(self):
        message = {
            'type': 'audio_start',
            'payload': {
                'room_id': self.room_id,
                'audio_config': {
                    'sample_rate': 16000,
                    'channels': 1,
                    'format': 'PCM16'
                }
            }
        }
        await self.send(message)
        # Start audio capture
        self.tasks.append(asyncio.create_task(self.capture_audio()))

    async def capture_audio(self):
        CHUNK = 4096  # frames per read
        FORMAT = pyaudio.paInt16
        CHANNELS = 1
        RATE = 16000
        audio = pyaudio.PyAudio()
        stream = audio.open(
            format=FORMAT,
            channels=CHANNELS,
            rate=RATE,
            input=True,
            frames_per_buffer=CHUNK
        )
        loop = asyncio.get_running_loop()
        self.running = True
        try:
            while self.running:
                # stream.read() blocks, so run it in a worker thread
                # to keep the event loop free for incoming messages
                audio_data = await loop.run_in_executor(None, stream.read, CHUNK)
                await self.ws.send(audio_data)
        finally:
            stream.stop_stream()
            stream.close()
            audio.terminate()

    async def stop_audio(self):
        self.running = False
        message = {
            'type': 'audio_stop',
            'payload': {
                'room_id': self.room_id
            }
        }
        await self.send(message)

    async def leave_room(self):
        message = {
            'type': 'leave_room',
            'payload': {
                'room_id': self.room_id
            }
        }
        await self.send(message)

    async def send(self, message):
        await self.ws.send(json.dumps(message))

    async def send_pong(self):
        await self.send({'type': 'pong', 'payload': {}})

    async def disconnect(self):
        await self.ws.close()


# Usage
async def main():
    client = VoiceTranslatorClient(
        'ws://localhost:8000/ws',
        language='en',
        user_name='Alice'
    )
    await client.connect()
    await client.join_room('room123')
    await client.start_audio()
    # Keep running for 60 seconds
    await asyncio.sleep(60)
    await client.stop_audio()
    await client.leave_room()
    await client.disconnect()

asyncio.run(main())
Supported Languages
| Language Code | Language Name |
|---|---|
| en | English |
| hi | Hindi |
| te | Telugu |
| ta | Tamil |
| kn | Kannada |
| ml | Malayalam |
| gu | Gujarati |
| mr | Marathi |
| bn | Bengali |
| es | Spanish |
| fr | French |
| de | German |
| it | Italian |
| pt | Portuguese |
| ru | Russian |
| zh | Chinese |
| ja | Japanese |
Primary Focus: Indian languages (Hindi, Telugu, Tamil, Kannada, Malayalam, Gujarati, Marathi, Bengali)
Note: Language support depends on installed models. Check available languages with the /languages endpoint.
Health Check
Endpoint: GET /health
Response:
{
"status": "healthy",
"version": "1.0.0",
"uptime": 3600,
"connections": 15,
"rooms": 3
}
Configuration
Environment variables to customize API behavior:
# Server
HOST=0.0.0.0
PORT=8000
# Audio
AUDIO_SAMPLE_RATE=16000
AUDIO_CHANNELS=1
AUDIO_CHUNK_SIZE=4096
# Security
ENABLE_AUTH=false
JWT_SECRET_KEY=your-secret-key
API_KEYS=key1,key2,key3
# Rate Limiting
MAX_CONNECTIONS_PER_IP=10
MAX_MESSAGES_PER_SECOND=10
MAX_REQUESTS_PER_MINUTE=100
# Workers
TRANSLATION_WORKERS=4
TTS_WORKERS=2
# Models
VOSK_MODEL_PATH_EN=models/vosk-en
ARGOS_MODEL_PATH=models/argos
COQUI_MODEL_PATH=models/coqui
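How the server itself parses these variables is not shown here. As an illustration only, the sketch below reads a few of them into a typed structure using the defaults listed above; the `Settings` class, the `load_settings` function, and the boolean and list parsing rules are all assumptions.

```python
import os
from dataclasses import dataclass


@dataclass
class Settings:
    host: str
    port: int
    enable_auth: bool
    api_keys: list
    max_connections_per_ip: int


def load_settings(env=None):
    """Read a subset of the variables above, falling back to the listed defaults."""
    env = os.environ if env is None else env
    # API_KEYS is a comma-separated list; empty entries are dropped
    keys = [k.strip() for k in env.get("API_KEYS", "").split(",") if k.strip()]
    return Settings(
        host=env.get("HOST", "0.0.0.0"),
        port=int(env.get("PORT", "8000")),
        enable_auth=env.get("ENABLE_AUTH", "false").lower() == "true",
        api_keys=keys,
        max_connections_per_ip=int(env.get("MAX_CONNECTIONS_PER_IP", "10")),
    )


print(load_settings({"ENABLE_AUTH": "true", "API_KEYS": "key1,key2,key3"}))
```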
Troubleshooting
Connection Issues
Problem: Cannot connect to WebSocket
Solutions:
- Verify the server is running
- Check firewall settings
- Ensure correct URL (ws:// for HTTP, wss:// for HTTPS)
- Verify authentication token if required
Audio Issues
Problem: No audio being received
Solutions:
- Check audio format (must be PCM16, 16kHz, mono)
- Verify microphone permissions
- Ensure audio chunks are the correct size
- Check that rate limits are not exceeded
Translation Issues
Problem: Translations not working
Solutions:
- Verify language models are installed
- Check language codes are supported
- Ensure room has users with different languages
- Check server logs for errors
Support
For issues and questions:
- GitHub Issues: [your-repo/issues]
- Email: support@your-domain.com
- Documentation: [your-docs-url]
License
This API documentation is part of the Voice-to-Voice Translator project.
Version: 1.0.0
Last Updated: December 17, 2025