---
title: Polyglot Translation Backend
emoji: 🌍
colorFrom: blue
colorTo: green
sdk: docker
pinned: false
license: mit
app_port: 7860
---
# Polyglot Translation Backend - Quantized Models
Real-time speech transcription and translation API with Socket.IO for WebSocket communication. This version uses INT8 quantized models for improved performance and reduced memory footprint.
## Features
- **Real-time Speech Recognition**: Support for English, Swahili, Kikuyu, Kamba, Kimeru, Luo, and Somali
- **Translation**: Multi-language translation using NLLB models
- **Text-to-Speech**: Generate speech in multiple languages
- **WebSocket Support**: Real-time communication via Socket.IO
- **Model Quantization**: INT8 dynamic quantization for faster inference
## API Endpoints
- `GET /health` - Health check endpoint
- `WebSocket /` - Socket.IO connection for real-time communication
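As a minimal sketch, the health endpoint can be called like this (the base URL below is a placeholder for your deployed Space, not the real address):

```python
# Minimal health-check client for the backend's GET /health endpoint.
# Uses only the standard library; the Space URL is a placeholder.
import json
from urllib.request import urlopen

def check_health(base_url: str) -> dict:
    """Call GET /health on the backend and return the parsed JSON body."""
    with urlopen(f"{base_url}/health") as resp:
        return json.loads(resp.read())

# Example usage (requires a running backend):
# status = check_health("https://your-space.hf.space")
```

For the Socket.IO endpoint, a client library such as `python-socketio` (or `socket.io-client` in the browser) connects to the same base URL; the specific event names the backend emits are defined in the server code, not shown here.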
## Environment
This Space requires a Hugging Face token for model access. When the token is configured as a Space secret, Hugging Face Spaces makes it available to the app automatically at runtime.
## Technical Details
- **Framework**: FastAPI with Socket.IO
- **Models**:
  - ASR: Whisper (English) and Wav2Vec2-BERT (African languages)
  - Translation: fine-tuned NLLB-600M model
  - TTS: VITS models, one per language
- **Optimization**: INT8 dynamic quantization via PyTorch