---
title: Ringg Parrot STT V1
emoji: 🦜
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 5.49.1
app_file: app.py
pinned: false
license: apache-2.0
short_description: High-Accuracy Hindi Speech-to-Text System
tags:
- speech-to-text
- asr
- bilingual
- english
- hindi
- audio
- transcription
- ringg
- real-time
---
# 🎙️ Ringg Parrot STT V1 :parrot:
**Bilingual Speech-to-Text for English & Hindi**
[![Hugging Face Spaces](https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Spaces-blue)](https://huggingface.co/spaces/RinggAI/Ringg-STT-V0)
[![License](https://img.shields.io/badge/License-Apache%202.0-green.svg)](https://opensource.org/licenses/Apache-2.0)
## 🌟 Overview
Ringg Parrot STT V1 is a speech-to-text system that provides real-time transcription for English and Hindi. In the benchmark below it ranks **2nd** on Indic Norm WER, ahead of OpenAI Whisper Large-v3 and Whisper Large-v2.
## 📊 Performance Benchmarks
| Model | Indic Norm WER ↓ | Whisper Norm WER ↓ |
|-------|------------------|---------------------|
| IndicWav2Vec (Winner) | 18.55% | 63.31% |
| **Ringg Parrot STT V1** | **21.03%** | **66.27%** |
| Vakyansh Wav2Vec2 | 24.06% | 66.34% |
| Whisper Large-v3 | 29.17% | 63.31% |
| Whisper Large-v2 | 37.50% | 66.27% |
**Lower WER (Word Error Rate) indicates better accuracy.** Ringg Parrot STT V1 achieves competitive performance while supporting bilingual transcription.
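For readers unfamiliar with the metric, WER is the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and the model output, divided by the number of reference words. The following is a minimal sketch of that calculation, not the evaluation script behind the table: it splits on whitespace only and does not reproduce the "Indic Norm" or "Whisper Norm" text normalization, and the function name `word_error_rate` is an illustrative choice.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution ("sat" vs "sit") and one deletion ("the"): 2 errors / 6 words.
    print(f"{word_error_rate('the cat sat on the mat', 'the cat sit on mat'):.2%}")
```

Because insertions count as errors, WER can exceed 100% when the hypothesis is much longer than the reference.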
## ✨ Features
- 🌐 **Bilingual Support**: Native support for English and Hindi speech recognition
- ⚡ **Real-time Streaming**: Transcription appears as you speak, with roughly 2-3 seconds of latency
- 🎯 **High Accuracy**: 2nd place among top bilingual ASR models
- 📁 **File Upload**: Support for various audio formats (WAV, MP3, FLAC, M4A, etc.)
- 🚀 **Fast Processing**: Optimized for low-latency inference
- 💬 **Code-switching**: Handles mixed English-Hindi speech
## 🎯 Model Details
| Specification | Details |
|--------------|---------|
| **Model Name** | Ringg Parrot STT V1 |
| **Languages** | English (EN) & Hindi (HI) |
| **Performance** | 2nd place among top models |
| **Sample Rate** | 16kHz |
## 🚀 Usage
### Real-time Streaming
1. Go to the **"Real-time Streaming"** tab
2. Allow microphone permissions when prompted
3. Start speaking in English or Hindi
4. See real-time transcription appear
### File Upload
1. Go to the **"File Upload"** tab
2. Upload your audio file (WAV, MP3, FLAC, M4A, etc.)
3. Click **"Transcribe"**
4. View the transcription result
## 💡 Tips for Best Results
- **Audio Quality**: Use clear audio with minimal background noise
- **Speaking Style**: Speak naturally at a moderate pace
- **Sample Rate**: 16kHz or higher recommended; audio is resampled to 16kHz automatically
- **Code-switching**: Model handles English-Hindi mixing, but accuracy is best when minimizing switches within sentences
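To check the sample-rate tip before uploading, a short script can inspect a file locally. The sketch below uses Python's standard `wave` module, so it only reads uncompressed PCM WAV files (not MP3, FLAC, or M4A); the helper name `describe_wav` and the returned keys are illustrative and not part of this Space.

```python
import wave


def describe_wav(path: str) -> dict:
    """Read basic properties of a PCM WAV file and flag the 16kHz recommendation."""
    with wave.open(path, "rb") as wav:
        sample_rate = wav.getframerate()
        channels = wav.getnchannels()
        duration = wav.getnframes() / sample_rate
    return {
        "sample_rate_hz": sample_rate,
        "channels": channels,
        "duration_s": duration,
        "meets_16khz": sample_rate >= 16000,
    }


if __name__ == "__main__":
    import sys

    # Usage: python describe_wav.py recording.wav
    print(describe_wav(sys.argv[1]))
```

A file that reports `meets_16khz: False` can still be uploaded, but it carries less high-frequency detail than the recommended rate.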
## 📊 Use Cases
- 🤖 Voice assistants and chatbots
- 📝 Meeting transcription
- 🎬 Content creation and subtitling
- ♿ Accessibility applications
- 🔍 Voice search and commands
- 📞 Call center automation
- 🎓 Educational tools
- 🌍 Multilingual communication
## 🔧 Technical Details
### Audio Processing
- **Input Format**: Mono audio, automatically resampled to 16kHz
- **Processing**: Chunked streaming with 3-second buffers
- **Latency**: ~2-3 seconds for real-time streaming
- **GPU Acceleration**: CUDA-enabled for faster inference
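The input pipeline described above (downmix to mono, resample to 16kHz, split into 3-second buffers) can be sketched in plain Python. This is an illustration under stated assumptions, not the code running in this Space: the function names are hypothetical, and the linear-interpolation resampler is a stand-in with no anti-aliasing filter, so a dedicated resampling library is the better choice for real audio.

```python
from typing import Iterator, List, Sequence

TARGET_RATE = 16_000  # Hz, the rate stated above
BUFFER_SECONDS = 3    # buffer length stated above


def to_mono(frames: Sequence[Sequence[float]]) -> List[float]:
    """Average the channels of each frame into a single sample."""
    return [sum(frame) / len(frame) for frame in frames]


def resample_linear(samples: Sequence[float], src_rate: int,
                    dst_rate: int = TARGET_RATE) -> List[float]:
    """Resample by linear interpolation (assumption: no low-pass filtering)."""
    if src_rate == dst_rate or not samples:
        return list(samples)
    n_out = int(len(samples) * dst_rate / src_rate)
    step = src_rate / dst_rate  # source samples advanced per output sample
    last = len(samples) - 1
    out = []
    for i in range(n_out):
        pos = i * step
        left = min(int(pos), last)
        right = min(left + 1, last)
        frac = pos - left
        out.append(samples[left] * (1 - frac) + samples[right] * frac)
    return out


def iter_buffers(samples: Sequence[float], rate: int = TARGET_RATE,
                 seconds: float = BUFFER_SECONDS) -> Iterator[List[float]]:
    """Yield consecutive fixed-length buffers; the final one may be shorter."""
    size = int(rate * seconds)
    for start in range(0, len(samples), size):
        yield list(samples[start:start + size])


if __name__ == "__main__":
    stereo_48k = [(0.0, 0.0)] * (48_000 * 7)  # 7 seconds of silent stereo audio
    mono_16k = resample_linear(to_mono(stereo_48k), src_rate=48_000)
    print([len(buf) for buf in iter_buffers(mono_16k)])
```

With 3-second buffers, a transcript for a given buffer cannot appear until that buffer has filled, which is consistent with the 2-3 second latency quoted above.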
### Supported Audio Formats
- WAV (PCM, 16-bit, 24-bit, 32-bit)
- MP3
- FLAC
- M4A
- OGG
- OPUS
## 📝 Limitations
- Works best with clear audio and minimal background noise
- Accuracy may vary with strong accents and dialects
- Code-switching within sentences may occasionally affect accuracy
- Very long audio files may take longer to process
## 📈 Performance
- **WER (Word Error Rate)**: 21.03% Indic Norm WER (see Performance Benchmarks); tuned for conversational speech
- **RTF (Real-Time Factor)**: < 0.3 on GPU (faster than real-time)
- **Languages**: English & Hindi with native support
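The real-time factor is processing time divided by audio duration, so an RTF of 0.3 means a 10-second clip is transcribed in about 3 seconds. The sketch below shows one way to measure it around any transcription call; `timed_rtf` and its `transcribe` argument are hypothetical placeholders, not an API of this Space.

```python
import time
from typing import Callable, Tuple


def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """RTF = processing time / audio duration; values below 1 are faster than real time."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


def timed_rtf(transcribe: Callable[[], str], audio_seconds: float) -> Tuple[str, float]:
    """Run a transcription callable and return its text together with the measured RTF."""
    start = time.perf_counter()
    text = transcribe()
    elapsed = time.perf_counter() - start
    return text, real_time_factor(elapsed, audio_seconds)


if __name__ == "__main__":
    # Placeholder callable standing in for a real model call.
    text, rtf = timed_rtf(lambda: "hello", audio_seconds=10.0)
    print(text, f"RTF={rtf:.4f}")
```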
## 🔗 Links
- **Organization**: [RinggAI on Hugging Face](https://huggingface.co/RinggAI)
- **TTS Space**: [Ringg TTS V0](https://huggingface.co/spaces/RinggAI/Ringg-TTS-v0.0)
## 👥 Team
Made with ❤️ by the **RinggAI Team**
---
**Note**: This model is designed for research and development purposes. For production use, please ensure compliance with your local regulations regarding speech processing and data privacy.
| Dependency | Version |
|------------|---------|
| gradio | 5.49.1 |
| gradio-client | 1.13.3 |
| pandas | 2.3.3 |
| requests | 2.32.5 |