# w2vindia
w2vindia is a self-supervised speech representation model based on the Wav2Vec 2.0 Base architecture, trained from scratch on a multilingual corpus of Indian languages.
This model serves as a foundation acoustic model and does not generate text directly. It is intended for fine-tuning on downstream speech tasks such as ASR, phoneme recognition, or language identification.
## Load the model

```python
from transformers import Wav2Vec2FeatureExtractor, Wav2Vec2Model

# A Wav2Vec2Processor also needs a tokenizer, which a pretraining-only
# checkpoint does not define; for feature extraction, the base feature
# extractor (audio normalization only) is sufficient.
feature_extractor = Wav2Vec2FeatureExtractor.from_pretrained("facebook/wav2vec2-base")
model = Wav2Vec2Model.from_pretrained("balaragavesh/w2vindia")
```
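Once loaded, the model maps raw 16 kHz audio to frame-level hidden representations (one 768-dimensional vector per ~20 ms frame). A minimal, self-contained sketch, using silence as a stand-in for real speech:

```python
import numpy as np
import torch
from transformers import Wav2Vec2FeatureExtractor, Wav2Vec2Model

feature_extractor = Wav2Vec2FeatureExtractor.from_pretrained("facebook/wav2vec2-base")
model = Wav2Vec2Model.from_pretrained("balaragavesh/w2vindia")
model.eval()

# One second of 16 kHz audio; replace with a real waveform.
speech = np.zeros(16000, dtype=np.float32)
inputs = feature_extractor(speech, sampling_rate=16000, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

# Shape: (batch, frames, hidden_size) — hidden_size is 768 for the Base architecture.
print(outputs.last_hidden_state.shape)
```

These frame-level features are what a downstream head (e.g. a CTC layer for ASR) would be trained on.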
## Model Description
- Architecture: Wav2Vec 2.0 Base
- Training Type: Self-supervised pretraining
- Languages: Hindi, Tamil, Bengali, Marathi, Telugu, Kannada, Malayalam, Gujarati (and others)
- Sampling Rate: 16 kHz
- Framework: PyTorch / Hugging Face Transformers
Unlike language-specific models, this model was trained on a pooled mixture of Indian languages without language identifiers, allowing it to learn phonetic and acoustic representations shared across languages.
## Dataset
The model was pre-trained on the IndicTTS dataset collection released by SPRING Lab, available on Hugging Face.
- Source: SPRINGLab / IndicTTS datasets
- Total Duration: ~200 hours
- Data Characteristics: mixed Indian languages (Hindi, Tamil, Marathi, Bengali, etc.)
- Preprocessing:
  - Audio resampled to 16 kHz.
  - Audio files filtered to between 2 s and 15 s in length to keep self-attention compute and memory manageable.
  - All languages combined into a unified training set to encourage cross-lingual acoustic transfer.
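The duration filter above can be sketched as a simple predicate over clip length. This is an illustrative reconstruction, not the training pipeline's actual code; the function name and constants are hypothetical:

```python
# Keep only clips between 2 s and 15 s at 16 kHz, as described in Preprocessing.
MIN_SECONDS = 2.0
MAX_SECONDS = 15.0
SAMPLE_RATE = 16000

def keep_clip(num_samples: int, sample_rate: int = SAMPLE_RATE) -> bool:
    """Return True if the clip's duration falls inside the allowed window."""
    duration = num_samples / sample_rate
    return MIN_SECONDS <= duration <= MAX_SECONDS

print(keep_clip(1 * SAMPLE_RATE))   # 1 s: too short
print(keep_clip(5 * SAMPLE_RATE))   # 5 s: kept
print(keep_clip(20 * SAMPLE_RATE))  # 20 s: too long
```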
## Intended Use
### Supported Tasks
- Automatic Speech Recognition (ASR)
- Phoneme Recognition
- Low-resource language modeling
- Cross-lingual transfer learning
### Not Intended For
- Direct speech-to-text inference without fine-tuning
- Speaker identification without adaptation
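Because this checkpoint is a pretraining-only encoder, using it for ASR means attaching and fine-tuning a task head. A minimal sketch with a CTC head follows; the vocabulary size of 64 is a placeholder for whatever your target-language tokenizer defines, and the CTC head itself is randomly initialized until fine-tuned:

```python
from transformers import Wav2Vec2ForCTC

# Load the pretrained encoder and attach a fresh CTC head.
# vocab_size is a placeholder; set it to your tokenizer's vocabulary size.
model = Wav2Vec2ForCTC.from_pretrained(
    "balaragavesh/w2vindia",
    vocab_size=64,
    ctc_loss_reduction="mean",
)

# Common practice: keep the convolutional feature encoder frozen
# during fine-tuning and train only the transformer and CTC head.
model.freeze_feature_encoder()
```

From here, training proceeds as for any CTC-based Wav2Vec 2.0 fine-tune (labeled audio, a character- or subword-level tokenizer, and a standard training loop or `Trainer`).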
## Links

- GitHub Repository: [balaragavesh/w2vindia](https://github.com/balaragavesh/w2vindia)