metadata
language:
  - en
tags:
  - audio
  - language-identification
  - speech
  - indian-languages
datasets:
  - hmsolanki/indian-languages-audio-dataset
metrics:
  - accuracy
  - f1

Indian Language Identification Model

This model identifies the language spoken in an audio clip from a set of 10 Indian languages.

Model Details

  • Model Type: Audio Language Classifier
  • Languages Supported: Bengali, Gujarati, Hindi, Kannada, Malayalam, Marathi, Punjabi, Tamil, Telugu, Urdu
  • Framework: PyTorch
  • Training Dataset: Indian Languages Audio Dataset (hmsolanki/indian-languages-audio-dataset)
  • Audio Sampling Rate: 16 kHz

Performance

  • Accuracy: 0.8465
  • Precision: 0.8457
  • Recall: 0.8465
  • F1 Score: 0.8452
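The card does not state how precision, recall, and F1 were averaged across the 10 classes. Recall equal to accuracy (both 0.8465) is what support-weighted averaging produces, so the sketch below assumes weighted averaging; this is an assumption, not something the card confirms. The function name `weighted_metrics` and the toy labels are invented for illustration.

```python
from collections import Counter


def weighted_metrics(y_true, y_pred):
    """Accuracy plus support-weighted precision, recall, and F1."""
    labels = sorted(set(y_true) | set(y_pred))
    support = Counter(y_true)  # number of true samples per class
    total = len(y_true)
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / total

    precision = recall = f1 = 0.0
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        predicted = sum(p == label for p in y_pred)
        actual = support[label]
        p_i = tp / predicted if predicted else 0.0
        r_i = tp / actual if actual else 0.0
        f_i = 2 * p_i * r_i / (p_i + r_i) if (p_i + r_i) else 0.0
        weight = actual / total  # weight each class by its share of samples
        precision += weight * p_i
        recall += weight * r_i
        f1 += weight * f_i
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Toy labels, invented for illustration only (not from the evaluation set).
y_true = ["Hindi", "Hindi", "Tamil", "Urdu", "Urdu", "Urdu"]
y_pred = ["Hindi", "Urdu", "Tamil", "Urdu", "Urdu", "Hindi"]
scores = weighted_metrics(y_true, y_pred)
print(scores)
```

With weighted averaging, recall always equals accuracy, while precision and F1 can differ slightly, which matches the pattern in the numbers above.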

Usage

import torchaudio
from transformers import pipeline

# Load the model as an audio-classification pipeline
pipe = pipeline("audio-classification", model="prithvirajjadhav2266/indian-language-identifier")

# Load the audio and resample it to 16 kHz if needed
waveform, sample_rate = torchaudio.load("path/to/audio.wav")
if sample_rate != 16000:
    resampler = torchaudio.transforms.Resample(sample_rate, 16000)
    waveform = resampler(waveform)

# The pipeline expects a 1-D mono NumPy array, so average the channels
audio = waveform.mean(dim=0).numpy()

# Get prediction (results are sorted by score, highest first)
prediction = pipe({"raw": audio, "sampling_rate": 16000})
print(f"Detected language: {prediction[0]['label']}")

Limitations

  • Works best with clear audio that has no background noise
  • Audio should be sampled at 16 kHz for optimal performance; resample other rates before inference
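As a small illustration of the sampling-rate requirement, the sketch below checks a file's rate before inference using only the standard-library `wave` module. The helper name `needs_resampling` is hypothetical, and `wave` reads uncompressed PCM WAV files only, so other formats would need a different loader.

```python
import wave

TARGET_RATE = 16000  # sampling rate stated in this card


def needs_resampling(path, target_rate=TARGET_RATE):
    """Return True if the WAV file at `path` is not sampled at `target_rate`."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getframerate() != target_rate
```

A file for which this returns True should be resampled to 16 kHz before it is passed to the model.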

Training Details

This model was trained on the Indian Languages Audio Dataset. Its architecture combines CNN layers for feature extraction with transformer layers for classification.
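The card gives no layer counts, dimensions, or input features, so the following is only a rough sketch of what a CNN plus transformer classifier of this kind could look like in PyTorch. Everything in it is an assumption made for illustration: the class name `CnnTransformerClassifier`, the log-mel spectrogram input, all layer sizes, and the mean pooling over time. It is not the released model.

```python
import torch
import torch.nn as nn


class CnnTransformerClassifier(nn.Module):
    """Hypothetical CNN + transformer audio classifier (not the released model)."""

    def __init__(self, n_mels=64, d_model=128, n_heads=4, n_layers=2, n_classes=10):
        super().__init__()
        # CNN front end: extracts local time-frequency features from a
        # spectrogram shaped (batch, 1, n_mels, frames).
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=3, padding=1),
            nn.BatchNorm2d(32),
            nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1),
            nn.BatchNorm2d(64),
            nn.ReLU(),
            nn.MaxPool2d(2),
        )
        # Each remaining time step becomes one token of 64 * (n_mels // 4) features.
        self.proj = nn.Linear(64 * (n_mels // 4), d_model)
        layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.head = nn.Linear(d_model, n_classes)

    def forward(self, spec):
        x = self.cnn(spec)                    # (batch, 64, n_mels // 4, frames // 4)
        x = x.permute(0, 3, 1, 2).flatten(2)  # (batch, frames // 4, 64 * (n_mels // 4))
        x = self.encoder(self.proj(x))        # (batch, frames // 4, d_model)
        return self.head(x.mean(dim=1))       # (batch, n_classes) logits


model = CnnTransformerClassifier().eval()
with torch.no_grad():
    logits = model(torch.randn(2, 1, 64, 100))  # two random dummy spectrograms
```

The sketch omits positional encodings and any training loop to stay short; a real implementation would likely need both.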

Confusion Matrix

[Image: Confusion Matrix]