---
license: mit
language:
- as
base_model:
- openai/whisper-medium
---
# Assamese Dialect Classification Model
This repository contains a trained model that classifies Assamese dialects from speech input. It was developed to help identify and study regional variation in the Assamese language.

---
## **Model Purpose**
The model classifies Assamese speech by dialect. It is intended for linguistic research, speech analysis, and dialect-aware applications in natural language processing (NLP) and automatic speech recognition (ASR).

---
## **Dialects Recognized**
The model is trained to recognize the following four dialects of the Assamese language:

1. **Darangia**
2. **Kamrupia**
3. **Nalbaria**
4. **Upper Assam**
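
To show how a four-way classifier output maps onto the dialects listed above, here is a minimal sketch in plain Python. It assumes the model emits one raw score (logit) per dialect and that the class indices follow the order of the list. Neither is confirmed by this repository, so check the checkpoint's own label mapping before relying on it. The names `DIALECTS`, `softmax`, and `predict_dialect` are hypothetical helpers, and the scores in the example are made up.

```python
import math

# ASSUMPTION: class indices follow the order of the list above.
# The checkpoint's real label mapping may differ.
DIALECTS = ["Darangia", "Kamrupia", "Nalbaria", "Upper Assam"]


def softmax(logits):
    """Convert raw scores to probabilities in a numerically stable way."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_dialect(logits):
    """Return (label, probability) for the highest-scoring dialect."""
    if len(logits) != len(DIALECTS):
        raise ValueError(f"expected {len(DIALECTS)} scores, got {len(logits)}")
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return DIALECTS[best], probs[best]


# Made-up scores for illustration only, not real model output.
label, prob = predict_dialect([0.2, 2.5, 0.1, -1.0])
print(f"{label} ({prob:.2f})")
```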
---
## **Training Dataset**
The model was trained on a dataset of 300 speech samples, curated to include diverse speakers, phrases, and dialect features. The dataset includes:

- **Diverse Data**: Various accents, speaker genders, and age groups.
- **Metadata**: Information about speaker age, gender, district, and speech duration.
- **Common Phrases**: Speech samples based on frequently used phrases in Assamese.
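
With only 300 samples across four dialects, any held-out split should keep every dialect represented. The sketch below shows one way to hold the metadata described above and to split per dialect. It is an illustration, not the procedure used for this model: the repository does not describe its split, the `SpeechSample` field names and the `stratified_split` helper are hypothetical, and the balanced 75-per-dialect placeholder data is an assumption made only so the example runs.

```python
import random
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class SpeechSample:
    """One recording plus its metadata (hypothetical field names)."""
    path: str
    dialect: str
    speaker_age: int
    speaker_gender: str
    district: str
    duration_s: float


def stratified_split(samples, test_fraction=0.2, seed=0):
    """Split samples so each dialect contributes the same test fraction."""
    by_dialect = defaultdict(list)
    for sample in samples:
        by_dialect[sample.dialect].append(sample)

    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    train, test = [], []
    for dialect in sorted(by_dialect):
        group = list(by_dialect[dialect])
        rng.shuffle(group)
        n_test = round(len(group) * test_fraction)
        test.extend(group[:n_test])
        train.extend(group[n_test:])
    return train, test


# Placeholder data: 300 samples, ASSUMED to be balanced at 75 per dialect.
dialects = ["Darangia", "Kamrupia", "Nalbaria", "Upper Assam"]
samples = [
    SpeechSample(f"clip_{d}_{i}.wav", d, 30, "unknown", "unknown", 3.0)
    for d in dialects
    for i in range(75)
]
train, test = stratified_split(samples)
print(len(train), len(test))
```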
---