---
library_name: transformers
license: mit
datasets:
- ai4bharat/IndicVoices
language:
- hi
- gu
- mr
base_model:
- openai/whisper-large-v3
pipeline_tag: automatic-speech-recognition
---

# Open-Sarika

Open-Sarika is a speech recognition and translation model for three Indian languages: Hindi, Gujarati, and Marathi. It transcribes speech in these languages and translates between them. It is an open-source implementation inspired by Sarvam AI's Sarika model.

## Model Details

### Model Description

- **Model type:** Speech Recognition and Translation (based on Whisper architecture)
- **Language(s):** Hindi (hi), Gujarati (gu), Marathi (mr)
- **License:** MIT
- **Base Model:** openai/whisper-large-v3

## Uses

### Direct Use

The model can be used for:
1. Transcribing speech in Hindi, Gujarati, and Marathi
2. Translating speech between these languages

The following example loads the model and transcribes a single audio file:

```python
from transformers import WhisperProcessor, WhisperForConditionalGeneration
import torch
import librosa

model_id = "theharshithh/open-sarika-v1"
device = "cuda" if torch.cuda.is_available() else "cpu"

# Load model and processor
processor = WhisperProcessor.from_pretrained(model_id)
model = WhisperForConditionalGeneration.from_pretrained(model_id).to(device)
# Clear any forced decoder prompt tokens stored in the model config
model.config.forced_decoder_ids = None

# Load the audio as mono and resample it to 16 kHz, the rate Whisper expects
audio_path = "your_audio.wav"
audio, _ = librosa.load(audio_path, sr=16000)

# Extract input features and generate the transcription
inputs = processor(audio, sampling_rate=16000, return_tensors="pt").to(device)
with torch.no_grad():
    output_ids = model.generate(**inputs)
transcription = processor.batch_decode(output_ids, skip_special_tokens=True)[0]
print(transcription)
```
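
The example above leaves the language choice to the model. To pin the language or switch tasks, Whisper's `generate` method in `transformers` accepts `language` and `task` keyword arguments. The sketch below is a minimal, hypothetical helper (`build_generate_kwargs` is not part of any library) that validates those two values before they are passed to `generate`. It assumes this checkpoint keeps the base model's language and task tokens, which this card does not confirm.

```python
# Assumption: the fine-tuned checkpoint still honours Whisper's `language` and
# `task` generation arguments. The helper and constants below are illustrative.

SUPPORTED_LANGUAGES = {"hi": "Hindi", "gu": "Gujarati", "mr": "Marathi"}
WHISPER_TASKS = ("transcribe", "translate")


def build_generate_kwargs(language: str, task: str = "transcribe") -> dict:
    """Validate a language code and task, then return kwargs for model.generate()."""
    code = language.strip().lower()
    if code not in SUPPORTED_LANGUAGES:
        raise ValueError(
            f"unsupported language {language!r}; expected one of {sorted(SUPPORTED_LANGUAGES)}"
        )
    if task not in WHISPER_TASKS:
        raise ValueError(f"unsupported task {task!r}; expected one of {WHISPER_TASKS}")
    return {"language": code, "task": task}


if __name__ == "__main__":
    # Show the keyword arguments that would be passed for Gujarati transcription
    print(build_generate_kwargs("gu"))
```

With `model` and `inputs` from the example above, the call would look like `model.generate(**inputs, **build_generate_kwargs("gu"))`. In the base Whisper models, `task="translate"` produces English text; how this checkpoint handles that task should be checked against its own outputs.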

### Training Data

The model was trained on a variety of datasets, including:
- Project Vaani dataset: a large-scale Indian-language speech data collection project run by the Indian Institute of Science (IISc) in collaboration with ARTPARK and funded by Google
- High-quality speech recordings in Hindi, Gujarati, and Marathi from AI4Bharat (the `ai4bharat/IndicVoices` dataset listed in this card's metadata)
- Real-world speech data from various sources

### Hardware Requirements

- Minimum RAM: 8 GB
- GPU: Recommended for faster inference
- Storage: Model size is approximately 1.5 GB

## Model Card Contact

For issues and feedback, please create an issue on the model's repository: https://huggingface.co/theharshithh/open-sarika-v1

## GitHub

GitHub repo: https://github.com/theharshithh/open-sarika