How to use from the
Use from the
llama-cpp-python library
# !pip install llama-cpp-python

from llama_cpp import Llama

llm = Llama.from_pretrained(
	repo_id="Saugat20021/sahayak-voice-models",
	filename="llm/himalaya-gemma-toolcall.gguf",
)
llm.create_chat_completion(
	messages = "No input example has been defined for this model task."
)

Sahayak Voice Models

The full model bundle behind the Sahayak voice banking assistant: TTS, STT, and LLM, one folder each. tts/ and stt/ are original fine-tunes; llm/'s base weights are an unmodified third-party public model (only its Modelfile is original). All are included here so this repo is a complete, self-contained, reproducible bundle of everything the assistant runs. Attribution for each is below.

tts/ — original fine-tune

File What it is
t3_nepali_checkpoint.pt Fine-tuned T3 (text→speech-token) model, trained on Nepali
tokenizer_np.json Nepali BPE+grapheme tokenizer for the fine-tuned T3 model

This is a Nepali fine-tune of the T3 stage of ResembleAI/chatterbox. You still need the base model's other components to use it — ve.safetensors (voice encoder), s3gen.safetensors (mel generator), conds.pt (default voice conditioning) — download those from the original ResembleAI repo; they are unmodified upstream weights and aren't re-hosted here.

stt/ — fine-tuned on the Hindi version

Based on ai4bharat/indic-conformer-600m-multilingual (ONNX conformer encoder + per-language CTC/RNNT decoder heads, 22 Indic languages including Nepali) — further fine-tuned here on the Hindi version of the model to adapt it for this assistant. All credit for the base architecture/weights belongs to the original AI4Bharat repo; the fine-tuning on top of it was done for this project.

llm/ — base weights are a mirror; the Modelfile is original

File What it is
himalaya-gemma-toolcall.gguf Same weights as himalaya-ai/himalaya-gemma-4-e2b-it-ggufnot modified/fine-tuned further here
Modelfile An Ollama Modelfile adding a tool-calling prompt template (JSON function-call format) on top of the base model — this is the only original artifact in this folder

The underlying weights are unchanged from Himalaya AI Labs' public Gemma-based Nepali model. All credit and license terms belong to that original repo.

Summary of what's actually original here

tts/t3_nepali_checkpoint.pt + tts/tokenizer_np.json (Nepali TTS fine-tune), the stt/ model (fine-tuned from the Hindi version of IndicConformer), and llm/Modelfile were produced for this project. The llm/himalaya-gemma-toolcall.gguf weights are an unmodified copy of a third-party public model, included for a complete, reproducible bundle — not a claim of authorship over those weights.

Downloads last month
-
GGUF
Model size
5B params
Architecture
gemma4
Hardware compatibility
Log In to add your hardware

We're not able to determine the quantization variants.

Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for Saugat20021/sahayak-voice-models

Quantized
(24)
this model