# KannadaGPT-0.6B
A Kannada language model fine-tuned on Qwen3-0.6B using LoRA (Low-Rank Adaptation).
## Model Details
| Property | Value |
|---|---|
| Base Model | Qwen/Qwen3-0.6B |
| Language | Kannada (ಕನ್ನಡ) |
| Fine-tuning Method | LoRA (Low-Rank Adaptation) |
| Training Data | Cognitive-Lab/Kannada-Instruct-dataset |
| Training Samples | 389,608 |
| Base Parameters | 0.6B |
| Trainable Parameters | 2.29M (0.38%) |
## Training Configuration
| Parameter | Value |
|---|---|
| LoRA Rank | 16 |
| LoRA Alpha | 32 |
| Dropout | 0.05 |
| Target Modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
| Learning Rate | 2e-4 |
| Batch Size | 2 (with gradient accumulation 8) |
| Epochs | 2 |
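
The original training script is not included in this repository, but the table above maps directly onto peft's `LoraConfig`. A minimal sketch, assuming the standard `peft`/`transformers` APIs (an illustration, not the exact script used):

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

# Hypothetical reconstruction of the LoRA setup from the table above
base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen3-0.6B")
lora_config = LoraConfig(
    r=16,              # LoRA rank
    lora_alpha=32,     # scaling factor (alpha / r = 2.0)
    lora_dropout=0.05,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",  # attention projections
        "gate_proj", "up_proj", "down_proj",     # MLP projections
    ],
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # reports ~2.29M trainable params (0.38%)
```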
## Quick Start

### Installation

```bash
pip install transformers peft torch accelerate
```

### Usage
```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the base model and tokenizer
base_model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3-0.6B",
    torch_dtype="auto",
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("Mithun501/KannadaGPT-0.6B")

# Attach the LoRA adapter to the base model
model = PeftModel.from_pretrained(base_model, "Mithun501/KannadaGPT-0.6B")

# Build the prompt with the chat template
messages = [
    {"role": "user", "content": "ಭಾರತದ ರಾಜಧಾನಿ ಯಾವುದು?"}  # "What is the capital of India?"
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=False,  # disable Qwen3's thinking mode for direct answers
)

# Generate text (do_sample=True so temperature/top_p take effect)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True,
                         temperature=0.7, top_p=0.8)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```
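
Note that decoding `outputs[0]` returns the prompt together with the reply. To print only the newly generated text, slice off the input tokens first:

```python
# Keep only the tokens generated after the prompt
new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))
```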
## Example Prompts
| Kannada Prompt | English Translation |
|---|---|
| ಭಾರತದ ರಾಜಧಾನಿ ಯಾವುದು? | What is the capital of India? |
| ಆರೋಗ್ಯವಾಗಿರಲು ಮೂರು ಸಲಹೆಗಳನ್ನು ನೀಡಿ | Give three tips for staying healthy |
| ಕನ್ನಡದಲ್ಲಿ ಕವಿತೆ ಬರೆಯಿರಿ | Write a poem in Kannada |
| ಬೆಂಗಳೂರಿನ ಬಗ್ಗೆ ಹೇಳಿ | Tell me about Bangalore |
| ಮಳೆ ಏಕೆ ಬರುತ್ತದೆ? | Why does it rain? |
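
To try these prompts quickly, the generation steps from the Usage section can be wrapped in a small helper. The `ask` name is purely illustrative; it reuses the `model` and `tokenizer` objects loaded above:

```python
def ask(prompt: str) -> str:
    # Format a single-turn prompt with the chat template and generate a reply
    text = tokenizer.apply_chat_template(
        [{"role": "user", "content": prompt}],
        tokenize=False,
        add_generation_prompt=True,
        enable_thinking=False,
    )
    inputs = tokenizer(text, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True,
                             temperature=0.7, top_p=0.8)
    # Return only the newly generated tokens, not the echoed prompt
    return tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:],
                            skip_special_tokens=True)

print(ask("ಮಳೆ ಏಕೆ ಬರುತ್ತದೆ?"))  # "Why does it rain?"
```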
## Training Progress

The model was trained on Kaggle with a P100 GPU. Training metrics from checkpoint-4500:
| Step | Loss | Learning Rate |
|---|---|---|
| 50 | 1.459 | 6.7e-06 |
| 500 | 0.675 | 6.8e-05 |
| 1000 | 0.613 | 1.4e-04 |
| 1500 | 0.572 | 2.0e-04 |
| 2000 | 0.534 | 2.0e-04 |
| 2500 | 0.518 | 2.0e-04 |
| 3000 | 0.502 | 1.9e-04 |
| 3500 | 0.492 | 1.9e-04 |
| 4000 | 0.488 | 1.9e-04 |
| 4500 | 0.470 | 1.9e-04 |
**Training progress:** 4,500 / 48,702 steps (≈9.2% complete, epoch 0.185 / 2.0). The total step count follows from the 389,608 training samples and the effective batch size of 16 (batch size 2 × gradient accumulation 8): ⌈389,608 / 16⌉ = 24,351 steps per epoch, times 2 epochs = 48,702 steps.
## Project Structure

```text
KannadaGPT-0.6B/
├── adapter_config.json          # LoRA configuration
├── adapter_model.safetensors    # LoRA weights (38MB)
├── tokenizer.json               # Tokenizer
├── tokenizer_config.json        # Tokenizer config
├── vocab.json                   # Vocabulary
├── merges.txt                   # BPE merges
├── special_tokens_map.json      # Special tokens
├── added_tokens.json            # Added tokens
├── chat_template.jinja          # Chat template
├── KannadaGPT_Inference.ipynb   # Colab inference notebook
├── KannadaGPT_Merge.ipynb       # Colab merge notebook
└── README.md                    # This file
```
## Limitations
- This is a LoRA adapter and requires the base model (Qwen3-0.6B) to run
- Training is partial (checkpoint-4500 of 48,702 total steps, ~9.2% complete)
- Best suited for Kannada instruction-following tasks
- May generate incorrect or nonsensical responses for complex queries
## Future Work
- Complete full 2-epoch training
- Merge LoRA weights into the base model for easier loading (see the sketch after this list)
- Evaluate on Kannada benchmarks
- Fine-tune larger models (1.8B, 7B)
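
The merge step above can be done with peft's `merge_and_unload`. A minimal sketch, assuming `model` and `tokenizer` are loaded as in the Usage section (the output directory name is illustrative):

```python
# Fold the LoRA deltas into the base weights so the model loads without peft
merged = model.merge_and_unload()
merged.save_pretrained("KannadaGPT-0.6B-merged")     # hypothetical output path
tokenizer.save_pretrained("KannadaGPT-0.6B-merged")
```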
## License
Apache 2.0
## Citation

```bibtex
@misc{kannadagpt-0.6b,
  author    = {Mithun501},
  title     = {KannadaGPT-0.6B: A Kannada Language Model},
  year      = {2025},
  publisher = {GitHub},
  url       = {https://github.com/mithun50/KannadaGPT-0.6B}
}
```
## Acknowledgments
- Qwen Team for the base model
- Cognitive-Lab for the Kannada instruction dataset
- Hugging Face for the PEFT library
## Author
Mithun501 - GitHub | HuggingFace