KannadaGPT-0.6B

A Kannada language model fine-tuned on Qwen3-0.6B using LoRA (Low-Rank Adaptation).


Model Details

| Property | Value |
|---|---|
| Base Model | Qwen/Qwen3-0.6B |
| Language | Kannada (ಕನ್ನಡ) |
| Fine-tuning Method | LoRA (Low-Rank Adaptation) |
| Training Data | Cognitive-Lab/Kannada-Instruct-dataset |
| Training Samples | 389,608 |
| Base Parameters | 0.6B |
| Trainable Parameters | 2.29M (0.38%) |
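The trainable-parameter fraction can be sanity-checked with quick arithmetic, using the rounded figures from the table above:

```python
# Cross-check the trainable-parameter fraction from the Model Details table.
base_params = 600_000_000     # 0.6B base parameters (rounded)
trainable_params = 2_290_000  # 2.29M LoRA parameters

fraction = trainable_params / base_params
print(f"{fraction:.2%}")  # → 0.38%
```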

Training Configuration

| Parameter | Value |
|---|---|
| LoRA Rank | 16 |
| LoRA Alpha | 32 |
| Dropout | 0.05 |
| Target Modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
| Learning Rate | 2e-4 |
| Batch Size | 2 (with gradient accumulation 8) |
| Epochs | 2 |

Quick Start

Try it in Google Colab using the included `KannadaGPT_Inference.ipynb` notebook.

Installation

```bash
pip install transformers peft torch accelerate
```

Usage

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load base model and tokenizer
base_model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3-0.6B",
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("Mithun501/KannadaGPT-0.6B")

# Load LoRA adapter
model = PeftModel.from_pretrained(base_model, "Mithun501/KannadaGPT-0.6B")

# Build a chat prompt (enable_thinking=False disables Qwen3's thinking mode)
messages = [
    {"role": "user", "content": "ಭಾರತದ ರಾಜಧಾನಿ ಯಾವುದು?"}  # "What is the capital of India?"
]

text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=False
)

# Generate (do_sample=True is required for temperature/top_p to take effect)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.7,
    top_p=0.8
)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```

Example Prompts

| Kannada Prompt | English Translation |
|---|---|
| ಭಾರತದ ರಾಜಧಾನಿ ಯಾವುದು? | What is the capital of India? |
| ಆರೋಗ್ಯವಾಗಿರಲು ಮೂರು ಸಲಹೆಗಳನ್ನು ನೀಡಿ | Give three tips for staying healthy |
| ಕನ್ನಡದಲ್ಲಿ ಕವಿತೆ ಬರೆಯಿರಿ | Write a poem in Kannada |
| ಬೆಂಗಳೂರಿನ ಬಗ್ಗೆ ಹೇಳಿ | Tell me about Bangalore |
| ಮಳೆ ಏಕೆ ಬರುತ್ತದೆ? | Why does it rain? |

Training Progress

The model was trained on Kaggle with a P100 GPU. Training metrics from checkpoint-4500:

| Step | Loss | Learning Rate |
|---|---|---|
| 50 | 1.459 | 6.7e-06 |
| 500 | 0.675 | 6.8e-05 |
| 1000 | 0.613 | 1.4e-04 |
| 1500 | 0.572 | 2.0e-04 |
| 2000 | 0.534 | 2.0e-04 |
| 2500 | 0.518 | 2.0e-04 |
| 3000 | 0.502 | 1.9e-04 |
| 3500 | 0.492 | 1.9e-04 |
| 4000 | 0.488 | 1.9e-04 |
| 4500 | 0.470 | 1.9e-04 |

Training Progress: 4,500 / 48,702 steps (9.2% complete, epoch 0.185/2.0)
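The 48,702-step total follows directly from the dataset size and the effective batch size quoted above:

```python
import math

samples = 389_608       # training samples
per_device_batch = 2
grad_accum = 8
epochs = 2

effective_batch = per_device_batch * grad_accum          # 16
steps_per_epoch = math.ceil(samples / effective_batch)   # 24,351
total_steps = steps_per_epoch * epochs
print(total_steps)  # → 48702
```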

Project Structure

```
KannadaGPT-0.6B/
├── adapter_config.json        # LoRA configuration
├── adapter_model.safetensors  # LoRA weights (38MB)
├── tokenizer.json             # Tokenizer
├── tokenizer_config.json      # Tokenizer config
├── vocab.json                 # Vocabulary
├── merges.txt                 # BPE merges
├── special_tokens_map.json    # Special tokens
├── added_tokens.json          # Added tokens
├── chat_template.jinja        # Chat template
├── KannadaGPT_Inference.ipynb # Colab inference notebook
├── KannadaGPT_Merge.ipynb     # Colab merge notebook
└── README.md                  # This file
```

Limitations

  • This is a LoRA adapter and requires the base model (Qwen3-0.6B) to run
  • Training is partial (checkpoint-4500 of ~48,700 total steps, ~9.2% complete)
  • Best suited for Kannada instruction-following tasks
  • May generate incorrect or nonsensical responses for complex queries

Future Work

  • Complete full 2-epoch training
  • Merge LoRA weights into base model for easier loading
  • Evaluate on Kannada benchmarks
  • Fine-tune larger models (1.8B, 7B)

License

Apache 2.0

Citation

```bibtex
@misc{kannadagpt-0.6b,
  author = {Mithun501},
  title = {KannadaGPT-0.6B: A Kannada Language Model},
  year = {2025},
  publisher = {GitHub},
  url = {https://github.com/mithun50/KannadaGPT-0.6B}
}
```

Author

Mithun501 - GitHub | HuggingFace
