---
license: apache-2.0
language:
- si
- en
base_model:
- google/gemma-3-4b-pt
pipeline_tag: text-generation
tags:
- instruction-following
- NLP
- question-answering
- reasoning
- academic
- maths
- LK
citations:
  - style: apa
    citation: |
      Please cite as: Mallawa, M. (2025). *Gamunu-Instruct-4B-Alpha: A Sinhala-centric bilingual instruction-tuned language model.* The Gamunu Project. Available at https://huggingface.co/manthilaffs/Gamunu-Instruct-4B-Alpha
  - style: bibtex
    citation: |
      @misc{mallawa_gamunu_instruct_4b_alpha_2025,
        author       = {Mallawa, Manthila},
        title        = {Gamunu-Instruct-4B-Alpha: A Sinhala-centric bilingual instruction-tuned language model},
        year         = {2025},
        publisher    = {The Gamunu Project},
        howpublished = {\url{https://huggingface.co/manthilaffs/Gamunu-Instruct-4B-Alpha}}
      }
---
## Gamunu-Instruct-4B-Alpha
**Sinhala Instruct LLM (Experimental Release)**
Gamunu-Instruct-4B-Alpha is the first experimental checkpoint of the Gamunu Project, a Sinhala-centric bilingual large language model. It was built through continued pre-training on Sinhala-rich academic and domain-specific data, then fine-tuned for instruction following, reasoning, and culturally grounded interaction.
> ⚠️ **Alpha Notice**
> This is an *experimental research model.*
> It demonstrates strong Sinhala fluency, reasoning, and broad NLP coverage — but is **single-turn only** and **not yet RLHF-aligned** for multi-turn dialogue.
> Use for **research, benchmarking, and controlled deployments — not production.**
<!-- *Developed by Manthila Mallawa* -->
### 🧪 Live Demo
You can try **Gamunu-Instruct-4B-Alpha** instantly and for free on Hugging Face Spaces 👇
🔗 [**Gamunu ZeroGPU Demo**](https://huggingface.co/spaces/manthilaffs/Gamunu-Inference)
<iframe
src="https://manthilaffs-gamunu-inference.hf.space"
frameborder="0"
width="850"
height="450"
></iframe>
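The Space can also be queried programmatically with `gradio_client`. The sketch below is hedged: the `/predict` endpoint name and its single-string signature are assumptions about how the demo is wired, not documented API, so inspect `client.view_api()` for the real interface.

```python
# Hedged sketch: querying the demo Space from Python.
# The endpoint name and argument list are assumptions; call
# client.view_api() to see what the Space actually exposes.
from gradio_client import Client

client = Client("manthilaffs/Gamunu-Inference")
client.view_api()  # prints the available endpoints and their parameters

# Assuming a single text-in/text-out endpoint named "/predict":
result = client.predict("හෙලෝ ගැමුණු!", api_name="/predict")
print(result)
```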
---
## ⚡ Capabilities
### 🔤 Language & Reasoning
- Fluent, idiomatic Sinhala generation
- Robust Sinhala ↔ English bilingual understanding
- Solid mathematical reasoning (percentages, word problems, arithmetic)
- Logical, step-by-step reasoning in QA tasks
- Structured, concise, and context-aware responses
### 🎭 Roleplay & Instruction
- Accurate adherence to single-turn instructions
- Expert persona simulation (teacher, scientist, analyst, advisor)
- Balanced, formal, and culturally aware tone
### 🧩 Supported NLP Tasks
- Text generation & completion
- Summarization (educational / contextual)
- Translation (Sinhala ↔ English)
- Paraphrasing and rewriting
- Question answering (factoid + reasoning)
- Instruction-based classification
- Role-specific expert responses
---
## 🚫 Limitations
- No conversational memory
- Occasional factual drift
- No RLHF or safety tuning yet
- Reasoning quality may degrade with ambiguous prompts
---
## 🎯 Intended Use
**Best for**
- Research & evaluation of Sinhala LLMs
- Educational assistants and analytical Q&A
- Cultural, marketing, and academic content generation
- Benchmarking instruction following in low-resource languages
**Not for**
- Medical, legal, or financial decision-making
- Production systems requiring factual reliability
- Processing sensitive or personal data
---
## 🧩 Training Details
### Phase 1 – Continued Pre-training (CPT)
Focused on expanding the base model's Sinhala linguistic coverage and deepening its contextual understanding of Sinhala text, using the Sinhala-rich academic and domain-specific corpus described above.
### Phase 2 – Supervised Fine-tuning (SFT)
Fine-tuned on a **custom Sinhala instruction dataset** emphasizing reasoning, roleplay, and assistant-style behavior.
| Setting | Value |
|----------|-------|
| **Framework** | Unsloth + Transformers |
| **Optimizer** | AdamW + cosine scheduler |
| **Hardware** | NVIDIA H100 (80 GB) |
| **Epochs** | 5 |
| **LoRA Rank / α / Dropout** | 128 / 128 / 0.05 |
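The full training script is not published; the following is a minimal sketch of what the SFT setup may have looked like, assuming Unsloth's `FastLanguageModel` API together with TRL's `SFTTrainer`. Only the framework, optimizer, scheduler, epoch count, and LoRA hyperparameters come from the table above; the `target_modules` list, learning rate, batch size, and dataset path are illustrative assumptions.

```python
# Hedged sketch of the Phase 2 SFT configuration. Values marked
# "assumed" are illustrative; the rest mirror the table above.
from unsloth import FastLanguageModel
from trl import SFTConfig, SFTTrainer
from datasets import load_dataset

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="google/gemma-3-4b-pt",  # base model (after CPT in the real pipeline)
    max_seq_length=2048,                # matches the model's context length
)

model = FastLanguageModel.get_peft_model(
    model,
    r=128,               # LoRA rank (from the table)
    lora_alpha=128,      # LoRA alpha (from the table)
    lora_dropout=0.05,   # LoRA dropout (from the table)
    target_modules=[     # assumed: the usual attention/MLP projections
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
)

# Assumed dataset path and format: one "text" field per example
dataset = load_dataset("json", data_files="sinhala_instructions.json")

trainer = SFTTrainer(
    model=model,
    train_dataset=dataset["train"],
    args=SFTConfig(
        dataset_text_field="text",     # assumed field name
        num_train_epochs=5,            # from the table
        optim="adamw_torch",           # AdamW (from the table)
        lr_scheduler_type="cosine",    # cosine scheduler (from the table)
        learning_rate=2e-4,            # assumed
        per_device_train_batch_size=8, # assumed
        bf16=True,                     # the H100 supports bfloat16
        output_dir="gamunu-sft",
    ),
)
trainer.train()
```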
---
## 📋 Model Summary
| Property | Description |
|-----------|-------------|
| **Stage** | Alpha (Experimental) |
| **Pipeline** | CPT → Custom SFT (LoRA) |
| **Base Model** | Google Gemma 3 4B |
| **Languages** | Sinhala (primary), English (secondary) |
| **Dialogue Type** | Single-turn instruction |
| **Context Length** | 2048 tokens |
---
## 🧩 Base Model License
This model was fine-tuned from **Google Gemma 3 4B**, distributed under the
[Gemma Terms of Use](https://ai.google.dev/gemma/terms).
All rights to Gemma 3 4B remain with **Google LLC**.
The **Gamunu-Instruct-4B-Alpha** weights, datasets, and training code are released by
**Manthila Mallawa (The Gamunu Project)** under the **Apache 2.0 License**.
Use of the base model remains subject to Google's policies.
---
## 💬 Example Usage
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

# Load model and tokenizer (repo ID as cited throughout this card)
model_name = "manthilaffs/Gamunu-Instruct-4B-Alpha"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16 if torch.cuda.is_available() else torch.float32,
    device_map="auto",
)

# Sinhala Alpaca-style prompt template. The preamble reads, roughly:
# "Below is an instruction that describes a task, paired with an input
# containing the relevant information. Provide a response that correctly
# completes the requested task."
# The instruction block says: "You are an AI assistant named Gamunu.
# Your task is to assist users by accurately following their instructions
# and answering the questions asked correctly."
# Section headers: උපදෙස = Instruction, ආදානය = Input, ප්‍රතිචාරය = Response.
sinhala_prompt = """පහත දැක්වෙන්නේ යම් කාර්යයක් පිළිබඳ විස්තර කරන උපදෙසක් සහ එයට අදාළ තොරතුරු ඇතුළත් ආදානයකි. ඉල්ලූ කාර්යය නිවැරදිව සම්පූර්ණ කළ හැකි ප්‍රතිචාරයක් සපයන්න.
### උපදෙස:
ඔබ ගැමුණු (Gamunu) නම් AI සහායකයායි.
ඔබේ කාර්යය වන්නේ පරිශීලකයන්ගේ උපදෙස් නිවැරදිව පිලිපැදීම හා අසා ඇති ප්‍රශ්නවලට නිවැරදිව පිළිතුරු සපයමින් ඔවුන්ට සහය වීමයි.
### ආදානය:
{}
### ප්‍රතිචාරය:
{}"""

# Example input ("Hello Gamunu! I'm Saman, how are you?")
user_query = "හෙලෝ ගැමුණු! මම සමන්, ඔයාට කොහොමද?"
prompt = sinhala_prompt.format(user_query, "")

# Tokenize, capping at the model's 2048-token context length
inputs = tokenizer(
    prompt, return_tensors="pt", truncation=True, max_length=2048
).to(model.device)

# Generate
with torch.inference_mode():
    outputs = model.generate(**inputs, max_new_tokens=250)

# Decode and keep only the text after the response header
text = tokenizer.decode(outputs[0], skip_special_tokens=True)
if "### ප්‍රතිචාරය:" in text:
    text = text.split("### ප්‍රතිචාරය:")[-1].strip()
print(text)
```
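The same template serves the other tasks listed under Capabilities; only the user query changes. Below is a hypothetical adaptation for Sinhala → English translation, reusing `sinhala_prompt`, `tokenizer`, and `model` from the snippet above; the instruction wording and example sentence are illustrative assumptions, not a prescribed format.

```python
# Hypothetical translation prompt. The query reads: "Translate the
# following sentence into English: 'I read the book.'"; the wording
# is an illustrative assumption, not a documented task format.
translation_query = "පහත වාක්‍යය ඉංග්‍රීසියට පරිවර්තනය කරන්න: මම පොත කියවමි."
prompt = sinhala_prompt.format(translation_query, "")

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.inference_mode():
    outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```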
---
## 🧾 How to Cite
If you use **Gamunu-Instruct-4B-Alpha** in your work, please cite as follows:
**APA**
> Mallawa, M. (2025). *Gamunu-Instruct-4B-Alpha: A Sinhala-centric bilingual instruction-tuned language model.* The Gamunu Project. [https://huggingface.co/manthilaffs/Gamunu-Instruct-4B-Alpha](https://huggingface.co/manthilaffs/Gamunu-Instruct-4B-Alpha)
**BibTeX**
```bibtex
@misc{mallawa_gamunu_instruct_4b_alpha_2025,
  author       = {Mallawa, Manthila},
  title        = {Gamunu-Instruct-4B-Alpha: A Sinhala-centric bilingual instruction-tuned language model},
  year         = {2025},
  publisher    = {The Gamunu Project},
  howpublished = {\url{https://huggingface.co/manthilaffs/Gamunu-Instruct-4B-Alpha}}
}
```