---
license: apache-2.0
language:
- si
- en
base_model:
- google/gemma-3-4b-pt
pipeline_tag: text-generation
tags:
- instruction-following
- NLP
- question-answering
- reasoning
- academic
- maths
- LK
citations:
  - style: apa
    citation: |
      Please cite as: Mallawa, M. (2025). *Gamunu-Instruct-4B-Alpha: A Sinhala-centric bilingual instruction-tuned language model.* The Gamunu Project. Available at https://huggingface.co/manthilaffs/Gamunu-Instruct-4B-Alpha
  - style: bibtex
    citation: |
      @misc{mallawa_gamunu_instruct_4b_alpha_2025,
        author       = {Mallawa, Manthila},
        title        = {Gamunu-Instruct-4B-Alpha: A Sinhala-centric bilingual instruction-tuned language model},
        year         = {2025},
        publisher    = {The Gamunu Project},
        howpublished = {\url{https://huggingface.co/manthilaffs/Gamunu-Instruct-4B-Alpha}}
      }
---
## Gamunu-Instruct-4B-Alpha
**Sinhala Instruct LLM (Experimental Release)**
Gamunu-Instruct-4B-Alpha is the first experimental checkpoint of the Gamunu Project, a Sinhala-centric bilingual large language model. It was built through continued pre-training on Sinhala-rich academic and domain-specific data, then fine-tuned for instruction following, reasoning, and culturally grounded interaction.
> ⚠️ **Alpha Notice**
> This is an *experimental research model.*
> It demonstrates strong Sinhala fluency, reasoning, and broad NLP coverage — but is **single-turn only** and **not yet RLHF-aligned** for multi-turn dialogue.
> Use for **research, benchmarking, and controlled deployments — not production.**
<!-- *Developed by Manthila Mallawa* -->
### 🧪 Live Demo
You can try **Gamunu-Instruct-4B-Alpha** instantly and for free on Hugging Face Spaces 👇
🔗 [**Gamunu ZeroGPU Demo**](https://huggingface.co/spaces/manthilaffs/Gamunu-Inference)
<iframe
src="https://manthilaffs-gamunu-inference.hf.space"
frameborder="0"
width="850"
height="450"
></iframe>
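The Space can also be queried programmatically with `gradio_client`. The sketch below is hedged: the `/predict` endpoint name and its single-string signature are assumptions about how the demo is wired, not documented API, so inspect `client.view_api()` for the real interface.

```python
# Hedged sketch: querying the demo Space from Python.
# The endpoint name and argument list are assumptions; call
# client.view_api() to see what the Space actually exposes.
from gradio_client import Client

client = Client("manthilaffs/Gamunu-Inference")
client.view_api()  # prints the available endpoints and their parameters

# Assuming a single text-in/text-out endpoint named "/predict":
result = client.predict("හෙලෝ ගැමුණු!", api_name="/predict")
print(result)
```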
---
## ⚡ Capabilities
### 🔤 Language & Reasoning
- Fluent, idiomatic Sinhala generation
- Robust Sinhala ↔ English bilingual understanding
- Solid mathematical reasoning (percentages, word problems, arithmetic)
- Logical, step-by-step reasoning in QA tasks
- Structured, concise, and context-aware responses
### 🎭 Roleplay & Instruction
- Accurate adherence to single-turn instructions
- Expert persona simulation (teacher, scientist, analyst, advisor)
- Balanced, formal, and culturally aware tone
### 🧩 Supported NLP Tasks
- Text generation & completion
- Summarization (educational / contextual)
- Translation (Sinhala ↔ English)
- Paraphrasing and rewriting
- Question answering (factoid + reasoning)
- Instruction-based classification
- Role-specific expert responses
---
## 🚫 Limitations
- No conversational memory
- Occasional factual drift
- No RLHF or safety tuning yet
- Reasoning quality may degrade with ambiguous prompts
---
## 🎯 Intended Use
**Best for**
- Research & evaluation of Sinhala LLMs
- Educational assistants and analytical Q&A
- Cultural, marketing, and academic content generation
- Benchmarking instruction following in low-resource languages
**Not for**
- Medical, legal, or financial decision-making
- Production systems requiring factual reliability
- Processing sensitive or personal data
---
## 🧩 Training Details
### Phase 1 – Continued Pre-training (CPT)
Focused on expanding the base model's Sinhala linguistic coverage and deepening its contextual understanding of Sinhala text, using the Sinhala-rich academic and domain-specific corpus described above.
### Phase 2 – Supervised Fine-tuning (SFT)
Fine-tuned on a **custom Sinhala instruction dataset** emphasizing reasoning, roleplay, and assistant-style behavior.
| Setting | Value |
|----------|-------|
| **Framework** | Unsloth + Transformers |
| **Optimizer** | AdamW + cosine scheduler |
| **Hardware** | NVIDIA H100 (80 GB) |
| **Epochs** | 5 |
| **LoRA Rank / α / Dropout** | 128 / 128 / 0.05 |
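The full training script is not published; the following is a minimal sketch of what the SFT setup may have looked like, assuming Unsloth's `FastLanguageModel` API together with TRL's `SFTTrainer`. Only the framework, optimizer, scheduler, epoch count, and LoRA hyperparameters come from the table above; the `target_modules` list, learning rate, batch size, and dataset path are illustrative assumptions.

```python
# Hedged sketch of the Phase 2 SFT configuration. Values marked
# "assumed" are illustrative; the rest mirror the table above.
from unsloth import FastLanguageModel
from trl import SFTConfig, SFTTrainer
from datasets import load_dataset

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="google/gemma-3-4b-pt",  # base model (after CPT in the real pipeline)
    max_seq_length=2048,                # matches the model's context length
)

model = FastLanguageModel.get_peft_model(
    model,
    r=128,               # LoRA rank (from the table)
    lora_alpha=128,      # LoRA alpha (from the table)
    lora_dropout=0.05,   # LoRA dropout (from the table)
    target_modules=[     # assumed: the usual attention/MLP projections
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
)

# Assumed dataset path and format: one "text" field per example
dataset = load_dataset("json", data_files="sinhala_instructions.json")

trainer = SFTTrainer(
    model=model,
    train_dataset=dataset["train"],
    args=SFTConfig(
        dataset_text_field="text",     # assumed field name
        num_train_epochs=5,            # from the table
        optim="adamw_torch",           # AdamW (from the table)
        lr_scheduler_type="cosine",    # cosine scheduler (from the table)
        learning_rate=2e-4,            # assumed
        per_device_train_batch_size=8, # assumed
        bf16=True,                     # the H100 supports bfloat16
        output_dir="gamunu-sft",
    ),
)
trainer.train()
```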
---
## 📋 Model Summary
| Property | Description |
|-----------|-------------|
| **Stage** | Alpha (Experimental) |
| **Pipeline** | CPT → Custom SFT (LoRA) |
| **Base Model** | Google Gemma 3 4B |
| **Languages** | Sinhala (primary), English (secondary) |
| **Dialogue Type** | Single-turn instruction |
| **Context Length** | 2048 tokens |
---
## 🧩 Base Model License
This model was fine-tuned from **Google Gemma 3 4B**, distributed under the
[Gemma Terms of Use](https://ai.google.dev/gemma/terms).
All rights to Gemma 3 4B remain with **Google LLC**.
The **Gamunu-Instruct-4B-Alpha** weights, datasets, and training code are released by
**Manthila Mallawa (The Gamunu Project)** under the **Apache 2.0 License**.
Use of the base model remains subject to Google's policies.
---
## 💬 Example Usage
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

# Load model and tokenizer (repo ID as cited throughout this card)
model_name = "manthilaffs/Gamunu-Instruct-4B-Alpha"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16 if torch.cuda.is_available() else torch.float32,
    device_map="auto",
)

# Sinhala Alpaca-style prompt template. The preamble reads, roughly:
# "Below is an instruction that describes a task, paired with an input
# containing the relevant information. Provide a response that correctly
# completes the requested task."
# The instruction block says: "You are an AI assistant named Gamunu.
# Your task is to assist users by accurately following their instructions
# and answering the questions asked correctly."
# Section headers: උපදෙස = Instruction, ආදානය = Input, ප්‍රතිචාරය = Response.
sinhala_prompt = """පහත දැක්වෙන්නේ යම් කාර්යයක් පිළිබඳ විස්තර කරන උපදෙසක් සහ එයට අදාළ තොරතුරු ඇතුළත් ආදානයකි. ඉල්ලූ කාර්යය නිවැරදිව සම්පූර්ණ කළ හැකි ප්‍රතිචාරයක් සපයන්න.
### උපදෙස:
ඔබ ගැමුණු (Gamunu) නම් AI සහායකයායි.
ඔබේ කාර්යය වන්නේ පරිශීලකයන්ගේ උපදෙස් නිවැරදිව පිලිපැදීම හා අසා ඇති ප්‍රශ්නවලට නිවැරදිව පිළිතුරු සපයමින් ඔවුන්ට සහය වීමයි.
### ආදානය:
{}
### ප්‍රතිචාරය:
{}"""

# Example input ("Hello Gamunu! I'm Saman, how are you?")
user_query = "හෙලෝ ගැමුණු! මම සමන්, ඔයාට කොහොමද?"
prompt = sinhala_prompt.format(user_query, "")

# Tokenize, capping at the model's 2048-token context length
inputs = tokenizer(
    prompt, return_tensors="pt", truncation=True, max_length=2048
).to(model.device)

# Generate
with torch.inference_mode():
    outputs = model.generate(**inputs, max_new_tokens=250)

# Decode and keep only the text after the response header
text = tokenizer.decode(outputs[0], skip_special_tokens=True)
if "### ප්‍රතිචාරය:" in text:
    text = text.split("### ප්‍රතිචාරය:")[-1].strip()
print(text)
```
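The same template serves the other tasks listed under Capabilities; only the user query changes. Below is a hypothetical adaptation for Sinhala → English translation, reusing `sinhala_prompt`, `tokenizer`, and `model` from the snippet above; the instruction wording and example sentence are illustrative assumptions, not a prescribed format.

```python
# Hypothetical translation prompt. The query reads: "Translate the
# following sentence into English: 'I read the book.'"; the wording
# is an illustrative assumption, not a documented task format.
translation_query = "පහත වාක්‍යය ඉංග්‍රීසියට පරිවර්තනය කරන්න: මම පොත කියවමි."
prompt = sinhala_prompt.format(translation_query, "")

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.inference_mode():
    outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```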
---
## 🧾 How to Cite
If you use **Gamunu-Instruct-4B-Alpha** in your work, please cite as follows:
**APA**
> Mallawa, M. (2025). *Gamunu-Instruct-4B-Alpha: A Sinhala-centric bilingual instruction-tuned language model.* The Gamunu Project. [https://huggingface.co/manthilaffs/Gamunu-Instruct-4B-Alpha](https://huggingface.co/manthilaffs/Gamunu-Instruct-4B-Alpha)
**BibTeX**
```bibtex
@misc{mallawa_gamunu_instruct_4b_alpha_2025,
  author       = {Mallawa, Manthila},
  title        = {Gamunu-Instruct-4B-Alpha: A Sinhala-centric bilingual instruction-tuned language model},
  year         = {2025},
  publisher    = {The Gamunu Project},
  howpublished = {\url{https://huggingface.co/manthilaffs/Gamunu-Instruct-4B-Alpha}}
}
```