Model Description

An instruction-tuned Sinhala language model, fine-tuned from Minuri/sinhala-llama-1b-corpus-random (LLaMA 3.2 1B continually pretrained on a randomly sampled Sinhala corpus). Part of a diversity-driven Sinhala language model adaptation study.

SFT model variants in this series:

  • Minuri/sinhala-llama-1b-sft-baseline - SFT on base LLaMA 3.2 1B (no continual pretraining, CPT)
  • Minuri/sinhala-llama-1b-sft-news - SFT on sinhala-llama-1b-corpus-news (news-only CPT)
  • Minuri/sinhala-llama-1b-sft-random - SFT on sinhala-llama-1b-corpus-random (random-sample CPT; this repo)
  • Minuri/sinhala-llama-1b-sft-diverse - SFT on sinhala-llama-1b-corpus-diverse (diversity-optimised CPT)

This model is the result of supervised fine-tuning (SFT) of Minuri/sinhala-llama-1b-corpus-random on Minuri/sinhala-sft-dataset (~213K Sinhala instruction pairs). The base model, Minuri/sinhala-llama-1b-corpus-random, was continually pretrained on a randomly sampled Sinhala corpus prior to SFT.

Training Data

  • Minuri/sinhala-sft-dataset - ~213K Sinhala instruction pairs merged from three source datasets

Usage

from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the tokenizer and the SFT model weights from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained("Minuri/sinhala-llama-1b-sft-random")
model = AutoModelForCausalLM.from_pretrained("Minuri/sinhala-llama-1b-sft-random")
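
The loading snippet stops before generation, so the following is a minimal sketch of how a reply might be produced. The prompt template and the helper names (build_prompt, load_model, generate_reply) are hypothetical: this card does not document the prompt format used during SFT, so the template should be swapped for the real one if it differs. Greedy decoding and the 128-token limit are likewise arbitrary choices for illustration.

```python
MODEL_ID = "Minuri/sinhala-llama-1b-sft-random"

# ASSUMPTION: hypothetical prompt template. The model card does not state the
# format used during SFT; replace this with the actual one if it differs.
PROMPT_TEMPLATE = "### Instruction:\n{instruction}\n\n### Response:\n"


def build_prompt(instruction: str) -> str:
    """Wrap a Sinhala instruction in the (assumed) prompt template."""
    return PROMPT_TEMPLATE.format(instruction=instruction.strip())


def load_model(model_id: str = MODEL_ID):
    """Load the tokenizer and model from the Hugging Face Hub."""
    # Imported here so build_prompt can be used without the heavy dependency.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)
    model.eval()  # disable dropout for inference
    return tokenizer, model


def generate_reply(tokenizer, model, instruction: str, max_new_tokens: int = 128) -> str:
    """Generate a reply with greedy decoding and return only the new text."""
    inputs = tokenizer(build_prompt(instruction), return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)
    # The output contains the prompt tokens first; slice them off before decoding.
    prompt_len = inputs["input_ids"].shape[1]
    return tokenizer.decode(output_ids[0][prompt_len:], skip_special_tokens=True)
```

Calling `tokenizer, model = load_model()` and then `generate_reply(tokenizer, model, "<Sinhala instruction>")` would download the weights on first use and return the decoded continuation.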

Intended Uses

  • Sinhala instruction following
  • Random sampling CPT+SFT ablation baseline
  • Low-resource NLP research

Limitations

  • 1B parameter model with limited reasoning capability

Related Repositories

  • Minuri/sinhala-llama-1b-corpus-random - Base model
  • Minuri/sinhala-sft-dataset - SFT training dataset (~213K pairs)
  • Minuri/sinhala-llama-3.2-1b-tokenizer - Extended Sinhala tokenizer
  • Minuri/sinhala-llama-1b-sft-baseline - SFT baseline
  • Minuri/sinhala-llama-1b-sft-news - SFT on sinhala-llama-1b-corpus-news model
  • Minuri/sinhala-llama-1b-sft-diverse - SFT on sinhala-llama-1b-corpus-diverse model

License

This model is derived from meta-llama/Llama-3.2-1B and is subject to the LLaMA 3.2 Community License.

Weights: Safetensors format, 1B parameters, BF16 tensor type.