# Model Card for SinLlama News Continued Adapter

This model is a parameter-efficient fine-tuned (PEFT) LoRA adapter designed to classify Sinhala news articles into five distinct categories (Political, Business, Technology, Sports, Entertainment). It transforms a generative base model into a specialized feature extractor for the Sinhala language.

## Model Details

### Model Description

This adapter was trained by applying Low-Rank Adaptation (LoRA) to the meta-llama/Meta-Llama-3-8B base model, building upon the conversational capabilities of the polyglots/llama-3-8b-si-SinLlama-instruct-67017 adapter. The model uses an extended tokenizer (139,336 tokens) to capture Sinhala morphology effectively, rather than relying on the base Llama-3 vocabulary, which fragments Sinhala script into many short byte-level pieces.
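
To see the effect of the extended vocabulary, you can compare token counts between the base Llama-3 tokenizer and the one bundled with this adapter. A minimal sketch (the exact counts will vary; it assumes the extended tokenizer loads from this adapter's repository, as in the getting-started example below):

```python
from transformers import AutoTokenizer

# Base Llama-3 tokenizer vs. the extended SinLlama tokenizer.
base_tok = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B")
ext_tok = AutoTokenizer.from_pretrained("oshada99/sinllama-news-continued-adapter")

text = "ශ්‍රී ලංකා කණ්ඩායම අද තරගයේ ජයග්‍රහණය ලබා ගත්තා."

# The base vocabulary falls back to byte-level fragments for Sinhala script,
# while the extended vocabulary covers it with fewer, larger tokens.
print(len(ext_tok))                      # 139336
print(len(base_tok(text)["input_ids"]))  # many short fragments
print(len(ext_tok(text)["input_ids"]))   # noticeably fewer tokens
```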

Training was heavily optimized using the Unsloth AI framework, allowing for efficient 8-bit precision training on consumer-grade hardware.
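
The exact training hyperparameters are not published on this card; the following is a minimal sketch of a typical Unsloth LoRA setup for this kind of run. The rank, alpha, and target modules are illustrative assumptions, and 8-bit loading support depends on your Unsloth version:

```python
from unsloth import FastLanguageModel

# Illustrative only: these hyperparameters are assumptions, not the
# values actually used to train this adapter.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="meta-llama/Meta-Llama-3-8B",
    max_seq_length=2048,
    load_in_8bit=True,  # assumption: requires an Unsloth version with 8-bit loading
)

model = FastLanguageModel.get_peft_model(
    model,
    r=16,           # assumed LoRA rank
    lora_alpha=16,  # assumed scaling factor
    lora_dropout=0,
    bias="none",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)
```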

- **Developed by:** Oshada Dilshan (University of Kelaniya)
- **Model type:** Transformer decoder with LoRA adapter
- **Language(s) (NLP):** Sinhala (si), English (en)
- **License:** Llama 3 Community License
- **Finetuned from model:** meta-llama/Meta-Llama-3-8B

## Uses

### Direct Use

The primary use case is extracting semantic features from formal Sinhala text and classifying Sinhala news documents into predefined topics (0: Political, 1: Business, 2: Technology, 3: Sports, 4: Entertainment).
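
A small helper for working with these labels programmatically (the id-to-name mapping is taken from this card; the helper itself is illustrative):

```python
# Category ids as documented on this card.
ID2LABEL = {
    0: "Political",
    1: "Business",
    2: "Technology",
    3: "Sports",
    4: "Entertainment",
}

def label_name(class_id: int) -> str:
    """Map a predicted class id to its category name."""
    return ID2LABEL.get(class_id, "Unknown")
```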

### Out-of-Scope Use

This model is explicitly tuned for classification and feature extraction using strict instructional prompts. It is not intended to be used as an open-ended conversational chatbot. Furthermore, because the pre-training data excluded "Singlish" (Romanized Sinhala), the model will perform poorly on code-mixed social media text.

## Bias, Risks, and Limitations

Consultation with the original developers of the underlying SinLlama architecture revealed a strong training bias toward formal "Book Sinhala" sourced from news and literary corpora. Consequently, the model's representations of colloquial or spoken Sinhala dialects are limited and may lead to inaccurate feature extraction in informal contexts.

## How to Get Started with the Model

**CRITICAL:** Because this model uses the extended SinLlama vocabulary, you must resize the base model's token embeddings to 139,336 before applying this adapter; otherwise loading fails with a tensor shape mismatch.

```python
from unsloth import FastLanguageModel
from peft import PeftModel
from transformers import AutoTokenizer

base_model_name = "meta-llama/Meta-Llama-3-8B"
adapter_name = "oshada99/sinllama-news-continued-adapter"
max_seq_length = 2048

# 1. Load the base model
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name=base_model_name,
    max_seq_length=max_seq_length,
    dtype=None,
    load_in_4bit=True,
)

# 2. Resize the vocabulary to the extended SinLlama size to prevent a
#    tensor shape mismatch (the "Vocabulary Wall")
model.resize_token_embeddings(139336)

# 3. Load the fine-tuned adapter and its extended tokenizer
model = PeftModel.from_pretrained(model, adapter_name)
tokenizer = AutoTokenizer.from_pretrained(adapter_name)

# 4. Inference
prompt = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
Classify this Sinhala news article into one category only.

### Input:
ශ්‍රී ලංකා කණ්ඩායම අද තරගයේ ජයග්‍රහණය ලබා ගත්තා.

### Response:
"""

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
# Expected output: 3 (Sports)
```
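
In practice the model may emit the label id followed by extra tokens, so it is safer to parse the first digit after the response marker. A small, illustrative helper (not part of the released adapter):

```python
import re

def parse_label(generated_text: str) -> int | None:
    """Return the first class id (0-4) after the '### Response:' marker,
    or None if the model produced no recognizable label."""
    response = generated_text.split("### Response:")[-1]
    match = re.search(r"[0-4]", response)
    return int(match.group()) if match else None

decoded = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(parse_label(decoded))  # 3 -> Sports
```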