Jais-1.3B LoRA: Gulf Arabic Sentiment Analysis
A LoRA adapter fine-tuned on Jais-1.3B for 3-class sentiment analysis (Negative, Neutral, Positive) in Gulf Arabic code-switched text (Arabic mixed with English).
Performance
| Metric | Score |
|---|---|
| Accuracy | 89.20% |
| F1 (macro) | 86.06% |
| Precision | 86.27% |
| Recall | 85.91% |
| DSFS (cultural) | 72.5% |
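The macro-averaged figures in the table give each of the three classes equal weight, regardless of how many samples it has. A minimal sketch of that computation in plain Python follows. The function name `macro_scores` and the toy labels are illustrative assumptions, not the evaluation code used for this model.

```python
from collections import Counter


def macro_scores(y_true, y_pred, labels=(0, 1, 2)):
    """Return (precision, recall, f1), each averaged over classes with equal weight."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # class p was predicted, but wrongly
            fn[t] += 1  # class t was the truth, but was missed

    precisions, recalls, f1s = [], [], []
    for c in labels:
        prec = tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0
        rec = tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)

    n = len(labels)
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


# Toy data only: 0 = Negative, 1 = Neutral, 2 = Positive.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
precision, recall, f1 = macro_scores(y_true, y_pred)
print(f"precision={precision:.4f} recall={recall:.4f} f1={f1:.4f}")
```

Macro F1 here is the mean of the per-class F1 scores, so it does not have to equal the harmonic mean of macro precision and macro recall.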
The Dialectal Sentiment Fidelity Score (DSFS) evaluates cultural understanding across Gulf expressions (70%), code-switching patterns (70%), and culturally ambiguous/sarcastic phrases (80%).
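Weighted equally, the three category scores average to about 73.3%, so the reported 72.5% presumably uses a different weighting, for example by the number of test phrases per category. The sketch below works through both averages. The per-category counts are hypothetical, chosen only to reproduce the stated percentages, because the card does not give the size or composition of the DSFS test set.

```python
# (correct, total) per category. HYPOTHETICAL counts: the card only reports percentages.
dsfs_counts = {
    "gulf_expressions": (14, 20),     # 70%
    "code_switching": (7, 10),        # 70%
    "ambiguous_sarcastic": (8, 10),   # 80%
}


def category_scores(counts):
    """Accuracy per category, in percent."""
    return {name: 100.0 * correct / total for name, (correct, total) in counts.items()}


def macro_average(counts):
    """Every category counts equally."""
    scores = category_scores(counts)
    return sum(scores.values()) / len(scores)


def micro_average(counts):
    """Every test phrase counts equally."""
    correct = sum(c for c, _ in counts.values())
    total = sum(t for _, t in counts.values())
    return 100.0 * correct / total


print(category_scores(dsfs_counts))
print(f"macro: {macro_average(dsfs_counts):.1f}%")
print(f"micro: {micro_average(dsfs_counts):.1f}%")
```

Under these assumed counts the per-phrase average comes out at 72.5%, which matches the table, but other splits or an explicit weighting scheme could give the same number.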
Usage
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification
from peft import PeftModel

base_model = "inceptionai/jais-family-1p3b"
adapter = "ziyanhashim/jais-lora-gulf-arabic-sentiment"

# Replace "hf_xxx" with a Hugging Face token that has access to the gated base model.
tokenizer = AutoTokenizer.from_pretrained(base_model, trust_remote_code=True, token="hf_xxx")
base = AutoModelForSequenceClassification.from_pretrained(
    base_model, num_labels=3, torch_dtype=torch.float32,
    trust_remote_code=True, ignore_mismatched_sizes=True, token="hf_xxx"
)
# Attach the LoRA adapter and switch to inference mode.
model = PeftModel.from_pretrained(base, adapter).eval()

# Gulf Arabic mixed with English (code-switched input).
text = "هالمطعم وايد حلو the food is amazing"
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=128)
with torch.no_grad():
    probs = torch.softmax(model(**inputs).logits, dim=-1)[0]

labels = {0: "Negative", 1: "Neutral", 2: "Positive"}
print(f"{labels[probs.argmax().item()]}: {probs.max().item():.2%}")
Note: The base model inceptionai/jais-family-1p3b is gated, so you need a Hugging Face token with access.
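Because the base model is gated, the token has to reach `from_pretrained` somehow, and a token hardcoded in a script is easy to leak. One option is to read it from an environment variable. The sketch below assumes the variable is called `HF_TOKEN`; the helper name `resolve_hf_token` is illustrative and not part of any library.

```python
import os


def resolve_hf_token(env=None, var_name="HF_TOKEN"):
    """Return the access token from the environment, or raise if it is missing.

    `env` defaults to os.environ; passing a plain dict makes the helper easy to test.
    The variable name is an assumption and can be changed via `var_name`.
    """
    env = os.environ if env is None else env
    token = env.get(var_name, "").strip()
    if not token:
        raise RuntimeError(
            f"Set {var_name} to a Hugging Face token that has access to the gated base model."
        )
    return token


# Intended use (not executed here, since it needs a real token):
#   hf_token = resolve_hf_token()
#   tokenizer = AutoTokenizer.from_pretrained(base_model, trust_remote_code=True, token=hf_token)
```

Failing early with a clear message tends to be easier to debug than the authorization error raised partway through a model download.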
Training Details
| Parameter | Value |
|---|---|
| Base model | inceptionai/jais-family-1p3b (1.3B params) |
| LoRA rank (r) | 16 |
| LoRA alpha | 32 |
| LoRA dropout | 0.1 |
| Target modules | c_attn, c_proj, c_fc, c_fc2 |
| Trainable params | 13.4M (~0.96% of total) |
| Epochs | 5 |
| Batch size | 16 |
| Learning rate | 2e-5 |
| Training time | ~3.2 hours |
| Peak GPU memory | 15.45 GB |
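For a linear layer with input size d_in and output size d_out, a LoRA adapter of rank r adds two small matrices with r * (d_in + d_out) weights in total, and their product is scaled by alpha / r. The sketch below applies that arithmetic to the settings in the table. The layer shapes (hidden size 2048, feed-forward size 5464, 24 blocks, fused query/key/value projection in `c_attn`, and a `c_proj` in both the attention and feed-forward parts) are assumptions for illustration and are not stated in this card; the real values should be read from the base model's config.

```python
def lora_params(rank, d_in, d_out):
    """Weights added by one LoRA adapter: A is (rank x d_in), B is (d_out x rank)."""
    return rank * (d_in + d_out)


RANK, ALPHA = 16, 32  # from the table above

# ASSUMED shapes, not taken from the card.
HIDDEN, FFN, N_BLOCKS = 2048, 5464, 24
adapted_layers = {
    "attn.c_attn": (HIDDEN, 3 * HIDDEN),
    "attn.c_proj": (HIDDEN, HIDDEN),
    "mlp.c_fc": (HIDDEN, FFN),
    "mlp.c_fc2": (HIDDEN, FFN),
    "mlp.c_proj": (FFN, HIDDEN),
}

per_block = sum(lora_params(RANK, d_in, d_out) for d_in, d_out in adapted_layers.values())
total = per_block * N_BLOCKS
scaling = ALPHA / RANK

# Total model size implied by the table's own numbers: 13.4M is ~0.96% of it.
implied_total = 13.4e6 / 0.0096

print(f"LoRA weights per block: {per_block:,}")
print(f"LoRA weights overall:   {total:,}")
print(f"LoRA scaling factor:    {scaling}")
print(f"Implied model size:     {implied_total / 1e9:.2f}B")
```

With these assumed shapes the adapter total lands near the reported 13.4M, which is suggestive but does not confirm the shapes. A trained classification head would add its own weights on top.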
Dataset
~166K samples from multiple Arabic sentiment sources:
- Arabic Sentiment Twitter Corpus (~58K tweets)
- LABR: Large Arabic Book Reviews (~63K)
- HARD: Hotel Arabic Reviews Dataset
- 60 synthetic Gulf Arabic code-switched examples
Labels: Negative (0), Neutral (1), Positive (2)
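Review corpora such as LABR are annotated with star ratings rather than sentiment classes, so ratings have to be mapped onto the three ids above before training. The card does not describe that mapping. The sketch below shows one common convention (1-2 stars negative, 3 neutral, 4-5 positive) purely as an assumption; `rating_to_label` is an illustrative helper.

```python
LABEL2ID = {"Negative": 0, "Neutral": 1, "Positive": 2}
ID2LABEL = {idx: name for name, idx in LABEL2ID.items()}


def rating_to_label(stars):
    """Map a 1-5 star rating to a class id. ASSUMED thresholds, not from the card."""
    if not 1 <= stars <= 5:
        raise ValueError(f"expected a rating from 1 to 5, got {stars!r}")
    if stars <= 2:
        return LABEL2ID["Negative"]
    if stars == 3:
        return LABEL2ID["Neutral"]
    return LABEL2ID["Positive"]


for stars in range(1, 6):
    print(stars, "->", ID2LABEL[rating_to_label(stars)])
```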
Developed By
Ziyan Hashim & Hani Moustafa, Group big_boyz, CSCI316 Big Data Mining & Applications, University of Wollongong in Dubai
Framework Versions
- PEFT 0.18.1
- Transformers 4.44.0
- PyTorch 2.0+