# Qwen3-8B Sindhi CPT (Continued Pre-Training)
This is a LoRA adapter for Qwen3-8B, continued pre-trained on ~164M tokens of Sindhi text.
## Model Details
| Property | Value |
|---|---|
| Base Model | unsloth/Qwen3-8B-bnb-4bit |
| Training Type | Continued Pre-Training (CPT) |
| Training Tokens | ~164M Sindhi tokens |
| LoRA Rank | 32 |
| LoRA Alpha | 64 |
| Sequence Length | 2048 |
| Quantization | 4-bit (bnb) |
| Framework | Unsloth + HuggingFace PEFT |
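To give a sense of scale: a rank-32 adapter touches only a tiny fraction of an 8B-parameter model, since each adapted linear layer of shape `(d_in, d_out)` adds just `r * (d_in + d_out)` parameters (the low-rank A and B matrices). The sketch below uses illustrative projection shapes and layer count — they are placeholders, not the actual Qwen3-8B config or the adapter's real target modules:

```python
# Rough size of a LoRA adapter: each adapted linear layer of shape
# (d_in, d_out) adds r * (d_in + d_out) parameters (matrices A and B).
# The shapes and layer count below are illustrative placeholders,
# not the actual Qwen3-8B configuration.

def lora_params(r: int, layer_shapes: list[tuple[int, int]], n_layers: int) -> int:
    """Total adapter parameters across `n_layers` transformer blocks."""
    per_block = sum(r * (d_in + d_out) for d_in, d_out in layer_shapes)
    return per_block * n_layers

# Hypothetical per-block projection shapes (d_in, d_out):
shapes = [(4096, 4096), (4096, 1024), (4096, 1024), (4096, 4096),  # attention
          (4096, 12288), (4096, 12288), (12288, 4096)]             # MLP
total = lora_params(r=32, layer_shapes=shapes, n_layers=36)
print(f"~{total / 1e6:.0f}M trainable parameters vs ~8B frozen")
```

Under these assumed shapes the adapter stays under ~100M trainable parameters, which is why the 4-bit base plus LoRA fits on a single A100 40GB.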
## Usage

### Option 1: Load with Unsloth (recommended, faster)

```python
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "hellosindh/qwen3-sindhi-cpt",
    load_in_4bit = True,
    max_seq_length = 2048,
)

# Enable fast inference
FastLanguageModel.for_inference(model)
```
### Option 2: Load base + adapter separately with PEFT

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel
import torch

# Load the base model in 4-bit. Passing load_in_4bit directly to
# from_pretrained is deprecated; use a BitsAndBytesConfig instead.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)
base_model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3-8B",
    quantization_config=bnb_config,
    device_map="auto",
)

# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained("hellosindh/qwen3-sindhi-cpt")

# Apply the Sindhi adapter on top
model = PeftModel.from_pretrained(base_model, "hellosindh/qwen3-sindhi-cpt")
```
### Generate Sindhi text

```python
inputs = tokenizer("سنڌ جي ماڻهو", return_tensors="pt").to("cuda")

outputs = model.generate(
    **inputs,
    max_new_tokens = 200,
    temperature = 0.8,
    do_sample = True,
    repetition_penalty = 1.1,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
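The `repetition_penalty=1.1` above mildly discounts tokens that have already been generated. As a minimal illustration of the rule transformers applies in its `RepetitionPenaltyLogitsProcessor` (following the CTRL paper), a previously seen token's logit is divided by the penalty when positive and multiplied by it when negative — the function and values below are a self-contained toy, not model output:

```python
# Toy sketch of the repetition-penalty rule: logits of already-generated
# tokens are pushed down, making immediate repetition less likely.

def apply_repetition_penalty(logits: list[float], seen_ids: set[int],
                             penalty: float) -> list[float]:
    out = list(logits)
    for tok in seen_ids:
        if out[tok] > 0:
            out[tok] /= penalty   # shrink positive scores
        else:
            out[tok] *= penalty   # push negative scores further down
    return out

logits = [2.0, -1.0, 0.5]
print(apply_repetition_penalty(logits, seen_ids={0, 1}, penalty=1.1))
```

A penalty of 1.1 is a gentle nudge; values much above ~1.3 tend to degrade fluency.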
## Training Details
- Dataset: ~164M Sindhi tokens from multiple sources
- Tokenizer: Qwen3 original tokenizer (no modifications)
- Hardware: NVIDIA A100 40GB
- Framework: Unsloth for efficient training
- Optimizer: AdamW 8-bit
- Learning Rate: 5e-5 with cosine scheduler
- Final Loss: ~1.20
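The schedule above can be sketched as plain cosine decay from the stated 5e-5 peak down to zero; any warmup phase or floor learning rate is not stated on this card, so none is modeled here:

```python
import math

# Cosine decay from the stated peak LR of 5e-5 to 0 over training.
# (Warmup and a minimum LR are not documented on the card, so omitted.)

def cosine_lr(step: int, total_steps: int, peak_lr: float = 5e-5) -> float:
    progress = step / total_steps
    return 0.5 * peak_lr * (1.0 + math.cos(math.pi * progress))

print(cosine_lr(0, 1000))      # peak LR at the start
print(cosine_lr(500, 1000))    # half the peak at the midpoint
print(cosine_lr(1000, 1000))   # decays to ~0 at the end
```

For reference, the reported final loss of ~1.20 corresponds to a per-token perplexity of exp(1.20) ≈ 3.3 on the training distribution.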
## Intended Use
- Sindhi text generation
- Synthetic data generation for low-resource Sindhi NLP
- Base for further fine-tuning on Sindhi tasks (NER, QA, summarization)
- Pretraining data augmentation for encoder models like SindhiBERT
## Limitations
- This is a continued pre-training adapter, not an instruction-tuned model
- Outputs may not be factually accurate; the model is intended for linguistic pattern learning
- Best used as a base for task-specific fine-tuning