Shaer Main SFT Adapters

This repository stores the completed main SFT baseline for the Shaer classical Arabic poetry project.

Baseline Summary

  • Base model: Navid-AI/Yehia-7B-preview
  • Dataset: Shaer-AI/ashaar-with-enhanced-descriptions-baseform-final-sft-lte20-min500-splits
  • Split policy: deterministic 94 / 3 / 3, stratified by base_meter||form||length_bucket
  • Train sampler: weighted train-only sampler on base_meter||form||length_bucket
  • LoRA: all-linear, r=64, alpha=128, dropout=0.05, use_rslora=true
  • Run name: train_20260407_231929
  • Best eval checkpoint: /root/workspace/Shaer/sft/outputs/train/train_20260407_231929/checkpoint-3000
  • Best eval loss: 2.2074480056762695
  • Final test loss: 2.1932055950164795
  • Best probe meter mean: 0.6042156156147234 at step 2800
  • Final probe meter mean: 0.5087050689648603
  • Final probe count adherence mean: 1.0
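The split and sampler bullets above can be sketched in Python. This is a hypothetical reconstruction, not the project's actual code: the hash salt, the record field names (`base_meter`, `form`, `length_bucket`), and the inverse-frequency weighting scheme are all assumptions, and per-example hashing only approximates the 94 / 3 / 3 stratification in expectation within each key group.

```python
import hashlib
from collections import Counter

def split_bucket(example_id: str, salt: str = "shaer-sft") -> str:
    """Deterministically map an example id into [0, 100) and assign 94/3/3 splits.

    Hypothetical: the real pipeline's salt and hashing scheme are not documented here.
    """
    digest = hashlib.sha256(f"{salt}:{example_id}".encode()).hexdigest()
    slot = int(digest, 16) % 100
    if slot < 94:
        return "train"
    if slot < 97:
        return "validation"
    return "test"

def strat_key(example: dict) -> str:
    # The stratification key joins three fields with "||", as in the card.
    return f"{example['base_meter']}||{example['form']}||{example['length_bucket']}"

def sampler_weights(train_examples: list) -> list:
    """Inverse-frequency weights per stratification key, for a weighted train-only sampler."""
    counts = Counter(strat_key(ex) for ex in train_examples)
    return [1.0 / counts[strat_key(ex)] for ex in train_examples]
```

In a typical setup these weights would feed something like `torch.utils.data.WeightedRandomSampler` so that rare meter/form/length combinations are not swamped during training.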

Important Paths In This Repo

  • Latest adapter export:
    • adapters/fresh_sft/train/latest
  • Best adapter export:
    • adapters/fresh_sft/train/best
  • Finished run report bundle:
    • reports/fresh_sft/train_20260407_231929
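The adapter exports above can be applied to the base model with the standard `transformers` + `peft` APIs. The sketch below is illustrative only: the model id and adapter path come from this card, but the loading flow itself is an assumption about how consumers would use these files, not project-provided code.

```python
def load_shaer_adapter(adapter_path: str = "adapters/fresh_sft/train/best"):
    """Load the base model and attach the SFT LoRA adapter (hypothetical usage sketch)."""
    # Imports are deferred so the sketch can be read without the libraries installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    base = AutoModelForCausalLM.from_pretrained("Navid-AI/Yehia-7B-preview")
    tokenizer = AutoTokenizer.from_pretrained("Navid-AI/Yehia-7B-preview")
    model = PeftModel.from_pretrained(base, adapter_path)
    return model, tokenizer
```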

Current Comparison Context

  • The short meter-loss continuation in Shaer-AI/shaer-adapters-v2 was stopped early: the auxiliary meter head improved, but cross-entropy loss and the probe meter score both drifted below this baseline.
  • A later fresh-from-start v3 run was deleted after the current meter-loss path was confirmed invalid: the auxiliary head could read the requested meter directly from prompt-conditioned hidden states, so its signal was uninformative.
  • This baseline remains the strongest known safe reference until a corrected challenger clearly beats it.