# Shaer Main SFT Adapters
This repository stores the completed main SFT baseline for the Shaer classical Arabic poetry project.
## Baseline Summary

- Base model: `Navid-AI/Yehia-7B-preview`
- Dataset: `Shaer-AI/ashaar-with-enhanced-descriptions-baseform-final-sft-lte20-min500-splits`
- Split policy: deterministic 94 / 3 / 3, stratified by `base_meter||form||length_bucket`
- Train sampler: weighted train-only sampler on `base_meter||form||length_bucket`
- LoRA: `all-linear`, `r=64`, `alpha=128`, `dropout=0.05`, `use_rslora=true`
- Run name: `train_20260407_231929`
- Best eval checkpoint: `/root/workspace/Shaer/sft/outputs/train/train_20260407_231929/checkpoint-3000`
- Best eval loss: 2.2074480056762695
- Final test loss: 2.1932055950164795
- Best probe meter mean: 0.6042156156147234 at step 2800
- Final probe meter mean: 0.5087050689648603
- Final probe count adherence mean: 1.0
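The deterministic stratified split named above can be sketched in a few lines. This is a hedged illustration, not the project's actual code: the field names (`id`, `base_meter`, `form`, `length_bucket`) and the hash-then-slice scheme are assumptions; the only property it demonstrates is that a 94 / 3 / 3 split per stratum comes out identical on every rerun.

```python
# Hedged sketch (not the project's actual code): one way to get a
# deterministic 94 / 3 / 3 split stratified by the composite key
# base_meter||form||length_bucket. Field names are assumptions.
import hashlib
from collections import defaultdict

def stratified_94_3_3(examples):
    """examples: dicts with 'id', 'base_meter', 'form', 'length_bucket'.
    Within each stratum, order examples by a stable hash of their ID and
    slice 94% / 3% / 3%, so every rerun reproduces the same split."""
    strata = defaultdict(list)
    for ex in examples:
        key = "||".join((ex["base_meter"], ex["form"], ex["length_bucket"]))
        strata[key].append(ex)
    splits = {"train": [], "validation": [], "test": []}
    for group in strata.values():
        # Stable, data-independent ordering: no RNG state to manage.
        group.sort(key=lambda ex: hashlib.sha256(ex["id"].encode()).hexdigest())
        n = len(group)
        n_held = max(1, round(0.03 * n)) if n >= 3 else 0
        splits["train"].extend(group[: n - 2 * n_held])
        splits["validation"].extend(group[n - 2 * n_held : n - n_held])
        splits["test"].extend(group[n - n_held :])
    return splits
```

Hashing IDs instead of shuffling means the split does not depend on input order or on any random seed being saved.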
## Important Paths In This Repo

- Latest adapter export: `adapters/fresh_sft/train/latest`
- Best adapter export: `adapters/fresh_sft/train/best`
- Finished run report bundle: `reports/fresh_sft/train_20260407_231929`
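The weighted train-only sampler from the baseline summary can be sketched the same way: weight each training example by the inverse frequency of its stratification key, so rare meter/form/length combinations are drawn more often. Field names are again assumptions for illustration; validation and test are never resampled.

```python
# Hedged sketch of a weighted train-only sampler: inverse-frequency
# weights over the base_meter||form||length_bucket key.
import random
from collections import Counter

def stratum_weights(train_examples):
    """One weight per example, inversely proportional to how common its
    stratification key is, so each stratum gets roughly equal mass."""
    keys = ["||".join((ex["base_meter"], ex["form"], ex["length_bucket"]))
            for ex in train_examples]
    counts = Counter(keys)
    return [1.0 / counts[k] for k in keys]

def sample_epoch(train_examples, k=None, seed=0):
    """Draw k training indices with replacement under the stratum weights."""
    rng = random.Random(seed)  # fixed seed keeps epochs reproducible
    weights = stratum_weights(train_examples)
    return rng.choices(range(len(train_examples)), weights=weights,
                       k=k or len(train_examples))
```

With two strata of 90 and 10 examples, each stratum carries total weight 1.0, so the rare stratum fills roughly half of every sampled epoch.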
## Current Comparison Context

- The short meter-loss continuation in `Shaer-AI/shaer-adapters-v2` was stopped early: the auxiliary meter head improved, but CE and probe meter drifted worse than this baseline.
- A later fresh-from-start `v3` run was deleted after confirming that the meter-loss path was invalid: the model could read the requested meter from prompt-conditioned hidden states.
- This baseline remains the strongest known safe reference until a corrected challenger clearly beats it.
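The leakage that invalidated the `v3` meter-loss path is the kind of shortcut a simple probe can surface: if even a nearest-centroid classifier over prompt-conditioned hidden states recovers the requested meter far above chance, gains from an auxiliary meter head are suspect. A toy stdlib sketch, with all vectors, labels, and function names purely illustrative:

```python
# Hedged sketch of a minimal probe: fit one mean vector per label and
# classify by nearest centroid. High accuracy on prompt-conditioned
# states would suggest the meter head can shortcut via the prompt.
from collections import Counter, defaultdict

def fit_centroids(vectors, labels):
    """Mean vector per label over (vector, label) pairs."""
    sums = defaultdict(lambda: None)
    for vec, lab in zip(vectors, labels):
        if sums[lab] is None:
            sums[lab] = [0.0] * len(vec)
        sums[lab] = [s + v for s, v in zip(sums[lab], vec)]
    counts = Counter(labels)
    return {lab: [s / counts[lab] for s in total]
            for lab, total in sums.items()}

def probe_accuracy(centroids, vectors, labels):
    """Fraction of vectors whose nearest centroid matches their label."""
    def dist2(vec, cen):
        return sum((v - c) ** 2 for v, c in zip(vec, cen))
    correct = sum(
        min(centroids, key=lambda lab: dist2(vec, centroids[lab])) == lab
        for vec, lab in zip(vectors, labels))
    return correct / len(labels)
```

In practice one would fit the probe on held-out hidden states rather than reusing training vectors, but the pass/fail logic is the same.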
## Model Tree for Shaer-AI/Shaer-adapters

- Base model: `humain-ai/ALLaM-7B-Instruct-preview`
- Finetuned: `Navid-AI/Yehia-7B-preview`