Shaer Main SFT Adapters

This repository stores the completed main SFT baseline for the Shaer classical Arabic poetry project.

Baseline Summary

  • Base model: Navid-AI/Yehia-7B-preview
  • Dataset: Shaer-AI/ashaar-with-enhanced-descriptions-baseform-final-sft-lte20-min500-splits
  • Split policy: deterministic 94 / 3 / 3, stratified by base_meter||form||length_bucket
  • Train sampler: weighted train-only sampler on base_meter||form||length_bucket
  • LoRA: all-linear, r=64, alpha=128, dropout=0.05, use_rslora=true
  • Run name: train_20260407_231929
  • Best eval checkpoint: /root/workspace/Shaer/sft/outputs/train/train_20260407_231929/checkpoint-3000
  • Best eval loss: 2.2074480056762695
  • Final test loss: 2.1932055950164795
  • Best probe meter mean: 0.6042156156147234 at step 2800
  • Final probe meter mean: 0.5087050689648603
  • Final probe count adherence mean: 1.0
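The split and sampler bullets above can be sketched in Python. This is a hypothetical reconstruction, not the project's actual code: the hash salt, the record field names (`base_meter`, `form`, `length_bucket`), and the inverse-frequency weighting scheme are all assumptions, and per-example hashing only approximates the 94 / 3 / 3 stratification in expectation within each key group.

```python
import hashlib
from collections import Counter

def split_bucket(example_id: str, salt: str = "shaer-sft") -> str:
    """Deterministically map an example id into [0, 100) and assign 94/3/3 splits.

    Hypothetical: the real pipeline's salt and hashing scheme are not documented here.
    """
    digest = hashlib.sha256(f"{salt}:{example_id}".encode()).hexdigest()
    slot = int(digest, 16) % 100
    if slot < 94:
        return "train"
    if slot < 97:
        return "validation"
    return "test"

def strat_key(example: dict) -> str:
    # The stratification key joins three fields with "||", as in the card.
    return f"{example['base_meter']}||{example['form']}||{example['length_bucket']}"

def sampler_weights(train_examples: list) -> list:
    """Inverse-frequency weights per stratification key, for a weighted train-only sampler."""
    counts = Counter(strat_key(ex) for ex in train_examples)
    return [1.0 / counts[strat_key(ex)] for ex in train_examples]
```

In a typical setup these weights would feed something like `torch.utils.data.WeightedRandomSampler` so that rare meter/form/length combinations are not swamped during training.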

Important Paths In This Repo

  • Latest adapter export:
    • adapters/fresh_sft/train/latest
  • Best adapter export:
    • adapters/fresh_sft/train/best
  • Finished run report bundle:
    • reports/fresh_sft/train_20260407_231929
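The adapter exports above can be applied to the base model with the standard `transformers` + `peft` APIs. The sketch below is illustrative only: the model id and adapter path come from this card, but the loading flow itself is an assumption about how consumers would use these files, not project-provided code.

```python
def load_shaer_adapter(adapter_path: str = "adapters/fresh_sft/train/best"):
    """Load the base model and attach the SFT LoRA adapter (hypothetical usage sketch)."""
    # Imports are deferred so the sketch can be read without the libraries installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    base = AutoModelForCausalLM.from_pretrained("Navid-AI/Yehia-7B-preview")
    tokenizer = AutoTokenizer.from_pretrained("Navid-AI/Yehia-7B-preview")
    model = PeftModel.from_pretrained(base, adapter_path)
    return model, tokenizer
```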

Current Comparison Context

  • The short meter-loss continuation in Shaer-AI/shaer-adapters-v2 was stopped early: the auxiliary meter head improved, but cross-entropy loss and the probe meter score both drifted below this baseline.
  • A later fresh-from-start v3 run was deleted after the current meter-loss path was confirmed invalid: the auxiliary head could read the requested meter directly from prompt-conditioned hidden states, so its signal was uninformative.
  • This baseline remains the strongest known safe reference until a corrected challenger clearly beats it.