Shaer Main SFT Adapters

This repository stores the completed main SFT baseline for the Shaer classical Arabic poetry project.

Baseline Summary

  • Base model: Navid-AI/Yehia-7B-preview
  • Dataset: Shaer-AI/ashaar-with-enhanced-descriptions-baseform-final-sft-lte20-min500-splits
  • Split policy: deterministic 94 / 3 / 3, stratified by base_meter||form||length_bucket
  • Train sampler: weighted train-only sampler on base_meter||form||length_bucket
  • LoRA: all-linear, r=64, alpha=128, dropout=0.05, use_rslora=true
  • Run name: train_20260407_231929
  • Best eval checkpoint: /root/workspace/Shaer/sft/outputs/train/train_20260407_231929/checkpoint-3000
  • Best eval loss: 2.2074480056762695
  • Final test loss: 2.1932055950164795
  • Best probe meter mean: 0.6042156156147234 at step 2800
  • Final probe meter mean: 0.5087050689648603
  • Final probe count adherence mean: 1.0
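
The LoRA line above can be written out as a small configuration sketch. This is an illustration, not the project's training code: it assumes that "all-linear" refers to the PEFT `target_modules="all-linear"` shorthand, and the names `LORA_KWARGS`, `lora_scaling`, and `build_lora_config` are hypothetical. It also shows the effect of `use_rslora=true`: rank-stabilized LoRA scales the low-rank update by alpha / sqrt(r) instead of alpha / r.

```python
import math

# Hyperparameters copied from the baseline summary above.
# Assumption: "all-linear" is the PEFT shorthand for targeting every
# linear layer except the output head.
LORA_KWARGS = {
    "r": 64,
    "lora_alpha": 128,
    "lora_dropout": 0.05,
    "use_rslora": True,
    "target_modules": "all-linear",
}


def lora_scaling(r: int, lora_alpha: float, use_rslora: bool) -> float:
    """Return the factor applied to the low-rank update B @ A."""
    # Standard LoRA scales by alpha / r; rsLoRA scales by alpha / sqrt(r).
    if use_rslora:
        return lora_alpha / math.sqrt(r)
    return lora_alpha / r


def build_lora_config():
    """Build a peft.LoraConfig from LORA_KWARGS (requires the peft package)."""
    # Imported lazily so the scaling helper works without peft installed.
    from peft import LoraConfig

    return LoraConfig(**LORA_KWARGS)


if __name__ == "__main__":
    print(lora_scaling(64, 128, use_rslora=True))   # 16.0
    print(lora_scaling(64, 128, use_rslora=False))  # 2.0
```

With r=64 and alpha=128, the rsLoRA factor is 16.0, eight times the 2.0 that standard LoRA scaling would give for the same values.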

Important Paths In This Repo

  • Latest adapter export:
    • adapters/fresh_sft/train/latest
  • Best adapter export:
    • adapters/fresh_sft/train/best
  • Finished run report bundle:
    • reports/fresh_sft/train_20260407_231929
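
A minimal loading sketch for the adapter exports listed above, under stated assumptions: the Hub id `Shaer-AI/Shaer-adapters` for this repository is assumed, each export folder is assumed to hold a standard PEFT adapter (`adapter_config.json` plus weights), and the helper names `adapter_subfolder` and `load_adapter` are hypothetical. It relies on the `subfolder` argument of `PeftModel.from_pretrained` to select an export inside the repo.

```python
from pathlib import PurePosixPath

BASE_MODEL_ID = "Navid-AI/Yehia-7B-preview"  # from the baseline summary
ADAPTER_REPO_ID = "Shaer-AI/Shaer-adapters"  # assumption: Hub id of this repo
EXPORT_ROOT = PurePosixPath("adapters/fresh_sft/train")


def adapter_subfolder(which: str = "best") -> str:
    """Return the in-repo folder of the 'best' or 'latest' adapter export."""
    if which not in {"best", "latest"}:
        raise ValueError(f"unknown export {which!r}; expected 'best' or 'latest'")
    return str(EXPORT_ROOT / which)


def load_adapter(which: str = "best"):
    """Load the base model and attach the chosen adapter (downloads weights)."""
    # Lazy imports: transformers and peft are only needed when loading.
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL_ID)
    base = AutoModelForCausalLM.from_pretrained(BASE_MODEL_ID)
    model = PeftModel.from_pretrained(
        base, ADAPTER_REPO_ID, subfolder=adapter_subfolder(which)
    )
    return tokenizer, model
```

Calling `load_adapter("best")` would attach the best-eval export and `load_adapter("latest")` the final one. Both calls download the 7B base model, so dtype and device placement would need to be set for the target hardware.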

Current Comparison Context

  • The short meter-loss continuation in Shaer-AI/shaer-adapters-v2 was stopped early: the auxiliary meter head improved, but cross-entropy (CE) loss and the probe meter score both degraded relative to this baseline.
  • A later from-scratch v3 run was deleted once the current meter-loss path was confirmed invalid: it could read the requested meter directly from prompt-conditioned hidden states.
  • This baseline remains the strongest known safe reference until a corrected challenger clearly beats it.