---
language:
  - ar
license: apache-2.0
base_model: Navid-AI/Yehia-7B-preview
tags:
  - peft
  - qlora
  - arabic
  - poetry
  - classical-arabic-poetry
  - meter-conditioned-generation
pipeline_tag: text-generation
---

# Shaer Main SFT Adapters

This repository stores the completed main supervised fine-tuning (SFT) baseline for the Shaer classical Arabic poetry project.

## Baseline Summary

- **Base model:** `Navid-AI/Yehia-7B-preview`
- **Dataset:** `Shaer-AI/ashaar-with-enhanced-descriptions-baseform-final-sft-lte20-min500-splits`
- **Split policy:** deterministic 94/3/3, stratified by `base_meter||form||length_bucket`
- **Train sampler:** weighted, train-only, keyed on `base_meter||form||length_bucket`
- **LoRA:** all-linear, `r=64`, `alpha=128`, `dropout=0.05`, `use_rslora=true`
- **Run name:** `train_20260407_231929`
- **Best eval checkpoint:** `/root/workspace/Shaer/sft/outputs/train/train_20260407_231929/checkpoint-3000`
- **Best eval loss:** 2.2074
- **Final test loss:** 2.1932
- **Best probe meter mean:** 0.6042 (at step 2800)
- **Final probe meter mean:** 0.5087
- **Final probe count adherence mean:** 1.0
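Both the split and the sampler key on the concatenated `base_meter||form||length_bucket` label. The actual pipeline code is not in this repo, so the sketch below is illustrative only (function names, the `id` field, and the hash-ordered group cut are assumptions): it shows one way a deterministic, stratified 94/3/3 split and inverse-frequency sampler weights over that key could be computed.

```python
import hashlib
from collections import Counter, defaultdict


def strat_key(ex: dict) -> str:
    # Concatenated stratification label used by both the split and the sampler.
    return f"{ex['base_meter']}||{ex['form']}||{ex['length_bucket']}"


def stratified_split(examples: list[dict]) -> dict[str, str]:
    """Deterministic 94/3/3 split, stratified by the concatenated key.

    Examples are grouped by key, ordered by a stable hash of their id,
    and each group is cut 94/3/3 so every meter/form/length combination
    keeps roughly the same proportions across splits.
    """
    groups: dict[str, list[dict]] = defaultdict(list)
    for ex in examples:
        groups[strat_key(ex)].append(ex)

    assignment: dict[str, str] = {}
    for group in groups.values():
        # Stable, data-independent ordering makes the split reproducible.
        group.sort(key=lambda ex: hashlib.sha256(ex["id"].encode()).hexdigest())
        n_train = round(len(group) * 0.94)
        n_val = round(len(group) * 0.03)
        for i, ex in enumerate(group):
            if i < n_train:
                assignment[ex["id"]] = "train"
            elif i < n_train + n_val:
                assignment[ex["id"]] = "validation"
            else:
                assignment[ex["id"]] = "test"
    return assignment


def sampler_weights(train_examples: list[dict]) -> list[float]:
    # Weighted train-only sampling: weight each example by the inverse
    # frequency of its stratification key, so rare meter/form/length
    # combinations are not drowned out by common ones.
    counts = Counter(strat_key(ex) for ex in train_examples)
    return [1.0 / counts[strat_key(ex)] for ex in train_examples]
```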

## Important Paths In This Repo

- Latest adapter export: `adapters/fresh_sft/train/latest`
- Best adapter export: `adapters/fresh_sft/train/best`
- Finished run report bundle: `reports/fresh_sft/train_20260407_231929`
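To generate with the best adapter, attach it to the base model with `peft`. A minimal sketch, assuming the `adapters/fresh_sft/train/best` directory has been downloaded locally (the loader function name is illustrative; imports are deferred so the heavy dependencies are only needed when it is called):

```python
def load_shaer(adapter_dir: str = "adapters/fresh_sft/train/best"):
    """Load the Yehia-7B base model and attach the Shaer SFT adapter.

    Requires `transformers`, `peft`, and enough memory for a 7B model;
    imports are deferred so this module stays importable without them.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    base_id = "Navid-AI/Yehia-7B-preview"
    tokenizer = AutoTokenizer.from_pretrained(base_id)
    base = AutoModelForCausalLM.from_pretrained(base_id, device_map="auto")
    # Wrap the base model with the LoRA adapter weights from this repo.
    model = PeftModel.from_pretrained(base, adapter_dir)
    model.eval()
    return tokenizer, model
```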

## Current Comparison Context

- The short meter-loss continuation in `Shaer-AI/shaer-adapters-v2` was stopped early: the auxiliary meter head improved, but cross-entropy loss and the probe meter score drifted worse than this baseline.
- A later fresh-from-start v3 run was deleted after the current meter-loss path was confirmed invalid: the auxiliary head could read the requested meter from prompt-conditioned hidden states rather than from the generated text.
- This baseline remains the strongest known safe reference until a corrected challenger clearly beats it.