---
language:
  - ar
license: apache-2.0
base_model: Navid-AI/Yehia-7B-preview
tags:
  - peft
  - qlora
  - arabic
  - poetry
  - classical-arabic-poetry
  - meter-conditioned-generation
pipeline_tag: text-generation
---

# Shaer Main SFT Adapters

This repository stores the completed main supervised fine-tuning (SFT) baseline for the Shaer classical Arabic poetry project.

## Baseline Summary

- **Base model:** `Navid-AI/Yehia-7B-preview`
- **Dataset:** `Shaer-AI/ashaar-with-enhanced-descriptions-baseform-final-sft-lte20-min500-splits`
- **Split policy:** deterministic 94/3/3, stratified by `base_meter||form||length_bucket`
- **Train sampler:** weighted, train-only, keyed on `base_meter||form||length_bucket`
- **LoRA:** all-linear, `r=64`, `alpha=128`, `dropout=0.05`, `use_rslora=true`
- **Run name:** `train_20260407_231929`
- **Best eval checkpoint:** `/root/workspace/Shaer/sft/outputs/train/train_20260407_231929/checkpoint-3000`
- **Best eval loss:** 2.2074
- **Final test loss:** 2.1932
- **Best probe meter mean:** 0.6042 (at step 2800)
- **Final probe meter mean:** 0.5087
- **Final probe count adherence mean:** 1.0
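Both the split and the sampler key on the concatenated `base_meter||form||length_bucket` label. The actual pipeline code is not in this repo, so the sketch below is illustrative only (function names, the `id` field, and the hash-ordered group cut are assumptions): it shows one way a deterministic, stratified 94/3/3 split and inverse-frequency sampler weights over that key could be computed.

```python
import hashlib
from collections import Counter, defaultdict


def strat_key(ex: dict) -> str:
    # Concatenated stratification label used by both the split and the sampler.
    return f"{ex['base_meter']}||{ex['form']}||{ex['length_bucket']}"


def stratified_split(examples: list[dict]) -> dict[str, str]:
    """Deterministic 94/3/3 split, stratified by the concatenated key.

    Examples are grouped by key, ordered by a stable hash of their id,
    and each group is cut 94/3/3 so every meter/form/length combination
    keeps roughly the same proportions across splits.
    """
    groups: dict[str, list[dict]] = defaultdict(list)
    for ex in examples:
        groups[strat_key(ex)].append(ex)

    assignment: dict[str, str] = {}
    for group in groups.values():
        # Stable, data-independent ordering makes the split reproducible.
        group.sort(key=lambda ex: hashlib.sha256(ex["id"].encode()).hexdigest())
        n_train = round(len(group) * 0.94)
        n_val = round(len(group) * 0.03)
        for i, ex in enumerate(group):
            if i < n_train:
                assignment[ex["id"]] = "train"
            elif i < n_train + n_val:
                assignment[ex["id"]] = "validation"
            else:
                assignment[ex["id"]] = "test"
    return assignment


def sampler_weights(train_examples: list[dict]) -> list[float]:
    # Weighted train-only sampling: weight each example by the inverse
    # frequency of its stratification key, so rare meter/form/length
    # combinations are not drowned out by common ones.
    counts = Counter(strat_key(ex) for ex in train_examples)
    return [1.0 / counts[strat_key(ex)] for ex in train_examples]
```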

## Important Paths In This Repo

- Latest adapter export: `adapters/fresh_sft/train/latest`
- Best adapter export: `adapters/fresh_sft/train/best`
- Finished run report bundle: `reports/fresh_sft/train_20260407_231929`
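To generate with the best adapter, attach it to the base model with `peft`. A minimal sketch, assuming the `adapters/fresh_sft/train/best` directory has been downloaded locally (the loader function name is illustrative; imports are deferred so the heavy dependencies are only needed when it is called):

```python
def load_shaer(adapter_dir: str = "adapters/fresh_sft/train/best"):
    """Load the Yehia-7B base model and attach the Shaer SFT adapter.

    Requires `transformers`, `peft`, and enough memory for a 7B model;
    imports are deferred so this module stays importable without them.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    base_id = "Navid-AI/Yehia-7B-preview"
    tokenizer = AutoTokenizer.from_pretrained(base_id)
    base = AutoModelForCausalLM.from_pretrained(base_id, device_map="auto")
    # Wrap the base model with the LoRA adapter weights from this repo.
    model = PeftModel.from_pretrained(base, adapter_dir)
    model.eval()
    return tokenizer, model
```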

## Current Comparison Context

- The short meter-loss continuation in `Shaer-AI/shaer-adapters-v2` was stopped early: the auxiliary meter head improved, but cross-entropy loss and the probe meter score drifted worse than this baseline.
- A later fresh-from-start v3 run was deleted after the current meter-loss path was confirmed invalid: the auxiliary head could read the requested meter from prompt-conditioned hidden states rather than from the generated text.
- This baseline remains the strongest known safe reference until a corrected challenger clearly beats it.