---
language:
- ar
license: apache-2.0
base_model: Navid-AI/Yehia-7B-preview
tags:
- peft
- qlora
- arabic
- poetry
- classical-arabic-poetry
- meter-conditioned-generation
pipeline_tag: text-generation
---
# Shaer Main SFT Adapters
This repository stores the completed main supervised fine-tuning (SFT) baseline for the Shaer classical Arabic poetry project.
## Baseline Summary
- Base model: `Navid-AI/Yehia-7B-preview`
- Dataset: `Shaer-AI/ashaar-with-enhanced-descriptions-baseform-final-sft-lte20-min500-splits`
- Split policy: deterministic `94 / 3 / 3` train/validation/test, stratified by `base_meter||form||length_bucket`
- Train sampler: weighted train-only sampler on `base_meter||form||length_bucket`
- LoRA: `all-linear`, `r=64`, `alpha=128`, `dropout=0.05`, `use_rslora=true`
- Run name: `train_20260407_231929`
- Best eval checkpoint: `/root/workspace/Shaer/sft/outputs/train/train_20260407_231929/checkpoint-3000`
- Best eval loss: `2.2074480056762695`
- Final test loss: `2.1932055950164795`
- Best probe meter mean: `0.6042156156147234` at step `2800`
- Final probe meter mean: `0.5087050689648603`
- Final probe count adherence mean: `1.0`
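The split and sampler rows above both key on the same composite stratum string. The sketch below shows that key and one plausible way to derive train-only sampler weights from it (inverse stratum frequency); the exact weighting scheme used in the run is an assumption, not taken from this card.

```python
from collections import Counter

def stratum_key(example):
    # Composite key used for both stratified splitting and train-time weighting.
    return f"{example['base_meter']}||{example['form']}||{example['length_bucket']}"

def train_sampler_weights(examples):
    # Inverse-frequency weights over stratum keys, so rare meter/form/length
    # combinations are not drowned out by common ones.
    # (Illustrative: the actual weighting formula is an assumption.)
    keys = [stratum_key(ex) for ex in examples]
    counts = Counter(keys)
    return [1.0 / counts[k] for k in keys]
```

Weights like these can be fed to a weighted sampler (e.g. `torch.utils.data.WeightedRandomSampler`) on the train split only, leaving validation and test untouched.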
## Important Paths In This Repo
- Latest adapter export:
- `adapters/fresh_sft/train/latest`
- Best adapter export:
- `adapters/fresh_sft/train/best`
- Finished run report bundle:
- `reports/fresh_sft/train_20260407_231929`
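A minimal sketch of loading the best-eval adapter export with PEFT. The base model id comes from this card; the adapter repo id and dtype are assumptions, so substitute this repository's actual Hub id.

```python
# Assumptions: repo id "Shaer-AI/shaer-adapters" and bfloat16 are placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

BASE = "Navid-AI/Yehia-7B-preview"
ADAPTER_REPO = "Shaer-AI/shaer-adapters"  # assumed id of this repo

tokenizer = AutoTokenizer.from_pretrained(BASE)
base = AutoModelForCausalLM.from_pretrained(
    BASE, torch_dtype=torch.bfloat16, device_map="auto"
)
# `subfolder` points at the best-eval export listed above.
model = PeftModel.from_pretrained(
    base, ADAPTER_REPO, subfolder="adapters/fresh_sft/train/best"
)
model.eval()
```

Swap the `subfolder` to `adapters/fresh_sft/train/latest` to load the latest export instead of the best-eval checkpoint.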
## Current Comparison Context
- The short meter-loss continuation in `Shaer-AI/shaer-adapters-v2` was stopped early: the auxiliary meter head improved, but cross-entropy loss and the probe meter score drifted worse than this baseline.
- A later fresh-from-start `v3` run was deleted after confirming that the current meter-loss path was invalid: the auxiliary head could read the requested meter directly from prompt-conditioned hidden states.
- This baseline remains the strongest known safe reference until a corrected challenger clearly beats it.