---
language:
- ar
license: apache-2.0
base_model: Navid-AI/Yehia-7B-preview
tags:
- peft
- qlora
- arabic
- poetry
- classical-arabic-poetry
- meter-conditioned-generation
pipeline_tag: text-generation
---
# Shaer Main SFT Adapters
This repository stores the completed main supervised fine-tuning (SFT) baseline for the Shaer classical Arabic poetry project.
## Baseline Summary
- Base model: `Navid-AI/Yehia-7B-preview`
- Dataset: `Shaer-AI/ashaar-with-enhanced-descriptions-baseform-final-sft-lte20-min500-splits`
- Split policy: deterministic `94 / 3 / 3` train/validation/test, stratified by `base_meter||form||length_bucket`
- Train sampler: weighted train-only sampler on `base_meter||form||length_bucket`
- LoRA: `all-linear`, `r=64`, `alpha=128`, `dropout=0.05`, `use_rslora=true`
- Run name: `train_20260407_231929`
- Best eval checkpoint: `/root/workspace/Shaer/sft/outputs/train/train_20260407_231929/checkpoint-3000`
- Best eval loss: `2.2074480056762695`
- Final test loss: `2.1932055950164795`
- Best probe meter mean: `0.6042156156147234` at step `2800`
- Final probe meter mean: `0.5087050689648603`
- Final probe count adherence mean: `1.0`
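The split and sampler rows above both key on the same composite stratum string. The sketch below shows that key and one plausible way to derive train-only sampler weights from it (inverse stratum frequency); the exact weighting scheme used in the run is an assumption, not taken from this card.

```python
from collections import Counter

def stratum_key(example):
    # Composite key used for both stratified splitting and train-time weighting.
    return f"{example['base_meter']}||{example['form']}||{example['length_bucket']}"

def train_sampler_weights(examples):
    # Inverse-frequency weights over stratum keys, so rare meter/form/length
    # combinations are not drowned out by common ones.
    # (Illustrative: the actual weighting formula is an assumption.)
    keys = [stratum_key(ex) for ex in examples]
    counts = Counter(keys)
    return [1.0 / counts[k] for k in keys]
```

Weights like these can be fed to a weighted sampler (e.g. `torch.utils.data.WeightedRandomSampler`) on the train split only, leaving validation and test untouched.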
## Important Paths In This Repo
- Latest adapter export:
- `adapters/fresh_sft/train/latest`
- Best adapter export:
- `adapters/fresh_sft/train/best`
- Finished run report bundle:
- `reports/fresh_sft/train_20260407_231929`
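A minimal sketch of loading the best-eval adapter export with PEFT. The base model id comes from this card; the adapter repo id and dtype are assumptions, so substitute this repository's actual Hub id.

```python
# Assumptions: repo id "Shaer-AI/shaer-adapters" and bfloat16 are placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

BASE = "Navid-AI/Yehia-7B-preview"
ADAPTER_REPO = "Shaer-AI/shaer-adapters"  # assumed id of this repo

tokenizer = AutoTokenizer.from_pretrained(BASE)
base = AutoModelForCausalLM.from_pretrained(
    BASE, torch_dtype=torch.bfloat16, device_map="auto"
)
# `subfolder` points at the best-eval export listed above.
model = PeftModel.from_pretrained(
    base, ADAPTER_REPO, subfolder="adapters/fresh_sft/train/best"
)
model.eval()
```

Swap the `subfolder` to `adapters/fresh_sft/train/latest` to load the latest export instead of the best-eval checkpoint.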
## Current Comparison Context
- The short meter-loss continuation in `Shaer-AI/shaer-adapters-v2` was stopped early: the auxiliary meter head improved, but cross-entropy loss and the probe meter score drifted worse than this baseline.
- A later fresh-from-start `v3` run was deleted after confirming that the current meter-loss path was invalid: the auxiliary head could read the requested meter directly from prompt-conditioned hidden states.
- This baseline remains the strongest known safe reference until a corrected challenger clearly beats it.