# powrin/qwen3-4b-structeval-sft-merged

## Model Description

This is a merged SFT model: a LoRA adapter trained for structured output generation tasks (StructEval-T style) merged into its base model.
- Base model: Qwen/Qwen3-4B-Instruct-2507
- Training method: Supervised Fine-Tuning (SFT) with QLoRA
- LoRA adapter source: powrin/qwen3_4b_sft_v_4ds_ep2_lr36
- Merge strategy: `merge_and_unload` (LoRA weights merged into the base model; see the sketch below)
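
For reference, here is a minimal sketch of how a merge like this can be reproduced with PEFT. The repository IDs come from this card, but the exact script used to produce this checkpoint may have differed:

```python
# Sketch of merging the LoRA adapter into the base model via merge_and_unload.
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3-4B-Instruct-2507",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
model = PeftModel.from_pretrained(base, "powrin/qwen3_4b_sft_v_4ds_ep2_lr36")

# Fold the LoRA deltas into the base weights and drop the adapter wrappers.
merged = model.merge_and_unload()

merged.save_pretrained("qwen3-4b-structeval-sft-merged")
AutoTokenizer.from_pretrained("Qwen/Qwen3-4B-Instruct-2507").save_pretrained(
    "qwen3-4b-structeval-sft-merged"
)
```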
This repository contains the fully merged model, ready for inference or further preference optimization (e.g. DPO).
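
A minimal inference sketch with `transformers`; the schema prompt here is only an illustration:

```python
# Load the merged model and request a JSON-formatted answer.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "powrin/qwen3-4b-structeval-sft-merged"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [
    {"role": "user",
     "content": "Return a JSON object with keys 'name' and 'age' for: Alice, 30."}
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```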
## Training Data
The LoRA adapter was trained using a mixture of the following datasets (only officially permitted datasets were used):
- u-10bei/structured_data_with_cot_dataset_512_v4
- u-10bei/structured_data_with_cot_dataset_512_v5
- daichira/structured-3k-mix-sft (auxiliary)
- daichira/structured-5k-mix-sft (auxiliary)
No evaluation or test data (e.g. public benchmarks) were used during training.
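
For context, this is a hedged sketch of what a QLoRA SFT run along these lines can look like with TRL; the LoRA rank, batch size, epoch count, and column handling are illustrative assumptions, not the settings actually used:

```python
# Illustrative QLoRA SFT setup (all hyperparameters are placeholders).
import torch
from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from trl import SFTConfig, SFTTrainer

model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3-4B-Instruct-2507",
    quantization_config=BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.bfloat16,
    ),
    device_map="auto",
)

# One of the permitted datasets; column mapping may be needed depending on its schema.
dataset = load_dataset("u-10bei/structured_data_with_cot_dataset_512_v4", split="train")

trainer = SFTTrainer(
    model=model,
    train_dataset=dataset,
    peft_config=LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05, task_type="CAUSAL_LM"),
    args=SFTConfig(output_dir="qlora-sft", num_train_epochs=2, per_device_train_batch_size=2),
)
trainer.train()
trainer.save_model("qwen3_4b_sft_adapter")
```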
## Intended Use
- Structured output generation
- JSON / schema-constrained generation
- Research on structured reasoning models
- Further fine-tuning with preference optimization (DPO)
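
As a starting point for the DPO use case, a minimal sketch with TRL's `DPOTrainer`; the preference dataset name is a placeholder and the hyperparameters are illustrative:

```python
# Illustrative DPO continuation on top of the merged SFT model.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

model_id = "powrin/qwen3-4b-structeval-sft-merged"
model = AutoModelForCausalLM.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Placeholder dataset: DPOTrainer expects "prompt", "chosen", "rejected" columns.
pref_data = load_dataset("your-org/your-preference-dataset", split="train")

trainer = DPOTrainer(
    model=model,
    args=DPOConfig(output_dir="dpo-out", beta=0.1),
    train_dataset=pref_data,
    processing_class=tokenizer,
)
trainer.train()
```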
## Limitations
- This model is optimized for format correctness, not for free-form reasoning.
- It may underperform on open-ended or creative tasks.
- Additional tuning may be required for downstream tasks.
## Citation
If you use this model in your research, please cite the original base model and the dataset authors accordingly.