# exp040-soup-3model-weighted
Weighted Model Soup of 3 fine-tuned models for structured output generation (JSON / YAML / TOML / XML / CSV).
Full 16-bit merged weights. No adapter loading required.
## Model Soup Configuration
This model was created by weighted averaging of three independently trained models:
| Weight | Model | Training | Score |
|---|---|---|---|
| 0.50 | tomofusa/exp017-dpo-ipo-merged | SFT + DPO (IPO, lr=5e-7) | 0.789 |
| 0.25 | tomofusa/exp020-simpo-merged | SFT + CPO/SimPO (beta=2.5) | 0.789 |
| 0.25 | tomofusa/exp034-toml-upsample-dpo-merged | SFT (TOML upsampled) + DPO (IPO) | 0.765 |
Soup method: `model_A * 0.5 + model_B * 0.25 + model_C * 0.25`, applied to all weight tensors.
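The weighted average above can be sketched as a simple state-dict operation. This is a minimal illustration with dummy tensors, not the exact merge script used to produce this model:

```python
import torch

def soup(state_dicts, weights):
    """Weighted average of matching weight tensors across models."""
    assert abs(sum(weights) - 1.0) < 1e-6, "soup weights should sum to 1"
    return {
        key: sum(w * sd[key].float() for w, sd in zip(weights, state_dicts))
        for key in state_dicts[0]
    }

# Toy stand-ins for the three source models' state dicts
model_a = {"w": torch.tensor([1.0, 2.0])}
model_b = {"w": torch.tensor([3.0, 4.0])}
model_c = {"w": torch.tensor([5.0, 6.0])}

merged = soup([model_a, model_b, model_c], [0.50, 0.25, 0.25])
```

In practice the same loop runs over the full `state_dict()` of each 16-bit merged checkpoint, and the result is saved as the soup model.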
## Training Pipeline (per source model)
All source models share the same base pipeline:
- Base model: Qwen/Qwen3-4B-Instruct-2507
- SFT: QLoRA on structured output data (7,500 samples)
  - SFT adapter: tomofusa/exp015-blend-h-lora
  - Sources: daichira/structured-5k-mix-sft (5,000) + daichira/structured-hard-sft-4k (2,000 sampled) + custom TOML data (500)
  - lr=5e-6, epochs=2, LoRA r=64/alpha=128, max_seq_len=1024
- DPO: IPO/SimPO on u-10bei/dpo-dataset-qwen-cot (4,040 samples)
  - lr=5e-7, beta=0.1, epochs=1, LoRA r=64/alpha=128
- Merge + Soup: Each model merged to 16-bit, then weighted-averaged
## Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "tomofusa/exp040-soup-3model-weighted"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",
)
```
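Since the model is tuned for structured output, it can help to validate generated text before downstream use. A minimal sketch for the JSON case (the `raw_output` string below is a stand-in for actual `model.generate` output):

```python
import json

def try_parse_json(text: str):
    """Return the parsed object, or None if the text is not valid JSON."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return None

# Stand-in for decoded model output
raw_output = '{"name": "example", "tags": ["a", "b"]}'
parsed = try_parse_json(raw_output)
```

The same pattern applies to the other target formats (e.g. `yaml.safe_load`, `tomllib.loads`) if you have those parsers available.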
## Sources & Terms
- Base model: Qwen/Qwen3-4B-Instruct-2507 - Apache 2.0
- SFT data: daichira/structured-5k-mix-sft (CC-BY-4.0), daichira/structured-hard-sft-4k (CC-BY-4.0)
- DPO data: u-10bei/dpo-dataset-qwen-cot
- Users must comply with all upstream licenses and terms of use.