qwen3-4b-structeval-strategy2-revised-merged

This is a merged model fine-tuned from Qwen/Qwen3-4B-Instruct-2507 using the LoRA adapter from yuk1chan/qwen3-4b-structeval-yamlxml-boost-v2-lr6e-6.

🎯 Purpose

This merged model is designed for structured output generation (JSON / YAML / XML / TOML / CSV) and achieves a StructEval score of 0.82286.

It serves as the base model for Stage 1: YAML/XML Specialized SFT.

🔥 Key Features: Max Seq Len = 1024

Unlike typical fine-tuning that uses max_seq_length=512, this model was trained with max_seq_length=1024 to:

  1. Process longer sequences: Handle complex YAML/XML documents that exceed 512 tokens
  2. Improve long-context accuracy: Better performance on deeply nested structures
  3. Enhance parsing capabilities: More robust handling of long-formatted data
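To make the first point concrete: training-time truncation simply drops every token past the limit, so a long sample loses its tail. A minimal sketch (the token count of 800 is illustrative, not measured):

```python
def truncate(token_ids, max_seq_length):
    """Training-time truncation: tokens past max_seq_length are simply dropped."""
    return token_ids[:max_seq_length]

# Suppose a deeply nested YAML sample tokenizes to ~800 ids (illustrative).
sample = list(range(800))

assert len(truncate(sample, 512)) == 512   # the tail of the document is lost
assert len(truncate(sample, 1024)) == 800  # the whole sample fits
```

With a 512-token limit the model never sees the closing part of such documents during training; at 1024 it does.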

📊 Training Results (Strategy 2 Revised)

| Format  | Score             | Failures |
|---------|-------------------|----------|
| JSON    | 100.0% (50/50)    | 0        |
| YAML    | 97.1% (34/35)     | 1        |
| XML     | 90.0% (18/20)     | 2        |
| TOML    | 76.0% (19/25)     | 6        |
| CSV     | 95.0% (19/20)     | 1        |
| Overall | 0.82286 (141/150) | 9        |

βš™οΈ Training Configuration

Base Model

  • Model: Qwen/Qwen3-4B-Instruct-2507
  • Parameters: 4.05B (4,055,498,240 total)

LoRA Adapter

  • Adapter: yuk1chan/qwen3-4b-structeval-yamlxml-boost-v2-lr6e-6
  • Method: QLoRA (4-bit)
  • Max sequence length: 1024 🔥
  • Epochs: 1
  • Learning rate: 6e-6
  • LoRA: r=16, alpha=32
  • Training time: ~12 hours (T4 GPU)
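The hyperparameters above map onto a QLoRA setup along these lines. This is a sketch of the implied configuration, not the published training script; `target_modules` and `lora_dropout` are assumptions based on common Qwen fine-tuning practice and are not stated in the card:

```python
# Sketch of the adapter configuration implied by the card.
# "target_modules" and "lora_dropout" are ASSUMPTIONS, not documented values.
lora_config = {
    "r": 16,
    "lora_alpha": 32,            # effective LoRA scaling = alpha / r = 2.0
    "target_modules": ["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed
    "lora_dropout": 0.05,        # assumed
    "task_type": "CAUSAL_LM",
    "load_in_4bit": True,        # QLoRA: base weights quantized to 4-bit
}

assert lora_config["lora_alpha"] / lora_config["r"] == 2.0
```

These keys correspond to the fields of peft's `LoraConfig` plus the 4-bit loading flag used for QLoRA.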

Data Pipeline

  • Dataset: structeval_runB_yamlxml_boost_v2.jsonl (25,000 samples)
  • Cleaning: CoT removal, code fence removal, leading phrase removal
  • u-10bei series: extract only the content after the "Output:" marker
  • daichira series: 2x boost
  • YAML/XML: 2x boost
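A minimal sketch of what these cleaning and boosting steps might look like. The actual pipeline script is not published with the card, so `clean_sample` and the boost logic below are illustrative:

```python
import re

def clean_sample(text: str) -> str:
    """Illustrative cleaning: 'Output:' extraction plus code-fence removal."""
    # u-10bei series: keep only the content after the "Output:" marker.
    if "Output:" in text:
        text = text.split("Output:", 1)[1]
    # Strip surrounding code fences such as ```yaml ... ``` .
    text = re.sub(r"^```[a-zA-Z]*\n|```$", "", text.strip(), flags=re.MULTILINE)
    return text.strip()

raw = "Here is the result.\nOutput:\n```yaml\nkey: value\n```"
assert clean_sample(raw) == "key: value"

# "2x boost": samples of the weak formats are simply duplicated in the mix.
samples = [{"fmt": "yaml"}, {"fmt": "json"}, {"fmt": "xml"}]
boosted = samples + [s for s in samples if s["fmt"] in ("yaml", "xml")]
assert len(boosted) == 5
```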

Format Distribution (Estimated)

  • YAML: ~45% (11,250 samples)
  • XML: ~20% (5,000 samples)
  • JSON: ~15% (3,750 samples)
  • TOML: ~10% (2,500 samples)
  • CSV: ~10% (2,500 samples)
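The estimated sample counts above follow directly from the percentages applied to the 25,000-sample dataset:

```python
total = 25_000
dist = {"YAML": 0.45, "XML": 0.20, "JSON": 0.15, "TOML": 0.10, "CSV": 0.10}
counts = {fmt: round(total * share) for fmt, share in dist.items()}

assert counts == {"YAML": 11250, "XML": 5000, "JSON": 3750, "TOML": 2500, "CSV": 2500}
assert sum(counts.values()) == total
```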

💻 Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "yuk1chan/qwen3-4b-structeval-strategy2-revised-merged"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",
)

# Inference
prompt = "Generate YAML code for..."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=1024)
result = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(result)
```
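Since the model targets machine-readable formats, it can help to parse the generated text immediately and fall back to re-prompting on failure. This is an illustrative pattern, not part of the card; JSON is used here for a dependency-free check:

```python
import json

def parse_or_none(text: str):
    """Return the parsed object, or None if the model output is not valid JSON."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return None  # e.g. strip stray prose around the payload and retry

generated = '{"server": {"host": "localhost", "port": 8080}}'
data = parse_or_none(generated)
assert data is not None and data["server"]["port"] == 8080
assert parse_or_none("not json") is None
```

The same pattern applies to YAML or TOML with `yaml.safe_load` or `tomllib.loads` in place of `json.loads`.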

📚 Training Strategy

Strategy 2 Revised: YAML/XML Clean Expansion

This model incorporates three key improvements:

1. u-10bei series: "Output:" extraction
  - Extract only the content after the "Output:" marker
  - Removes the explanation-before-output tendency
2. daichira series: 2x boost
  - Boosts "Return ONLY" pattern examples
  - Strengthens direct output without explanation
3. YAML/XML: 2x boost
  - Doubles the YAML/XML training data
  - Improves performance on the weakest formats

🎯 Expected Improvements from Max Seq Len = 1024

Compared to max_seq_length=512 models:
| Aspect           | Max Seq Len = 512 | Max Seq Len = 1024 |
|------------------|-------------------|--------------------|
| Long sequences   | ❌ Truncated      | ✅ Fully processed |
| Complex YAML/XML | ⚠️ Partial        | ✅ Complete        |
| Deep nesting     | ⚠️ Limited        | ✅ Better          |
| Training time    | 6-8 hours         | 12 hours           |
🔬 Next Steps

This merged model will be used as the base for:

1. Stage 1: YAML/XML Specialized SFT
  - Further enhance YAML/XML to 99-100%
  - Exclude TOML from training data
  - Target: YAML 99%+, XML 96%+
2. Stage 2: TOML Refinement with hard4k
  - Use only daichira/structured-hard-sft-4k (TOML 100%)
  - Low learning rate (3e-6) to preserve base capabilities
  - Target: TOML 90%+

📊 Validation

- Training Loss: ~1.10 β†’ 0.83 (converged well)
- Validation Loss: ~2.04 β†’ 1.32 (converged well)
- All-masked samples: 0% after filtering
- Valid ratio: ~0.60-0.80 (healthy distribution)

βš–οΈ License

Apache 2.0

πŸ“ Citation

If you use this model, please cite:

```bibtex
@misc{qwen3-4b-structeval-strategy2-revised-merged,
  title={Qwen3-4B StructEval Strategy 2 Revised (Merged)},
  author={yuk1chan},
  year={2026},
  url={https://huggingface.co/yuk1chan/qwen3-4b-structeval-strategy2-revised-merged},
}
```

---
Trained with passion over 12 hours using Max Seq Len 1024! 🔥

Base Model: Qwen/Qwen3-4B-Instruct-2507
LoRA Adapter: yuk1chan/qwen3-4b-structeval-yamlxml-boost-v2-lr6e-6