main_rev2_sft03

This is a Safe SFT LoRA adapter (REV2 SFT03). It uses completion-only training, so the loss is computed only on the assistant's output. Training runs for exactly 1 epoch.

Base Model

Qwen/Qwen3-4B-Instruct-2507

Training Data (Mixed)

  • 75%: daichira/structured-hard-sft-4k (High Quality)
  • 25%: u-10bei/structured_data_with_cot_dataset_512_v4 (Filtered Output-only)
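
The card does not say how the 75/25 mix was built. As a rough sketch only, assuming the ratio refers to example counts and that each source is sampled without repetition, the mixing step could look like the following. The function name `mix_datasets` and its parameters are hypothetical and are not taken from the training code.

```python
import random


def mix_datasets(primary, secondary, primary_ratio=0.75, total=None, seed=0):
    """Return a shuffled list with about `primary_ratio` of items from `primary`.

    Assumption: the ratio is by number of examples, not by tokens.
    """
    if not 0 < primary_ratio < 1:
        raise ValueError("primary_ratio must be strictly between 0 and 1")
    rng = random.Random(seed)
    if total is None:
        # Largest total both sources can supply without repeating examples.
        total = int(min(len(primary) / primary_ratio,
                        len(secondary) / (1 - primary_ratio)))
    n_primary = round(total * primary_ratio)
    n_secondary = total - n_primary
    # random.sample draws without replacement from each source.
    mixed = rng.sample(primary, n_primary) + rng.sample(secondary, n_secondary)
    rng.shuffle(mixed)
    return mixed
```

In practice `primary` and `secondary` would be lists of already-filtered examples from the two datasets above; a fixed `seed` keeps the mix reproducible.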

Method

  • Completion-only: User prompt tokens are masked with a label of -100, so they do not contribute to the loss.
  • Marker: `OUTPUT`, set off by blank lines, inserted before the assistant output.

  • Filters: Enhanced Output-only extraction, Length Limit, Spam/Repetition Check.
  • Config: 1 Epoch, Max Seq Length 4096.
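
The completion-only masking described above can be illustrated with a minimal sketch. It assumes the prompt (including the marker) and the assistant output have already been tokenized separately; `build_labels` is a hypothetical helper, not the function used to train this adapter. The value -100 is the default `ignore_index` of PyTorch's cross-entropy loss, so positions carrying it are skipped when the loss is computed.

```python
IGNORE_INDEX = -100  # default ignore_index of torch.nn.CrossEntropyLoss


def build_labels(prompt_ids, completion_ids, max_seq_length=4096):
    """Build input_ids and labels for one completion-only training example.

    Assumption: the marker tokens are part of `prompt_ids` and are masked too.
    """
    prompt_ids = list(prompt_ids)
    completion_ids = list(completion_ids)
    # The model sees the prompt followed by the assistant output.
    input_ids = prompt_ids + completion_ids
    # Prompt positions get -100; only the assistant output keeps real labels.
    labels = [IGNORE_INDEX] * len(prompt_ids) + completion_ids
    # Truncate both sequences to the configured maximum length.
    return {
        "input_ids": input_ids[:max_seq_length],
        "labels": labels[:max_seq_length],
    }
```

For example, `build_labels([11, 12, 13], [21, 22])` yields labels `[-100, -100, -100, 21, 22]`, so only the two assistant tokens are learned. The token IDs here are made up for illustration.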