llm-sft-lora-intermediate20260215 (SFT Intermediate Adapter)

This is the SFT-only intermediate LoRA adapter, saved before DPO training. Use it as the starting point for DPO experiments.

Base Model

Qwen/Qwen3-4B-Instruct-2507

Usage

Set DPO_SFT_SOURCE="centmount/llm-sft-lora-intermediate20260215" and RUN_MODE="dpo_only" to run DPO from this checkpoint.
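A minimal sketch of those two settings in Python follows. This card does not say how the training script reads them, so `export_settings` assumes environment variables; if the script uses notebook or config variables instead, set the same names there. The helper names (`dpo_only_settings`, `export_settings`, `load_sft_adapter`) are hypothetical and not part of any released code. `load_sft_adapter` further assumes the repository holds a PEFT-format LoRA adapter at its root and that `transformers` and `peft` are installed; it is only for inspecting the SFT checkpoint and is not needed for the DPO run.

```python
import os

# Identifiers taken from this model card.
BASE_MODEL_ID = "Qwen/Qwen3-4B-Instruct-2507"
ADAPTER_ID = "centmount/llm-sft-lora-intermediate20260215"


def dpo_only_settings():
    """Return the two settings named in the Usage section as a plain dict."""
    return {"DPO_SFT_SOURCE": ADAPTER_ID, "RUN_MODE": "dpo_only"}


def export_settings(env=None):
    """Write the settings into an environment mapping.

    Assumption: the training script reads DPO_SFT_SOURCE and RUN_MODE
    from environment variables. Pass a dict to avoid touching os.environ.
    """
    env = os.environ if env is None else env
    env.update(dpo_only_settings())
    return env


def load_sft_adapter():
    """Attach the SFT adapter to the base model for a quick inspection.

    Assumption: the repo contains a PEFT-format LoRA adapter.
    Imports are local so the settings helpers work without these packages.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL_ID)
    base = AutoModelForCausalLM.from_pretrained(BASE_MODEL_ID)
    model = PeftModel.from_pretrained(base, ADAPTER_ID)
    return model, tokenizer
```

Calling `export_settings()` before launching the training script in the same process (or a child process) makes both values visible to it, under the environment-variable assumption above.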
