llm-sft-lora-intermediate20260215 (SFT Intermediate Adapter)
This is the intermediate LoRA adapter produced by supervised fine-tuning (SFT) only, before any DPO training. Use it as the starting point for DPO experiments.
Base Model
Qwen/Qwen3-4B-Instruct-2507
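A minimal sketch of attaching this adapter to the base model for a quick sanity check, assuming the repository is a standard PEFT-format LoRA adapter and that `transformers` and `peft` are installed. The helper name `load_sft_adapter` is hypothetical and not part of this repository.

```python
BASE_MODEL_ID = "Qwen/Qwen3-4B-Instruct-2507"
ADAPTER_ID = "centmount/llm-sft-lora-intermediate20260215"


def load_sft_adapter(base_model_id=BASE_MODEL_ID, adapter_id=ADAPTER_ID):
    """Hypothetical helper: load the base model and attach the SFT adapter.

    Assumes the adapter repo is in standard PEFT format.
    """
    # Deferred imports: the heavy libraries are only needed when this runs.
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    tokenizer = AutoTokenizer.from_pretrained(base_model_id)
    base_model = AutoModelForCausalLM.from_pretrained(base_model_id)
    # Wrap the base model with the LoRA weights from the adapter repo.
    model = PeftModel.from_pretrained(base_model, adapter_id)
    return model, tokenizer
```

Calling `load_sft_adapter()` downloads the base model weights, so it needs network access and enough memory for a 4B-parameter model.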
Usage
Set DPO_SFT_SOURCE="centmount/llm-sft-lora-intermediate20260215" and RUN_MODE="dpo_only" to run DPO from this checkpoint.
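The two settings above can be sketched as follows, assuming the training pipeline reads them as environment variables. That is an assumption: if they are plain variables in a notebook or config cell, assign them there instead. The helper `configure_dpo_run` is hypothetical.

```python
import os

DPO_SFT_SOURCE = "centmount/llm-sft-lora-intermediate20260215"
RUN_MODE = "dpo_only"


def configure_dpo_run(env=None):
    """Hypothetical helper: write the two DPO settings into a mapping.

    Defaults to os.environ, assuming the pipeline reads environment variables.
    """
    env = os.environ if env is None else env
    env["DPO_SFT_SOURCE"] = DPO_SFT_SOURCE  # SFT checkpoint to start DPO from
    env["RUN_MODE"] = RUN_MODE              # skip SFT, run only the DPO stage
    return env
```

Passing a plain `dict` instead of relying on `os.environ` makes it easy to inspect the settings before launching a run.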