SDPO under Continual Learning
Meng Wang
Moenupa
AI & ML interests
MLLM Post-Training & Alignment
Recent Activity
updated a model about 1 hour ago
Moenupa/Qwen3-4B-Thinking-2507-SDPO0-MathChemToolCode
updated a collection about 3 hours ago
SDPO@CL
updated a collection about 3 hours ago
SDPO@CL
Organizations
models 46
Moenupa/Qwen3-4B-Thinking-2507-SDPO0-MathChemToolCode
Moenupa/Qwen3-4B-Thinking-2507-SDPO1restart-Math
Moenupa/Qwen3-4B-Thinking-2507-SDPO5restart-Math
Moenupa/Qwen3-4B-Thinking-2507-SDPO0restart-Math
Moenupa/Qwen3-4B-Thinking-2507-SDPO0-MathChemTool
Moenupa/Qwen3-4B-Thinking-2507-SDPO0-MathChem
Moenupa/Qwen3-4B-Thinking-2507-SDPO5-MathChem
Moenupa/Qwen3-4B-Thinking-2507-SDPO5-MathChemToolCode
Moenupa/Qwen3-4B-Thinking-2507-SDPO5-MathChemTool
Moenupa/Qwen3-4B-Thinking-2507-GRPO-MathChemTool
datasets 10
Moenupa/Dolci-Think-RL-7B
Viewer • Updated • 72.2k • 32
Moenupa/verl
Viewer • Updated • 216k • 29
Moenupa/DAIR
Viewer • Updated • 361k • 401
Moenupa/Vision-Flan-191-1k
Viewer • Updated • 186k • 339
Moenupa/MSVQA
Viewer • Updated • 39.2k • 181
Moenupa/Domain8k-ICL
Viewer • Updated • 37.8k • 335
Moenupa/MC-Bench-VQA-ICL
Viewer • Updated • 1k • 23
Moenupa/MC-Bench-VQA
Viewer • Updated • 2k • 31
Moenupa/MC-Bench
Viewer • Updated • 2k • 39
Moenupa/MemoryBench-Full
Viewer • Updated • 8.99k • 133 • 1