AIPlans/qwen3-0.6b-hh-rlhf-sft
0.6B • Updated • 3
AIPlans/Qwen3-0.6B-KTO_trial
Text Generation • 0.6B • Updated • 6 • 1
AIPlans/qwen3-0.6b-sft-hh-rlhf-lora
Updated
AIPlans/qwen3-0.6b-base-PPO-PM
AIPlans/qwen3-0.6b-base-hl-RM
Text Classification • 0.6B • Updated • 1
0.6B • Updated • 2
AIPlans/qwen3-0.6b-dpo-lora
Text Generation • 0.6B • Updated • 5 • 1
AIPlans/qwen3-0.6B-reward-hh-rlhf
Text Generation • 0.6B • Updated • 5
AIPlans/qwen3-8b-ipo-hh-rlhf
Text Generation • Updated • 3
AIPlans/qwen3-8b-dpo-hh-rlhf
Updated
AIPlans/Qwen3-HHH-Cipher-Eng
Text Generation • 0.6B • Updated • 34
AIPlans/Qwen-HHH-Cipher-Eng
Text Generation • 0.5B • Updated • 6
AIPlans/Qwen-HHH-Sans-Eng
Text Generation • 0.5B • Updated • 19