AmberYifan/Qwen2.5-14B-Instruct-wildfeedback-RPO-iterDPO-iter2-4k
Text Generation
• 841k • Updated • 1
AmberYifan/Qwen2.5-7B-Instruct-wildfeedback-iterDPO-iter2-4k
Text Generation
• 333k • Updated • 1 • 1
AmberYifan/Qwen2.5-14B-Instruct-wildfeedback-RPO-iterDPO-iter1-4k
Text Generation
• 841k • Updated • 1
AmberYifan/Qwen2.5-7B-Instruct-wildfeedback-iterDPO-iter1-4k
Text Generation
• 333k • Updated • 1
AmberYifan/Qwen2.5-14B-Instruct-ultrafeedback-DRIFT-iter2-RPO
Text Generation
• 841k • Updated
AmberYifan/Qwen2.5-14B-Instruct-ultrafeedback-spin-iter2-RPO
Text Generation
• 841k • Updated • 3
AmberYifan/Qwen2.5-14B-Instruct-ultrafeedback-iterdpo-iter2-RPO
Text Generation
• 841k • Updated • 4 • 1
AmberYifan/Qwen2.5-14B-Instruct-ultrafeedback-iterdpo-iter1-RPO
Text Generation
• 841k • Updated
AmberYifan/Qwen2.5-14B-Instruct-ultrafeedback-spin-iter1-RPO
Text Generation
• 841k • Updated • 1
AmberYifan/Qwen2.5-14B-Instruct-ultrafeedback-drift-iter1-RPO
Text Generation
• 841k • Updated • 2
AmberYifan/Qwen2.5-14B-Instruct-wildfeedback-RPO-DRIFT-iter2-4k
Text Generation
• 841k • Updated • 1
AmberYifan/Qwen2.5-14B-Instruct-wildfeedback-RPO-DRIFT-iter1-4k
Text Generation
• 841k • Updated • 1
AmberYifan/Qwen2.5-7B-Instruct-wildfeedback-DRIFT-iter2-RPO
Text Generation
• 333k • Updated
AmberYifan/Qwen2.5-14B-Instruct-wildfeedback-RPO-DRIFT-iter2
Text Generation
• 841k • Updated
AmberYifan/Qwen2.5-14B-Instruct-wildfeedback-RPO-iterDPO-iter2
Text Generation
• 841k • Updated
AmberYifan/Qwen2.5-14B-Instruct-wildfeedback-RPO-SPIN-iter2
Text Generation
• 841k • Updated • 2
AmberYifan/Qwen2.5-14B-Instruct-wildfeedback-RPO-iterDPO-iter1
Text Generation
• 841k • Updated • 1
AmberYifan/Qwen2.5-14B-Instruct-wildfeedback-RPO-DRIFT-iter1
Text Generation
• 841k • Updated • 1
AmberYifan/Qwen2.5-14B-Instruct-wildfeedback-RPO-SPIN-iter1
Text Generation
• 841k • Updated • 1
AmberYifan/Llama-3.1-8B-Instruct-wildfeedback-RPO-DRIFT-iter1
Text Generation
• 266k • Updated
AmberYifan/Llama-3.1-8B-Instruct-wildfeedback-RPO-iterDPO-iter1
Text Generation
• 266k • Updated
AmberYifan/Llama-3.1-8B-Instruct-wildfeedback-RPO-SPIN-iter1
Text Generation
• 266k • Updated • 1
AmberYifan/Qwen2.5-14B-Instruct-wildfeedback-seed-RPO-0.001
Text Generation
• 841k • Updated • 2
AmberYifan/Llama-3.1-8B-Instruct-wildfeedback-seed-RPO-0.001
Text Generation
• 266k • Updated • 1
AmberYifan/Llama-3.1-8B-Instruct-wildfeedback-seed-RPO-1.5
Text Generation
• 266k • Updated
AmberYifan/Llama-3.1-8B-Instruct-wildfeedback-seed-RPO-0.01
Text Generation
• 266k • Updated • 1
AmberYifan/Llama-3.1-8B-Instruct-wildfeedback-seed-RPO-0.5
Text Generation
• 266k • Updated
AmberYifan/Llama-3.1-8B-Instruct-wildfeedback-seed-RPO
Text Generation
• 266k • Updated
AmberYifan/Llama-3.1-8B-Instruct-wildfeedback-DRIFT-iter2-re
Text Generation
• 8B • Updated
AmberYifan/Llama-3.1-8B-Instruct-wildfeedback-iterDPO-iter2
Text Generation
• 8B • Updated