Predict human preferences for LLM responses.
Binfeng Xu
billxbf
AI & ML interests
evolving back to apes
Recent Activity
updated a model about 15 hours ago: billxbf/qwen3.5-4b-opencode-polar
published a model about 15 hours ago: billxbf/qwen3.5-4b-opencode-polar
updated a model 2 days ago: billxbf/qwen3.5-4b-qwencode-polar
Organizations
models 20
billxbf/qwen3.5-4b-opencode-polar
4B • Updated
billxbf/qwen3.5-4b-qwencode-polar
4B • Updated • 109
billxbf/qwen3.5-4b-claudecode-polar
4B • Updated • 13
billxbf/qwen3.5-4b-codex-polar-step72
Reinforcement Learning • 5B • Updated • 23
billxbf/zephyr-7b-dpo-iter1
Text Generation • 274k • Updated • 2
billxbf/zephyr-7b-dpo-iter3
Text Generation • 266k • Updated • 4
billxbf/zephyr-7b-dpo-iter2
Text Generation • 266k • Updated • 1
billxbf/Nano-Raccoon-Preview-1104
425k • Updated • 2
billxbf/zephyr-7b-sft-iter3
Text Generation • 266k • Updated • 3
billxbf/zephyr-7b-sft-iter2
Text Generation • 266k • Updated • 2
datasets 20
billxbf/math_pile_v3
Viewer • Updated • 1.52M • 86
billxbf/ultrafeedback-dpo-iter3
Viewer • Updated • 20.4k • 16
billxbf/ultrafeedback-dpo-iter1
Viewer • Updated • 20.4k • 2
billxbf/ultrafeedback-dpo-iter2
Viewer • Updated • 20.4k • 1
billxbf/ultrafeedback-sft-iter3
Viewer • Updated • 20.4k • 4
billxbf/ultrafeedback-sft-iter2
Viewer • Updated • 20.4k • 2
billxbf/ultrafeedback-sft-iter1
Viewer • Updated • 20.4k • 6
billxbf/verified100-chitchat
Viewer • Updated • 100 • 8
billxbf/verified100-lite
Viewer • Updated • 100 • 13
billxbf/verified100
Viewer • Updated • 100 • 4