Truong D Nguyen PRO
tonyshelby
models 39
tonyshelby/llama-ATBPO-noisy
tonyshelby/llama-QTBPO-noisy
tonyshelby/llama-TBPO-no-weight
tonyshelby/mistral-ATBPO-merged
tonyshelby/llama-ATBPO-merged
tonyshelby/llama-ATBPO
tonyshelby/llama-reverse-dpo-merged
tonyshelby/mistral-reverse-dpo-merged
tonyshelby/llama-QTBPO-merged
tonyshelby/llama-QTBPO-first-run
datasets 36
tonyshelby/llama3-ultrafeedback-armorm-noisy-0.1
Viewer • Updated • 61.8k • 7
tonyshelby/llama3-ultrafeedback-armorm-noisy
Viewer • Updated • 61.8k • 24
tonyshelby/ultra-feedback-tisdpo-llama
Viewer • Updated • 63.1k • 25
tonyshelby/processed_data
Preview • Updated • 2
tonyshelby/ultra-feedback-tisdpo-mistral
Updated • 2
tonyshelby/ultrafeedback_binarized_reversed
Viewer • Updated • 187k • 6
tonyshelby/view
Viewer • Updated • 1.02k • 2
tonyshelby/AIME25-ER-verl
Viewer • Updated • 30 • 2
tonyshelby/Archer2.0-Math-1.5B-ER
Viewer • Updated • 70.8k • 8
tonyshelby/Archer2.0-Math-1.5B-VeRL
Viewer • Updated • 70.8k • 2