Truong D Nguyen PRO

tonyshelby

tonyshelby's models (41)

tonyshelby/llama-ATBPO-noisy

tonyshelby/llama-QTBPO-noisy

tonyshelby/llama-TBPO-no-weight

tonyshelby/mistral-ATBPO-merged

tonyshelby/llama-ATBPO-merged

tonyshelby/llama-ATBPO

tonyshelby/llama-reverse-dpo-merged

tonyshelby/mistral-reverse-dpo-merged

tonyshelby/llama-QTBPO-merged

tonyshelby/llama-QTBPO-first-run

tonyshelby/llama-sft-merged

tonyshelby/mistral-QTBPO-merged

tonyshelby/mistral-QTBPO-third-run

tonyshelby/mistral-sft-merged

Text Generation • 7B • Updated Jan 11 • 12

tonyshelby/mistral-QTBPO-second-run

tonyshelby/mistral-QTBPO-first-run

tonyshelby/mistralai-Mistral-7B-Instruct-v0.1-ufb-sft-lora

tonyshelby/qwen2.5_7b_checkpoints

Updated Dec 24, 2025

tonyshelby/qwen2.5_3b_checkpoints

Updated Dec 24, 2025

tonyshelby/Qwen-R1-1.5B-Code-V1-step-69-merged

Text Generation • 2B • Updated Dec 9, 2025 • 3

tonyshelby/Qwen-R1-1.5B-Code-V1

Updated Dec 9, 2025

tonyshelby/Qwen-R1-1.5B-test10

Updated Dec 9, 2025

tonyshelby/Qwen-R1-1.5B-test

Updated Dec 9, 2025

tonyshelby/model

Updated May 28, 2025

tonyshelby/llama1B_SFT_sample_log

Updated May 27, 2025

tonyshelby/Qwen2.5_1.5B_SFT_sample_log_step_4

Updated May 25, 2025

tonyshelby/Qwen2.5_0.5B_TDPO

Text Generation • 0.5B • Updated May 10, 2025 • 5

tonyshelby/Qwen2.5_0.5B_TDPO_step_8

Text Generation • 0.5B • Updated May 10, 2025 • 5

tonyshelby/Qwen2.5_0.5B_TDPO_step_4

Text Generation • 0.5B • Updated May 10, 2025 • 6

tonyshelby/Qwen2.5_0.5B_DPO_sample

Text Generation • 0.5B • Updated May 7, 2025 • 4