tonyshelby/llama-ATBPO-noisy
Updated
tonyshelby/llama-QTBPO-noisy
Updated
tonyshelby/llama-TBPO-no-weight
Updated
tonyshelby/mistral-ATBPO-merged
Updated
tonyshelby/llama-ATBPO-merged
Updated
tonyshelby/llama-reverse-dpo-merged
Updated
tonyshelby/mistral-reverse-dpo-merged
Updated
tonyshelby/llama-QTBPO-merged
Updated
tonyshelby/llama-QTBPO-first-run
Updated
tonyshelby/mistral-QTBPO-merged
Updated
tonyshelby/mistral-QTBPO-third-run
Updated
tonyshelby/mistral-QTBPO-second-run
Updated
tonyshelby/mistral-QTBPO-first-run
Updated
tonyshelby/mistralai-Mistral-7B-Instruct-v0.1-ufb-sft-lora
Updated
tonyshelby/qwen2.5_7b_checkpoints
Updated
tonyshelby/qwen2.5_3b_checkpoints
Updated
tonyshelby/Qwen-R1-1.5B-Code-V1-step-69-merged
Text Generation
• 2B • Updated • 1
tonyshelby/Qwen-R1-1.5B-Code-V1
Updated
tonyshelby/Qwen-R1-1.5B-test10
Updated
tonyshelby/Qwen-R1-1.5B-test
Updated
tonyshelby/llama1B_SFT_sample_log
Updated
tonyshelby/Qwen2.5_1.5B_SFT_sample_log_step_4
Updated
tonyshelby/Qwen2.5_0.5B_TDPO
Text Generation
• 0.5B • Updated • 2
tonyshelby/Qwen2.5_0.5B_TDPO_step_8
Text Generation
• 0.5B • Updated • 2
tonyshelby/Qwen2.5_0.5B_TDPO_step_4
Text Generation
• 0.5B • Updated
tonyshelby/Qwen2.5_0.5B_DPO_sample
Text Generation
• 0.5B • Updated
tonyshelby/Qwen2.5_0.5B_SFT_sample
Text Generation
• 0.5B • Updated
tonyshelby/Llama_3.2_3B_gguf_final
3B • Updated • 6