AI & ML interests
LLMs
Organizations
None yet
ZHLiu627/verl_agent_webshop-new-GRPO-from-webshop-20step-v2-Qwen2.5-7B-Instruct-o16-t-120step
8B • Updated • 1
ZHLiu627/verl_agent_webshop-new-GRPO-webshop-20step-v2-Llama-3.1-8B-Instruct-o16-nt-15step
8B • Updated
ZHLiu627/verl_agent_webshop-new-GRPO-from-webshop-20step-v2-Qwen2.5-7B-Instruct-o16-t-105step
8B • Updated • 1
ZHLiu627/verl_agent_webshop-new-GRPO-from-webshop-20step-v2-Qwen2.5-7B-Instruct-o16-t-90step
8B • Updated • 1
ZHLiu627/verl_agent_webshop-new-GRPO-from-webshop-20step-v2-Qwen2.5-7B-Instruct-o16-t-75step
8B • Updated
ZHLiu627/verl_agent_webshop-new-GRPO-from-webshop-20step-v2-Qwen2.5-7B-Instruct-o16-t-60step
8B • Updated
ZHLiu627/verl_agent_webshop-new-GRPO-from-webshop-20step-v2-Qwen2.5-7B-Instruct-o16-t-45step
8B • Updated
ZHLiu627/verl_agent_webshop-new-GRPO-from-webshop-20step-v2-Qwen2.5-7B-Instruct-o16-t-30step
8B • Updated • 1
ZHLiu627/verl_agent_webshop-new-GRPO-from-webshop-20step-v2-Qwen2.5-7B-Instruct-only16-nothink-f-150step
8B • Updated • 1
ZHLiu627/verl_agent_webshop-new-GRPO-from-webshop-20step-v2-Qwen2.5-7B-Instruct-o16-t-15step
8B • Updated • 1
ZHLiu627/verl_agent_webshop-new-GRPO-from-webshop-20step-v2-Qwen2.5-7B-Instruct-only16-nothink-f-135step
8B • Updated • 1
ZHLiu627/verl_agent_webshop-new-GRPO-from-webshop-20step-v2-Llama-3.1-8B-Instruct-o16-t-150step
8B • Updated • 1
ZHLiu627/verl_agent_webshop-new-GRPO-from-webshop-20step-v2-Qwen2.5-7B-Instruct-only16-nothink-f-120step
8B • Updated • 1
ZHLiu627/verl_agent_webshop-new-GRPO-from-webshop-20step-v2-Llama-3.1-8B-Instruct-o16-t-135step
8B • Updated • 1
ZHLiu627/verl_agent_webshop-new-GRPO-from-webshop-20step-v2-Qwen2.5-7B-Instruct-only16-nothink-f-105step
8B • Updated • 1
ZHLiu627/verl_agent_webshop-new-GRPO-from-webshop-20step-v2-Llama-3.1-8B-Instruct-o16-t-120step
8B • Updated • 1
ZHLiu627/verl_agent_webshop-new-GRPO-from-webshop-20step-v2-Qwen2.5-7B-Instruct-only16-nothink-f-90step
8B • Updated
ZHLiu627/verl_agent_webshop-new-GRPO-from-webshop-20step-v2-Llama-3.1-8B-Instruct-o16-t-105step
8B • Updated
ZHLiu627/verl_agent_webshop-new-GRPO-from-webshop-20step-v2-Qwen2.5-7B-Instruct-only16-nothink-f-75step
8B • Updated
ZHLiu627/verl_agent_webshop-new-GRPO-from-webshop-20step-v2-Llama-3.1-8B-Instruct-o16-t-90step
8B • Updated
ZHLiu627/verl_agent_webshop-new-GRPO-from-webshop-20step-v2-Qwen2.5-7B-Instruct-only16-nothink-f-60step
8B • Updated
ZHLiu627/verl_agent_webshop-new-GRPO-from-webshop-20step-v2-Llama-3.1-8B-Instruct-o16-t-75step
8B • Updated
ZHLiu627/verl_agent_webshop-new-GRPO-from-webshop-20step-v2-Qwen2.5-7B-Instruct-only16-nothink-f-45step
8B • Updated • 1
ZHLiu627/verl_agent_webshop-new-GRPO-from-webshop-20step-v2-Llama-3.1-8B-Instruct-o16-t-60step
8B • Updated
ZHLiu627/verl_agent_webshop-new-GRPO-from-webshop-20step-v2-Qwen2.5-7B-Instruct-only16-nothink-f-30step
8B • Updated • 1
ZHLiu627/verl_agent_webshop-new-GRPO-from-webshop-20step-v2-Llama-3.1-8B-Instruct-o16-t-45step
8B • Updated
ZHLiu627/verl_agent_webshop-new-GRPO-from-webshop-20step-v2-Qwen2.5-7B-Instruct-only16-nothink-f-15step
8B • Updated • 1
ZHLiu627/verl_agent_webshop-new-GRPO-from-webshop-20step-v2-Llama-3.1-8B-Instruct-o16-t-30step
8B • Updated
ZHLiu627/verl_agent_webshop-new-GRPO-from-webshop-20step-v2-Llama-3.1-8B-Instruct-o16-t-15step
8B • Updated • 1
ZHLiu627/verl_agent_webshop-new-GRPO-from-webshop-20step-v2-Llama-3.1-8B-Instruct-only16-nothink-f-90step
8B • Updated • 1