Pamela153/PPO-Qwen1.5B-tw-tiny-w2-o3-q4-iter12-param-9
Pamela153/PPO-Qwen1.5B-tw-tiny-w2-o3-q4-iter12-param-6
Pamela153/PPO-Qwen1.5B-tw-tiny-w2-o3-q4-iter12-param-7
Pamela153/PPO-Qwen1.5B-tw-tiny-w2-o3-q4-iter12-param-2
Pamela153/PPO-Qwen7B-alfworld-resume-textworld-w4-o6-q8-iter24
Pamela153/PPO-Qwen7B-alfworld-iter24-sft-30-data-final
Pamela153/PPO-Qwen7B-tw-small-w4-o6-q8-iter24-final
Pamela153/PPO-Qwen7B-textworld-w4-o6-q8-resume-alfworld-iter24
Pamela153/PPO-Qwen1.5B-tw-tiny-w2-o3-q4-iter12-temp1.0-param-1
Pamela153/PPO-Qwen1.5B-tw-tiny-w2-o3-q4-iter12-param-5
Pamela153/rloo-Qwen7B-tw-simple-sparse-iter24
Pamela153/PPO-Qwen7B-alfworld-iter16-sft-100-data
Pamela153/rloo-Qwen7B-tw-simple-balanced-iter24
Pamela153/PPO-Qwen7B-tw-simple-dense-iter24-final-4
Pamela153/PPO-Qwen7B-tw-simple-dense-iter24-final-2
Pamela153/PPO-Qwen7B-tw-simple-sparse-iter24-final
Pamela153/PPO-Qwen1.5B-tiny-mixed-iter16-param-1
Pamela153/PPO-Qwen1.5B-tw-tiny-w2-o3-q4-iter6-param-1
Pamela153/PPO-Qwen1.5B-tw-tiny-w2-o3-q4-iter8-param-1
Pamela153/PPO-Qwen1.5B-tw-tiny-w2-o3-q4-iter16-param-1
Pamela153/rloo-Qwen7B-tw-simple-dense-iter24
Pamela153/PPO-Qwen1.5B-tw-tiny-both-w8-o12-q4-iter16-param-1
Pamela153/SFT-Qwen7B-alfworld-100-data
Pamela153/PPO-Qwen7B-tw-simple-dense-iter24-final-3
Pamela153/Qwen1.5B-tw-tiny-rooms-w8-o3-q4-iter16-seed-1
Pamela153/PPO-Qwen7B-tw-small-w4-o6-q8-iter24-default
Pamela153/rloo-Qwen1.5B-tw-small-w4-o6-q8-iter24-param-3