Organizations
motigrez/codecontest_qwen2.5_72b_grpo_rollback_step100
motigrez/codecontest_llama3_70b_grpo
motigrez/codecontest_qwen2.5_72b_grpo
motigrez/codecontest_grpo_qwen2.5_32b_grpo_step80
motigrez/scienceworld_grpo_qwen2.5_7b_50_10_step50
motigrez/alfworld_qwen2.5_32b_regret_step300
motigrez/alfworld_qwen2.5_32b_regret_step150
motigrez/alfworld_qwen2.5_32b_vanilla_step100
motigrez/alfworld_qwen2.5_32b_vanilla_step50
motigrez/qwen2.5_72b_regret_global_step_200
motigrez/llama3.1_70b_regret_global_step_172
motigrez/qwen2.5_72b_grpo_global_step_114
motigrez/qwen2.5_72b_regret_global_step_100
motigrez/qwen2.5_32b_alfworld_il
motigrez/qwen2.5_14b_alfworld_il
motigrez/llama3.1_8b_alfworld_il
motigrez/qwen2.5_7b_alfworld_il
motigrez/qwen3_4b_step_reward_self_8k_300
4B • Updated
motigrez/qwen3_4b_step_reward_self_5k_231
4B • Updated • 2
motigrez/qwen2.5_7b_sft_plan_step_36
8B • Updated
motigrez/qwen3_4b_alfworld_step_reward_210
4B • Updated
motigrez/qwen25_7b_step_reward_thinking_180
8B • Updated
motigrez/qwen25_7b_step_reward_thinking_240
8B • Updated
motigrez/qwen25_7b_step_reward_180
8B • Updated
motigrez/qwen2_5_7b_plan_sft
motigrez/qwen2_5_7b_plan_state_temp0_nothink_180
8B • Updated • 1
motigrez/qwen2_5_7b_plan_state_temp0.7_nothink
8B • Updated • 1
motigrez/qwen2_5_7b_level2_easy_300
motigrez/qwen2_5_7b_level2_easy_240
8B • Updated
motigrez/qwen2_5_7b_level1
8B • Updated