alesiaivanova/Qwen-3b-GRPO-1-sub-2-sub-compute_tradeoff_100-v2
Updated
alesiaivanova/Qwen-7B-GRPO-math-1-sub-1024-lr-2e-6-2-sub-1536-lr-1e-6-3-sub-1536-lr-2e-6-4-sub-1536-lr-2e-6
Text Generation • 8B • Updated • 2
alesiaivanova/Qwen-3b-GRPO-1-sub-2-sub-3-sub-compute_tradeoff_50_25
Updated
alesiaivanova/Qwen-3b-GRPO-1-sub-2-sub-3-sub-compute_tradeoff_50_130_25
Updated
alesiaivanova/Llama-3B-GRPO-new-1-sub-main-2-sub-1024-v2
Updated
alesiaivanova/Llama-3B-GRPO-new-1-sub-main-2-sub-1024-3-sub-1536
Updated
alesiaivanova/Llama-3B-GRPO-new-1-sub-main-2-sub-1024-3-sub-1536-lr-3e-6
Updated
alesiaivanova/Llama-3B-GRPO-new-1-sub-main-2-sub-1024-3-sub-1536-lr-2e-6
Updated
alesiaivanova/Llama-3B-GRPO-new-1-sub-main-2-sub-1024-3-sub-1280-lr-2e-6
Updated
alesiaivanova/Llama-3B-GRPO-new-1-sub-main-2-sub-1024-3-sub-1024
Updated
alesiaivanova/Llama-3B-GRPO-new-1-sub-main-2-sub-1024-3-sub-1024-lr-2e-6
Updated
alesiaivanova/Llama-3B-GRPO-new-1-sub-2-sub-1024-v2
Updated
alesiaivanova/math_training
Updated
alesiaivanova/checkpoints
Updated
alesiaivanova/__pycache__
Updated
alesiaivanova/Qwen-7B-GRPO-math-1-sub-1024-lr-2e-6-2-sub-1536-lr-1e-6-3-sub-1536-lr-2e-6-4-sub-2048-lr-2e-6
Text Generation • 8B • Updated • 2
alesiaivanova/Qwen-7B-GRPO-math-1-sub-1024-16-gen-lr-2e-6-2-sub-1024-16-gen-lr-2e-6-v7
Text Generation • 8B • Updated • 2
alesiaivanova/Qwen-7B-GRPO-math-1-sub-1024-16-gen-lr-2e-6-2-sub-1024-16-gen-lr-2e-6-v8
Text Generation • 8B • Updated • 2
alesiaivanova/Qwen-7B-GRPO-math-1-sub-1024-16-gen-lr-2e-6-2-sub-1024-16-gen-lr-2e-6-v9
Text Generation • 8B • Updated • 2
alesiaivanova/Qwen-7B-GRPO-math-1-sub-1024-16-gen-lr-2e-6-2-sub-1024-16-gen-lr-2e-6-v10
Text Generation • 8B • Updated • 2
alesiaivanova/Qwen-3b-GRPO-1-subproblem-2-subproblems-stacked-v1
Updated
alesiaivanova/Qwen-3b-GRPO-1-sub-2-sub-long-fixed
Updated
alesiaivanova/Qwen-3b-GRPO-1-sub-2-sub-3-sub-long-fixed
Updated
alesiaivanova/Qwen-3b-GRPO-1-sub-2-sub-3-sub-4-sub-long-fixed
Updated
alesiaivanova/Qwen-3b-GRPO-1-sub-2-sub-3-sub-4-sub-5-sub-long-fixed
Updated
alesiaivanova/Qwen-3b-GRPO-1-sub-2-sub-3-sub-4-sub-16-gen-long-v1-5-sub-tuning-lr-5e-6
Updated
alesiaivanova/Qwen-3b-GRPO-1-sub-2-sub-3-sub-16-gen-long-v1-4-sub-tuning-lr-5e-6
Updated