citrinegui/Qwen2.5-1.5B-Instruct_countdown2345_grpo_cosine_0.5_0.5_True_50 • Updated
citrinegui/Qwen2.5-1.5B-Instruct_countdown2345_grpo_variance_regularized_0.5_0.5_0.1_1600 • Updated
citrinegui/Qwen2.5-1.5B-Instruct_countdown45_grpo_balanced_0.5_0.5_True_1600 • Text Generation • 2B • Updated • 2
citrinegui/Qwen2.5-1.5B-Instruct_countdown345_grpo_balanced_0.5_0.5_True_1600 • Text Generation • 2B • Updated • 4
citrinegui/Llama-3.2-3B-Instruct_countdown6_grpo_balanced_0.5_0.5_True_1600 • Text Generation • 3B • Updated • 2
citrinegui/Llama-3.2-3B-Instruct_countdown5_grpo_balanced_0.5_0.5_True_1600 • Text Generation • 3B • Updated • 2
citrinegui/Qwen2.5-3B-Instruct_countdown6_grpo_balanced_0.5_0.5_True_1600 • Text Generation • 3B • Updated • 2
citrinegui/Qwen2.5-3B-Instruct_countdown5_grpo_balanced_0.5_0.5_True_1600 • Text Generation • 3B • Updated • 1
citrinegui/Llama-3.2-3B-Instruct_countdown2345_grpo_gaussian_0.25_0.75_True_1600 • Text Generation • 3B • Updated • 1
citrinegui/Llama-3.2-3B-Instruct_countdown2345_grpo_gaussian_0.75_0.25_True_1600 • Text Generation • 3B • Updated • 1
citrinegui/Llama-3.2-3B-Instruct_countdown2345_grpo_balanced_0.5_0.5_True_1600 • Text Generation • 3B • Updated • 6
citrinegui/Llama-3.2-3B-Instruct_countdown2345_grpo_cosine_0.5_0.5_True_1600 • Text Generation • 3B • Updated • 2
citrinegui/Llama-3.2-3B-Instruct_countdown2345_grpo_classic_0.5_0.5_True_1600 • Text Generation • 3B • Updated • 1
citrinegui/Llama-3.2-3B-Instruct_countdown2345_grpo_gaussian_0.5_0.5_True_1600 • Text Generation • 3B • Updated • 1
citrinegui/Qwen2.5-3B-Instruct_countdown2345_grpo_gaussian_0.7_0.3_True_1600 • Updated
citrinegui/Qwen2.5-3B-Instruct_countdown2345_grpo_gaussian_0.75_0.25_True_1600 • Text Generation • 3B • Updated • 2
citrinegui/Qwen2.5-3B-Instruct_countdown2345_grpo_gaussian_0_25_0_75_True_1600 • Text Generation • 3B • Updated • 2
citrinegui/Qwen2.5-3B-Instruct_countdown2345_grpo_gaussian_0_5_0_5_True_1600 • Text Generation • 3B • Updated • 2
citrinegui/Qwen2.5-3B-Instruct_countdown2345_grpo_classic_0_5_0_5_True_1600 • Text Generation • 3B • Updated • 2
citrinegui/Qwen2.5-1.5B-Instruct_countdown2345_grpo_gaussian_0_75_0_25_True_1600 • Text Generation • 2B • Updated • 6
citrinegui/Qwen2.5-3B-Instruct_countdown2345_grpo_balanced_0_5_0_5_True_1600 • Text Generation • 3B • Updated
citrinegui/Qwen2.5-1.5B-Instruct_countdown2345_grpo_classic_0_5_0_5_True_1600 • Text Generation • 2B • Updated • 2
citrinegui/Qwen2.5-3B-Instruct_countdown2345_grpo_cosine_0_5_0_5_True_1600 • Text Generation • 3B • Updated • 2
citrinegui/Qwen2.5-1.5B-Instruct_countdown2345_grpo_gaussian_0_25_0_75_True_1600 • Text Generation • 2B • Updated • 2
citrinegui/Qwen2.5-1.5B-Instruct_countdown2345_grpo_balanced1600 • Updated
citrinegui/Qwen2.5-1.5B-Instruct_countdown2345_grpo_balanced_0.5_0.5_True_1600 • Text Generation • 2B • Updated • 5
citrinegui/Qwen2.5-1.5B-Instruct_countdown2345_grpo_cosine_0.5_0.5_True_1600 • Text Generation • 2B • Updated • 3
citrinegui/Qwen2.5-1.5B-Instruct_countdown6_grpo_balanced_0.5_0.5_True_1600 • Text Generation • 2B • Updated • 3
citrinegui/Qwen2.5-1.5B-Instruct_countdown5_grpo_balanced_0.5_0.5_True_1600 • Text Generation • 2B • Updated • 5
citrinegui/Qwen2.5-1.5B-Instruct_countdown2345_grpo_gaussian_0.5_0.5_True_1600 • Text Generation • 2B • Updated • 1