kevinshin/qwen3-1.7b-rpo-lr-1e-4-alpha-1-beta-0.1-wc-cw-3k-neg-rethink-pos
Text Generation
• 2B • Updated • 1
kevinshin/qwen3-1.7b-rpo-lr-1e-6-alpha-0.1-beta-0.01-wc-cw-3k-neg-rethink-pos
Text Generation
• 2B • Updated • 1
kevinshin/qwen3-1.7b-rpo-lr-1e-5-alpha-1-beta-0.01-wc-cw-3k-neg-rethink-pos
Text Generation
• 2B • Updated
kevinshin/qwen2.5-1.5b-rft-rpo-lr-1e-5-alpha-1-beta-0.01-wc-cw-3k-neg-rethink-pos
Text Generation
• 2B • Updated • 1
kevinshin/qwen2.5-1.5b-rft-rpo-lr-1e-5-alpha-0.1-beta-0.01-wc-cw-3k-neg-rethink-pos
Text Generation
• 2B • Updated • 3
kevinshin/qwen2.5-1.5b-rft-rpo-lr-1e-5-alpha-0.1-beta-0.1-wc-cw-3k-neg-rethink-pos
Text Generation
• 2B • Updated • 1
kevinshin/qwen3-1.7b-rpo-lr-1e-5-alpha-1-beta-0.1-wc-cw-3k-neg-rethink-pos
Text Generation
• 2B • Updated • 2
kevinshin/qwen3-1.7b-rpo-lr-1e-6-alpha-1-beta-0.01-wc-cw-3k-neg-rethink-pos
Text Generation
• 2B • Updated • 1
kevinshin/qwen3-1.7b-rpo-lr-1e-5-alpha-0.1-beta-0.1-wc-cw-3k-neg-rethink-pos
Text Generation
• 2B • Updated • 2
kevinshin/qwen3-1.7b-rpo-lr-1e-6-alpha-1-beta-0.1-wc-cw-3k-neg-rethink-pos
Text Generation
• 2B • Updated • 1
kevinshin/qwen3-1.7b-rpo-lr-1e-6-alpha-0.1-beta-0.1-wc-cw-3k-neg-rethink-pos
Text Generation
• 2B • Updated • 2
kevinshin/qwen3-1.7b-rpo-lr-1e-5-alpha-0.1-beta-0.01-wc-cw-3k-neg-rethink-pos
Text Generation
• 2B • Updated • 2
kevinshin/qwen3-1.7b-sft-wildchat-cw-3k-neg-rethink-pos-sft-rethink-pos
Text Generation
• 2B • Updated • 2
kevinshin/qwen2.5-1.5b-it-rft-sft-wildchat-cw-3k-neg-rethink-pos-sft-rethink-pos
Text Generation
• 2B • Updated • 1
kevinshin/qwen2.5-1.5b-it-rft-sft-wildchat-cw-3k-neg-rethink-pos
Text Generation
• 2B • Updated • 1
kevinshin/qwen3-1.7b-sft-wildchat-cw-3k-neg-rethink-pos
Text Generation
• 2B • Updated • 2
kevinshin/qwen2.5-1.5b-rft-rpo-beta-0.01-epoch-1-alpha-0.1-wc-cw-3k-rethink-pos
Text Generation
• 2B • Updated
kevinshin/qwen2.5-1.5b-rft-rpo-beta-0.1-epoch-1-alpha-1-wc-cw-3k-rethink-pos
Text Generation
• 2B • Updated • 1
kevinshin/qwen2.5-1.5b-rft-rpo-beta-0.1-epoch-1-alpha-0.1-wc-cw-3k-rethink-pos
Text Generation
• 2B • Updated • 2
kevinshin/qwen3-1.7b-rpo-lr-1e-6-beta-0.1-epoch-1-alpha-0.1-wc-cw-3k-rethink-pos
Text Generation
• 2B • Updated • 2
kevinshin/qwen2.5-1.5b-rft-rpo-beta-0.01-epoch-1-alpha-1-wc-cw-3k-rethink-pos
Text Generation
• 2B • Updated • 1
kevinshin/err_qwen3-1.7b-rpo-lr-1e-6-beta-0.1-epoch-1-alpha-0.1-wc-cw-3k-rethink-pos
Text Generation
• 2B • Updated • 1
kevinshin/qwen3-1.7b-rpo-lr-1e-6-beta-0.01-epoch-1-alpha-1-wc-cw-3k-rethink-pos
Text Generation
• 2B • Updated
kevinshin/qwen3-1.7b-rpo-lr-1e-6-beta-0.01-epoch-1-alpha-0.1-wc-cw-3k-rethink-pos
Text Generation
• 2B • Updated • 1
kevinshin/qwen3-1.7b-rpo-lr-1e-6-beta-0.1-epoch-1-alpha-1-wc-cw-3k-rethink-pos
Text Generation
• 2B • Updated • 2
kevinshin/qwen3-1.7b-critique-wildchat-cw-3k-rethink-pos
Text Generation
• 2B • Updated • 1
kevinshin/qwen2.5-1.5b-it-rft-critique-wildchat-cw-3k-rethink-pos
Text Generation
• 2B • Updated
kevinshin/qwen3-1.7b-critique-lr-1e-5-batch-16-epoch-2-mask-neg-reas-neg-ans-wildchat-cw-3k-rethink
Text Generation
• 2B • Updated
kevinshin/qwen3-1.7b-critique-lr-1e-5-batch-16-epoch-1-mask-neg-reas-neg-ans-wildchat-cw-3k-rethink
Text Generation
• 2B • Updated
kevinshin/qwen3-1.7b-base-critique-lr-1e-5-batch-16-epoch-1-no-mask-wildchat-cw-3k
Text Generation
• 2B • Updated • 3