ajagota71/llama-3-2-1b-rlhf-kl-p5-target-2p5-lr-3e-6-checkpoint-epoch-60
Reinforcement Learning
• 1B • Updated • 3
ajagota71/llama-3-2-1b-rlhf-kl-p5-target-2p5-lr-3e-6-checkpoint-epoch-40
Reinforcement Learning
• 1B • Updated • 2
ajagota71/llama-3-2-1b-rlhf-kl-p5-target-2p5-lr-3e-6-checkpoint-epoch-20
Reinforcement Learning
• 1B • Updated • 2
ajagota71/llama-3-2-1b-rlhf-kl-p4-target-3
Reinforcement Learning
• 1B • Updated • 2
ajagota71/llama-3-2-1b-rlhf-kl-p4-target-3-checkpoint-epoch-100
Reinforcement Learning
• 1B • Updated • 2
ajagota71/llama-3-2-1b-rlhf-kl-p4-target-3-checkpoint-epoch-80
Reinforcement Learning
• 1B • Updated • 2
ajagota71/llama-3-2-1b-rlhf-kl-p4-target-3-checkpoint-epoch-60
Reinforcement Learning
• 1B • Updated • 2
ajagota71/llama-3-2-1b-rlhf-kl-p4-target-3-checkpoint-epoch-40
Reinforcement Learning
• 1B • Updated • 2
ajagota71/llama-3-2-1b-rlhf-kl-p4-target-3-checkpoint-epoch-20
Reinforcement Learning
• 1B • Updated • 1
ajagota71/llama-3-2-1b-s-nlp-detox-checkpoint-epoch-80
Reinforcement Learning
• 1B • Updated
ajagota71/llama-3-2-1b-s-nlp-detox-checkpoint-epoch-60
Reinforcement Learning
• 1B • Updated • 1
ajagota71/llama-3-2-1b-s-nlp-detox-checkpoint-epoch-40
Reinforcement Learning
• 1B • Updated
ajagota71/llama-3-2-1b-s-nlp-detox-checkpoint-epoch-20
Reinforcement Learning
• 1B • Updated • 1
ajagota71/pythia-1b-s-nlp-detox
Reinforcement Learning
• 1B • Updated • 4
ajagota71/pythia-1b-s-nlp-detox-checkpoint-epoch-100
Reinforcement Learning
• 1B • Updated • 1
ajagota71/pythia-1b-s-nlp-detox-checkpoint-epoch-80
Reinforcement Learning
• 1B • Updated
ajagota71/pythia-1b-s-nlp-detox-checkpoint-epoch-60
Reinforcement Learning
• 1B • Updated
ajagota71/pythia-1b-s-nlp-detox-checkpoint-epoch-40
Reinforcement Learning
• 1B • Updated • 1
ajagota71/pythia-1b-s-nlp-detox-checkpoint-epoch-20
Reinforcement Learning
• 1B • Updated • 1
ajagota71/pythia-410m-s-nlp-detox
Reinforcement Learning
• 0.4B • Updated
ajagota71/pythia-410m-s-nlp-detox-checkpoint-epoch-100
Reinforcement Learning
• 0.4B • Updated
ajagota71/pythia-410m-s-nlp-detox-checkpoint-epoch-80
Reinforcement Learning
• 0.4B • Updated • 2
ajagota71/pythia-410m-s-nlp-detox-checkpoint-epoch-60
Reinforcement Learning
• 0.4B • Updated • 3
ajagota71/pythia-410m-s-nlp-detox-checkpoint-epoch-40
Reinforcement Learning
• 0.4B • Updated
ajagota71/pythia-410m-s-nlp-detox-checkpoint-epoch-20
Reinforcement Learning
• 0.4B • Updated
ajagota71/toxicity-reward-model-max-ent-70m-s-nlp-test-500-samples-temp-p7-pythia-70m
70.4M • Updated
ajagota71/toxicity-reward-model-max-ent-70m-s-nlp-test-500-samples-temp-p7-pythia-70m-checkpoint-30
70.4M • Updated
ajagota71/toxicity-reward-model-max-ent-70m-s-nlp-test-500-samples-pythia-70m
70.4M • Updated
ajagota71/toxicity-reward-model-max-ent-70m-s-nlp-test-500-samples-pythia-70m-checkpoint-30
70.4M • Updated
ajagota71/pythia-70m-s-nlp-detox
Reinforcement Learning
• 70.4M • Updated