ajagota71/toxicity-reward-model-v-head-max-margin-seed-42-pythia-410m-checkpoint-70 • 0.4B • Updated May 13, 2025
ajagota71/toxicity-reward-model-v-head-max-margin-seed-400-pythia-160m-checkpoint-50 • 0.2B • Updated May 13, 2025
ajagota71/toxicity-reward-model-v-head-max-margin-seed-300-pythia-160m-checkpoint-50 • 0.2B • Updated May 13, 2025
ajagota71/toxicity-reward-model-v-head-max-margin-seed-200-pythia-160m • 0.2B • Updated May 13, 2025 • 1
ajagota71/toxicity-reward-model-v-head-max-margin-seed-200-pythia-160m-checkpoint-50 • 0.2B • Updated May 13, 2025
ajagota71/toxicity-reward-model-v-head-max-margin-seed-100-pythia-160m-checkpoint-50 • 0.2B • Updated May 13, 2025 • 1
ajagota71/toxicity-reward-model-v-head-max-margin-seed-42-pythia-160m • 0.2B • Updated May 13, 2025 • 1
ajagota71/toxicity-reward-model-v-head-max-margin-seed-42-pythia-160m-checkpoint-50 • 0.2B • Updated May 13, 2025
ajagota71/toxicity-reward-model-v-head-max-margin-seed-400-pythia-70m-checkpoint-30 • 70.4M • Updated May 13, 2025
ajagota71/toxicity-reward-model-v-head-max-margin-seed-300-pythia-70m-checkpoint-30 • 70.4M • Updated May 13, 2025
ajagota71/toxicity-reward-model-v-head-max-margin-seed-200-pythia-70m-checkpoint-30 • 70.4M • Updated May 13, 2025
ajagota71/toxicity-reward-model-v-head-max-margin-seed-100-pythia-70m-checkpoint-30 • 70.4M • Updated May 13, 2025
ajagota71/toxicity-reward-model-v-head-max-margin-seed-42-pythia-70m-checkpoint-30 • 70.4M • Updated May 13, 2025
ajagota71/70m-toxicity-reward-model-max-margin-epoch-100-test-hub-seed-100-v-head-pythia-70m • 70.4M • Updated May 13, 2025 • 1
ajagota71/70m-toxicity-reward-model-max-margin-epoch-100-test-hub-seed-100-v-head-pythia-70m-checkpoint-10 • 70.4M • Updated May 13, 2025
ajagota71/70m-toxicity-reward-model-max-margin-epoch-100-test-hub-v-head-pythia-70m • 70.4M • Updated May 13, 2025
ajagota71/70m-toxicity-reward-model-max-margin-epoch-100-test-hub-v-head-pythia-70m-checkpoint-10 • 70.4M • Updated May 13, 2025
ajagota71/pythia-410m-detox-irl-rlhf-seed-400 • Reinforcement Learning • 0.4B • Updated May 12, 2025 • 1
ajagota71/pythia-410m-detox-irl-rlhf-seed-300 • Reinforcement Learning • 0.4B • Updated May 12, 2025 • 1
ajagota71/pythia-410m-detox-irl-rlhf-seed-200 • Reinforcement Learning • 0.4B • Updated May 11, 2025 • 1
ajagota71/pythia-410m-detox-irl-rlhf-seed-100 • Reinforcement Learning • 0.4B • Updated May 11, 2025 • 1