ajagota71/SmolLM-360M-detox-checkpoint-epoch-100 Reinforcement Learning • 0.4B • Updated Aug 15, 2025 • 1
ajagota71/SmolLM-135M-detox-checkpoint-epoch-100 Reinforcement Learning • 0.1B • Updated Aug 14, 2025 • 1
ajagota71/SmolLM-360M-detox-checkpoint-epoch-80 Reinforcement Learning • 0.4B • Updated Aug 14, 2025 • 1
ajagota71/SmolLM-135M-detox-checkpoint-epoch-80 Reinforcement Learning • 0.1B • Updated Aug 14, 2025 • 1
ajagota71/SmolLM-360M-detox-checkpoint-epoch-60 Reinforcement Learning • 0.4B • Updated Aug 14, 2025 • 1
ajagota71/SmolLM-360M-detox-checkpoint-epoch-40 Reinforcement Learning • 0.4B • Updated Aug 14, 2025 • 1
ajagota71/SmolLM-135M-detox-checkpoint-epoch-40 Reinforcement Learning • 0.1B • Updated Aug 14, 2025 • 1
ajagota71/SmolLM-135M-detox-checkpoint-epoch-20 Reinforcement Learning • 0.1B • Updated Aug 14, 2025 • 1
ajagota71/toxicity-reward-model-v-head-prompt-output-max-margin-seed-42-pythia-1b-final 1B • Updated Aug 3, 2025
ajagota71/toxicity-reward-model-v-head-prompt-output-max-margin-seed-42-pythia-1b-checkpoint-40 1B • Updated Aug 3, 2025
ajagota71/toxicity-reward-model-v-head-prompt-output-max-margin-seed-42-pythia-1b-checkpoint-39 1B • Updated Aug 3, 2025
ajagota71/toxicity-reward-model-v-head-prompt-output-max-margin-seed-42-pythia-1b-checkpoint-34 1B • Updated Aug 3, 2025
ajagota71/toxicity-reward-model-v-head-prompt-output-max-margin-seed-42-pythia-1b-checkpoint-29 1B • Updated Aug 3, 2025
ajagota71/toxicity-reward-model-v-head-prompt-output-max-margin-seed-42-pythia-1b-checkpoint-24 1B • Updated Aug 3, 2025
ajagota71/toxicity-reward-model-p8-v-head-prompt-output-max-margin-seed-42-llama-3.2-1b-final 1B • Updated Aug 3, 2025 • 2
ajagota71/toxicity-reward-model-p8-v-head-prompt-output-max-margin-seed-42-llama-3.2-1b-checkpoint-40 1B • Updated Aug 3, 2025
ajagota71/toxicity-reward-model-p8-v-head-prompt-output-max-margin-seed-42-llama-3.2-1b-checkpoint-39 1B • Updated Aug 3, 2025
ajagota71/toxicity-reward-model-v-head-prompt-output-max-margin-seed-42-pythia-1b-checkpoint-19 1B • Updated Aug 3, 2025
ajagota71/toxicity-reward-model-p8-v-head-prompt-output-max-margin-seed-42-llama-3.2-1b-checkpoint-34 1B • Updated Aug 3, 2025
ajagota71/toxicity-reward-model-v-head-prompt-output-max-margin-seed-42-pythia-1b-checkpoint-14 1B • Updated Aug 3, 2025
ajagota71/toxicity-reward-model-p8-v-head-output-max-margin-seed-42-pythia-70m-final 70.4M • Updated Aug 3, 2025
ajagota71/toxicity-reward-model-p8-v-head-output-max-margin-seed-42-pythia-70m-checkpoint-40 70.4M • Updated Aug 3, 2025
ajagota71/toxicity-reward-model-p8-v-head-output-max-margin-seed-42-pythia-70m-checkpoint-39 70.4M • Updated Aug 3, 2025
ajagota71/toxicity-reward-model-p8-v-head-output-max-margin-seed-42-pythia-70m-checkpoint-34 70.4M • Updated Aug 3, 2025
ajagota71/toxicity-reward-model-p8-v-head-output-max-margin-seed-42-pythia-70m-checkpoint-29 70.4M • Updated Aug 3, 2025
ajagota71/toxicity-reward-model-p8-v-head-output-max-margin-seed-42-pythia-70m-checkpoint-24 70.4M • Updated Aug 3, 2025