MattBou00/llama-3-2-1b-detox_v1f_RRETRT_Again_ROUND1
Reinforcement Learning • 1B • Updated • 1
MattBou00/llama-3-2-1b-detox_v1f_RRETRT_Again_ROUND1-checkpoint-epoch-100
Reinforcement Learning • 1B • Updated • 1
MattBou00/llama-3-2-1b-detox_v1f_RRETRT_Again_ROUND1-checkpoint-epoch-80
Reinforcement Learning • 1B • Updated • 1
MattBou00/llama-3-2-1b-detox_v1f_RRETRT_Again_ROUND1-checkpoint-epoch-60
Reinforcement Learning • 1B • Updated • 1
MattBou00/llama-3-2-1b-detox_v1f_RRETRT_Again_ROUND1-checkpoint-epoch-40
Reinforcement Learning • 1B • Updated • 1
MattBou00/llama-3-2-1b-detox_v1f_RRETRT_Again_ROUND1-checkpoint-epoch-20
Reinforcement Learning • 1B • Updated • 1
MattBou00/llama-3-2-1b-detox_v1f_RRETRT_Again_ROUND3-reward-2025-09-22_11-15-41
Updated
MattBou00/llama-3-2-1b-detox_v1f_RRETRT_Again_ROUND3
Reinforcement Learning • 1B • Updated • 1
MattBou00/llama-3-2-1b-detox_v1f_RRETRT_Again_ROUND3-checkpoint-epoch-100
Reinforcement Learning • 1B • Updated • 1
MattBou00/llama-3-2-1b-detox_v1f_RRETRT_Again_ROUND3-checkpoint-epoch-80
Reinforcement Learning • 1B • Updated • 1
MattBou00/llama-3-2-1b-detox_v1f_RRETRT_Again_ROUND3-checkpoint-epoch-60
Reinforcement Learning • 1B • Updated • 1
MattBou00/llama-3-2-1b-detox_v1f_RRETRT_Again_ROUND3-checkpoint-epoch-40
Reinforcement Learning • 1B • Updated • 1
MattBou00/llama-3-2-1b-detox_v1f_RRETRT_Again_ROUND3-checkpoint-epoch-20
Reinforcement Learning • 1B • Updated • 1
MattBou00/llama-3-2-1b-detox_v1f_RRETRT_Again_ROUND5-reward-2025-09-22_10-46-42
Updated
MattBou00/llama-3-2-1b-detox_v1f_RRETRT_Again_ROUND5
Reinforcement Learning • 1B • Updated • 1
MattBou00/llama-3-2-1b-detox_v1f_RRETRT_Again_ROUND5-checkpoint-epoch-100
Reinforcement Learning • 1B • Updated
MattBou00/llama-3-2-1b-detox_v1f_RRETRT_Again_ROUND5-checkpoint-epoch-80
Reinforcement Learning • 1B • Updated • 1
MattBou00/llama-3-2-1b-detox_v1f_RRETRT_Again_ROUND5-checkpoint-epoch-60
Reinforcement Learning • 1B • Updated • 1
MattBou00/llama-3-2-1b-detox_v1f_RRETRT_Again_ROUND5-checkpoint-epoch-40
Reinforcement Learning • 1B • Updated • 1
MattBou00/llama-3-2-1b-detox_v1f_RRETRT_Again_ROUND5-checkpoint-epoch-20
Reinforcement Learning • 1B • Updated • 1
MattBou00/llama-3-2-1b-detox_RETRY_SAMPLING_scale10_Round3-reward-2025-09-22_09-55-45
Updated
MattBou00/llama-3-2-1b-detox_RETRY_SAMPLING_scale10_Round3
Reinforcement Learning • 1B • Updated • 1
MattBou00/llama-3-2-1b-detox_RETRY_SAMPLING_scale10_Round3-checkpoint-epoch-100
Reinforcement Learning • 1B • Updated • 1
MattBou00/llama-3-2-1b-detox_RETRY_SAMPLING_scale10_Round3-checkpoint-epoch-80
Reinforcement Learning • 1B • Updated • 1
MattBou00/llama-3-2-1b-detox_RETRY_SAMPLING_scale10_Round3-checkpoint-epoch-60
Reinforcement Learning • 1B • Updated • 1
MattBou00/llama-3-2-1b-detox_RETRY_SAMPLING_scale10_Round3-checkpoint-epoch-40
Reinforcement Learning • 1B • Updated • 1
MattBou00/llama-3-2-1b-detox_RETRY_SAMPLING_scale10_Round3-checkpoint-epoch-20
Reinforcement Learning • 1B • Updated • 1
MattBou00/llama-3-2-1b-detox_RETRY_scale10_Round1-reward-2025-09-19_14-40-16
Updated
MattBou00/llama-3-2-1b-detox_RETRY_scale10_Round1
Reinforcement Learning • 1B • Updated • 1
MattBou00/llama-3-2-1b-detox_RETRY_scale10_Round1-checkpoint-epoch-100
Reinforcement Learning • 1B • Updated • 1