Active filters: ppo
MattBou00/smolLM-360m-detox_try_3_stable • Reinforcement Learning • 0.4B • Updated • 1
MattBou00/smolLM-360m-detox_try_3_stable_retry-ckpt-ep20-2025-08-18_18-34-45 • Reinforcement Learning • 0.4B • Updated • 1
MattBou00/smolLM-360m-detox_try_3_stable_retry-ckpt-ep40-2025-08-18_18-34-45 • Reinforcement Learning • 0.4B • Updated • 1
MattBou00/smolLM-360m-detox_try_3_stable_retry • Reinforcement Learning • 0.4B • Updated • 1
MattBou00/smolLM-360m-detox_try_4_closekl-ckpt-ep20-2025-08-18_18-50-03 • Reinforcement Learning • 0.4B • Updated • 1
MattBou00/smolLM-360m-detox_try_4_closekl-ckpt-ep40-2025-08-18_18-50-03 • Reinforcement Learning • 0.4B • Updated • 2
MattBou00/smolLM-360m-detox_try_4_closekl • Reinforcement Learning • 0.4B • Updated
MattBou00/smolLM-135-detox_first • Reinforcement Learning • 0.1B • Updated • 1
MattBou00/smolLM-135m-detox_same_as_larger • Reinforcement Learning • 0.1B • Updated • 1
MattBou00/llama-3-2-1b-detox_v1-checkpoint-epoch-20 • Reinforcement Learning • 1B • Updated • 1
MattBou00/llama-3-2-1b-detox_v1b-checkpoint-epoch-20 • Reinforcement Learning • 1B • Updated • 1
MattBou00/llama-3-2-1b-detox_v1b-checkpoint-epoch-40 • Reinforcement Learning • 1B • Updated • 1
MattBou00/llama-3-2-1b-detox_v1b-checkpoint-epoch-60 • Reinforcement Learning • 1B • Updated • 1
MattBou00/llama-3-2-1b-detox_v1c-checkpoint-epoch-20 • Reinforcement Learning • 1B • Updated • 1
MattBou00/llama-3-2-1b-detox_v1c-checkpoint-epoch-40 • Reinforcement Learning • 1B • Updated • 1
MattBou00/llama-3-2-1b-detox_v1c-checkpoint-epoch-60 • Reinforcement Learning • 1B • Updated • 1
MattBou00/llama-3-2-1b-detox_v1d-checkpoint-epoch-20 • Reinforcement Learning • 1B • Updated • 1
MattBou00/llama-3-2-1b-detox_v1d-checkpoint-epoch-40 • Reinforcement Learning • 1B • Updated • 2
MattBou00/llama-3-2-1b-detox_v1d-checkpoint-epoch-60 • Reinforcement Learning • 1B • Updated • 1
MattBou00/llama-3-2-1b-detox_v1e-checkpoint-epoch-20 • Reinforcement Learning • 1B • Updated • 1
MattBou00/llama-3-2-1b-detox_v1e-checkpoint-epoch-40 • Reinforcement Learning • 1B • Updated
MattBou00/llama-3-2-1b-detox_v1e-checkpoint-epoch-60 • Reinforcement Learning • 1B • Updated • 1
MattBou00/llama-3-2-1b-detox_v1f-checkpoint-epoch-20 • Reinforcement Learning • 1B • Updated • 1
MattBou00/llama-3-2-1b-detox_v1f-checkpoint-epoch-40 • Reinforcement Learning • 1B • Updated • 1
MattBou00/llama-3-2-1b-detox_v1f-checkpoint-epoch-60 • Reinforcement Learning • 1B • Updated • 1
MattBou00/llama-3-2-1b-detox_v1f-checkpoint-epoch-80 • Reinforcement Learning • 1B • Updated • 1
MattBou00/llama-3-2-1b-detox_v1f-checkpoint-epoch-100 • Reinforcement Learning • 1B • Updated • 3
MattBou00/llama-3-2-1b-detox_v1f • Reinforcement Learning • 1B • Updated • 1
MattBou00/llama-3-2-1b-detox_v1f_round1-checkpoint-epoch-20 • Reinforcement Learning • 1B • Updated • 1
MattBou00/llama-3-2-1b-detox_v1f_round1-checkpoint-epoch-40 • Reinforcement Learning • 1B • Updated • 1