Models

3,285

Active filters: ppo

MattBou00/llama-3-2-1b-detox_v1d-checkpoint-epoch-60

Reinforcement Learning • 1B • Updated Aug 19, 2025

MattBou00/llama-3-2-1b-detox_v1e-checkpoint-epoch-20

Reinforcement Learning • 1B • Updated Aug 20, 2025

MattBou00/llama-3-2-1b-detox_v1e-checkpoint-epoch-40

Reinforcement Learning • 1B • Updated Aug 20, 2025

MattBou00/llama-3-2-1b-detox_v1e-checkpoint-epoch-60

Reinforcement Learning • 1B • Updated Aug 20, 2025

MattBou00/llama-3-2-1b-detox_v1f-checkpoint-epoch-20

Reinforcement Learning • 1B • Updated Sep 18, 2025 • 1

MattBou00/llama-3-2-1b-detox_v1f-checkpoint-epoch-40

Reinforcement Learning • 1B • Updated Sep 18, 2025

MattBou00/llama-3-2-1b-detox_v1f-checkpoint-epoch-60

Reinforcement Learning • 1B • Updated Sep 18, 2025 • 1

MattBou00/llama-3-2-1b-detox_v1f-checkpoint-epoch-80

Reinforcement Learning • 1B • Updated Aug 20, 2025 • 1

MattBou00/llama-3-2-1b-detox_v1f-checkpoint-epoch-100

Reinforcement Learning • 1B • Updated Aug 20, 2025

MattBou00/llama-3-2-1b-detox_v1f

Reinforcement Learning • 1B • Updated Aug 20, 2025

MattBou00/llama-3-2-1b-detox_v1f_round1-checkpoint-epoch-20

Reinforcement Learning • 1B • Updated Aug 20, 2025

MattBou00/llama-3-2-1b-detox_v1f_round1-checkpoint-epoch-40

Reinforcement Learning • 1B • Updated Aug 20, 2025

MattBou00/llama-3-2-1b-detox_v1f_round1-checkpoint-epoch-60

Reinforcement Learning • 1B • Updated Aug 20, 2025 • 1

MattBou00/llama-3-2-1b-detox_v1f_round1-checkpoint-epoch-80

Reinforcement Learning • 1B • Updated Aug 20, 2025

MattBou00/llama-3-2-1b-detox_v1f_round1-checkpoint-epoch-100

Reinforcement Learning • 1B • Updated Aug 20, 2025

MattBou00/llama-3-2-1b-detox_v1f_round1

Reinforcement Learning • 1B • Updated Aug 20, 2025

nardit/LunarLander-v2

Reinforcement Learning • Updated Aug 20, 2025

jmartin233/ppo-LunarLander-v2-unit8

Reinforcement Learning • Updated Aug 20, 2025

PrParadoxy/ppo-CartPole-v1

Reinforcement Learning • Updated Aug 21, 2025

jhu-clsp/mmBERT-small

Fill-Mask • Updated Oct 17, 2025 • 41.6k • 78

MattBou00/llama-3-2-1b-detox_v1f_round2-checkpoint-epoch-20

Reinforcement Learning • 1B • Updated Aug 21, 2025 • 2

MattBou00/llama-3-2-1b-detox_v1f_round2-checkpoint-epoch-40

Reinforcement Learning • 1B • Updated Aug 21, 2025 • 5

MattBou00/llama-3-2-1b-detox_v1f_round2-checkpoint-epoch-60

Reinforcement Learning • 1B • Updated Aug 21, 2025

MattBou00/llama-3-2-1b-detox_v1f_round2-checkpoint-epoch-80

Reinforcement Learning • 1B • Updated Aug 21, 2025

MattBou00/llama-3-2-1b-detox_v1f_round2-checkpoint-epoch-100

Reinforcement Learning • 1B • Updated Aug 21, 2025 • 1

MattBou00/llama-3-2-1b-detox_v1f_round2

Reinforcement Learning • 1B • Updated Aug 21, 2025

MattBou00/llama-3-2-1b-detox_v1f_round3-checkpoint-epoch-20

Reinforcement Learning • 1B • Updated Aug 21, 2025

MattBou00/llama-3-2-1b-detox_v1f_round3-checkpoint-epoch-40

Reinforcement Learning • 1B • Updated Aug 21, 2025

MattBou00/llama-3-2-1b-detox_v1f_round3-checkpoint-epoch-60

Reinforcement Learning • 1B • Updated Aug 21, 2025

MattBou00/llama-3-2-1b-detox_v1f_round3-checkpoint-epoch-80

Reinforcement Learning • 1B • Updated Aug 21, 2025 • 1