Collections
Discover the best community collections!
Collections trending this week
-
RL + Transformer = A General-Purpose Problem Solver
Paper • 2501.14176 • Published • 28
Towards General-Purpose Model-Free Reinforcement Learning
Paper • 2501.16142 • Published • 31
SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training
Paper • 2501.17161 • Published • 124
MaxInfoRL: Boosting exploration in reinforcement learning through information gain maximization
Paper • 2412.12098 • Published • 4
-
Vortex5/Crimson-Constellation-12B
Text Generation • 12B • Updated • 82 • 6
Vortex5/Azure-Starlight-12B
Text Generation • 12B • Updated • 87 • 6
DreadPoor/Famino-12B-Model_Stock
Text Generation • 12B • Updated • 69 • 22
PocketDoc/Dans-PersonalityEngine-V1.3.0-12b
Text Generation • 12B • Updated • 156 • 39