-
DeepSeek R1 Chat Assistant Web Search • 260
-
bigcode/starcoderbase
Text Generation • Updated • 56 • 416 -
deepseek-ai/DeepSeek-R1-Zero
Text Generation • Updated • 5.86k • 947 -
microsoft/Phi-4-multimodal-instruct
Automatic Speech Recognition • 6B • Updated • 300k • 1.58k
Collections
Discover the best community collections!
Collections trending this week
-
RL + Transformer = A General-Purpose Problem Solver
Paper • 2501.14176 • Published • 28 -
Towards General-Purpose Model-Free Reinforcement Learning
Paper • 2501.16142 • Published • 31 -
SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training
Paper • 2501.17161 • Published • 125 -
MaxInfoRL: Boosting exploration in reinforcement learning through information gain maximization
Paper • 2412.12098 • Published • 4
-
Sorihon/Wicked-Nebula-12B-Heretic
12B • Updated • 39 • 2 -
Vortex5/Wicked-Nebula-12B
Text Generation • 12B • Updated • 1.48k • 9 -
Vortex5/Azure-Starlight-12B
Text Generation • 12B • Updated • 64 • 7 -
DreadPoor/Famino-12B-Model_Stock
Text Generation • 12B • Updated • 43 • 23