- Teaching Pretrained Language Models to Think Deeper with Retrofitted Recurrence
  Paper • 2511.07384 • Published • 20
- smcleish/Recurrent-Llama-3.2-train-recurrence-32
  Text Generation • 1B • Updated • 125 • 1
- smcleish/Recurrent-Llama-3.2-train-recurrence-16
  Text Generation • 1B • Updated • 13
- smcleish/Recurrent-Llama-3.2-train-recurrence-8
  Text Generation • 1B • Updated • 52
Collections
Collections trending this week
- ParoQuant: Pairwise Rotation Quantization for Efficient Reasoning LLM Inference
  Paper • 2511.10645 • Published • 13
- z-lab/Qwen3.6-27B-PARO
  Image-Text-to-Text • 6B • Updated • 5.79k • 27
- z-lab/Qwen3.6-35B-A3B-PARO
  Image-Text-to-Text • 6B • Updated • 3.47k • 6
- z-lab/gemma-4-31B-it-PARO
  Image-Text-to-Text • 6B • Updated • 1.73k • 21