deepseek-ai/DeepSeek-V4-Flash Text Generation • 158B • Updated about 8 hours ago • 669k • 958
deepseek-ai/DeepSeek-V4-Pro Text Generation • 862B • Updated about 8 hours ago • 787k • 3.62k
TurboQuant: Online Vector Quantization with Near-optimal Distortion Rate Paper • 2504.19874 • Published Apr 28, 2025 • 34
H-Neurons: On the Existence, Impact, and Origin of Hallucination-Associated Neurons in LLMs Paper • 2512.01797 • Published Dec 1, 2025 • 9 • 1
OPUS: Towards Efficient and Principled Data Selection in Large Language Model Pre-training in Every Iteration Paper • 2602.05400 • Published Feb 5 • 353