deepseek-ai/DeepSeek-V4-Flash Text Generation • 158B • Updated about 10 hours ago • 669k • 960
deepseek-ai/DeepSeek-V4-Pro Text Generation • 862B • Updated about 10 hours ago • 787k • 3.63k
MHLA: Restoring Expressivity of Linear Attention via Token-Level Multi-Head Paper • 2601.07832 • Published Jan 12 • 52