view article Article makeMoE: Implement a Sparse Mixture of Experts Language Model from Scratch May 7, 2024 • 111
view article Article Prefill and Decode for Concurrent Requests - Optimizing LLM Performance Apr 16 • 56
facebook/dinov3-vitl16-pretrain-lvd1689m Image Feature Extraction • 0.3B • Updated Aug 19 • 340k • 92
facebook/dinov3-vit7b16-pretrain-lvd1689m Image Feature Extraction • 7B • Updated Aug 19 • 13.2k • 196