Article A Review on the Evolvement of Load Balancing Strategy in MoE LLMs: Pitfalls and Lessons Feb 4, 2025 • 31
Gauss Gym Datasets Collection Datasets used for the gauss gym photorealistic simulator • 4 items • Updated Oct 17, 2025 • 8
Add-it: Training-Free Object Insertion in Images With Pretrained Diffusion Models Paper • 2411.07232 • Published Nov 11, 2024 • 68
Simplifying, Stabilizing and Scaling Continuous-Time Consistency Models Paper • 2410.11081 • Published Oct 14, 2024 • 18
steiner-preview Collection Reasoning models trained on synthetic data using reinforcement learning. • 3 items • Updated Oct 20, 2024 • 33
Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model Paper • 2408.11039 • Published Aug 20, 2024 • 63
Article Decoding GPT-4'o': In-Depth Exploration of Its Mechanisms and Creating Similar AI. May 21, 2024 • 35
Prometheus 2: An Open Source Language Model Specialized in Evaluating Other Language Models Paper • 2405.01535 • Published May 2, 2024 • 124
PaliGemma Release Collection Pretrained and mix checkpoints for PaliGemma • 16 items • Updated 3 days ago • 150
Phi-3 Collection Phi-3 family of small language and multi-modal models. Language models are available in short- and long-context lengths. • 25 items • Updated 13 days ago • 575