leon-se/gemma-3-27b-it-FP8-Dynamic
Models that run well on RTX 5090
Note Runs well on vLLM, but will not fit under a standard SGLang setup. Very good model for its size.
Note Very fast with SGLang; can be sped up further by using easiest-ai-shawn/Phi-4-EAGLE3-sharegpt-unfiltered as a draft model.
Note Very good at tool calling and instruction following, but prone to unexpected hallucinations.
Note Good balance of size and ease of running.
Note On the smaller side; I recommend larger models.