EAGLE3 support?
Thanks for this model. I always love seeing what new technologies you all are pushing and making available to everyone.
Will the new diffusion method still support the EAGLE3 model for Gemma 4 26B?
Also, vLLM couldn't identify the "diffusion_gemma" model type in the vLLM Docker container you suggested.
I upgraded transformers:
uv pip install --system --upgrade transformers && vllm serve
...
This gets me past the error:
Value error, The checkpoint you are trying to load has model type `diffusion_gemma` but Transformers does not recognize this architecture. This could be because of an issue with the checkpoint, or because your version of Transformers is out of date.
But then I get other errors:
ValueError: Argument input_ids not found in the forward method of <class 'transformers.models.diffusion_gemma.modeling_diffusion_gemma.DiffusionGemmaDecoderModel'>
[rank0]:[W610 18:26:49.025723372 ProcessGroupNCCL.cpp:1575] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator())
...
ValueError: Argument input_ids not found in the forward method of <class 'transformers.models.diffusion_gemma.modeling_diffusion_gemma.DiffusionGemmaDecoderModel'>
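In case it helps with triage, here is a minimal sketch of how one could check whether a model class's forward method declares an `input_ids` parameter. I am assuming the check behind the error above is some form of signature inspection; I have not confirmed that. `DummyDecoder` and `accepts_argument` are hypothetical names made up for illustration, not the real `DiffusionGemmaDecoderModel` or any vLLM/Transformers API.

```python
import inspect


class DummyDecoder:
    """Hypothetical stand-in; NOT the real DiffusionGemmaDecoderModel."""

    def forward(self, decoder_input_ids, attention_mask=None):
        # The body is irrelevant here; only the signature is inspected.
        return decoder_input_ids


def accepts_argument(cls, name, method="forward"):
    """Return True if cls.<method> declares a parameter called `name`."""
    params = inspect.signature(getattr(cls, method)).parameters
    return name in params


# Report which argument names the stand-in forward method declares.
for arg in ("input_ids", "decoder_input_ids"):
    print(arg, accepts_argument(DummyDecoder, arg))
```

Running the same helper against the actual model class from the installed Transformers version should show which argument names its forward method really exposes.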
I also gave the nightly container a whirl:
nightly-2c9c07c85e56c799afffd5a671a8a0bace377a39
Lastly, the "Gemma4" tag for that container doesn't seem to exist:
https://hub.docker.com/r/vllm/vllm-openai/tags?name=gemma4
But I do see your note that the vLLM container and options might change and that one should keep an eye on the releases. Just wanted to document this. Thanks for the awesome models! I am very excited about Nemotron 3 Ultra! I wiped out my 4xH100 GPUs so I could run it, but sadly I still don't have enough VRAM.
vLLM GitHub PR:
https://github.com/vllm-project/vllm/pull/45163