Question regarding EAGLE3 training and MTP

#1
by jhjhjh777 - opened

Hi there, thanks for sharing this model.

I was wondering how the EAGLE3 draft model for GLM-4.7-Flash was trained. Did you use a specific framework or repository for the training process?

Also, since GLM already has native support for Multi-Token Prediction (MTP), I'm curious about your main motivation for choosing to train an EAGLE3 model instead.

If you are open to sharing the training scripts, I would like to run a performance comparison between your EAGLE3 draft model and the native MTP, then share the results back with the community. It would be a great learning resource for those of us looking into custom speculative decoding setups.

Best regards,
