stas 
posted an update 10 days ago
PSA for DeepSpeed users: a long-standing, critical precision-related bug has been identified and fixed in https://github.com/deepspeedai/DeepSpeed/pull/8066, and a new release has been made.

The issue was that mixed precision mode downcast buffers that had to stay in fp32. This massively impacted correctness for models with large static buffers, e.g. RoPE in Qwen3 models at long sequence lengths (32K+).
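To get a feel for why a downcast RoPE buffer hurts at long sequence lengths, here is a minimal sketch in pure Python. It is not DeepSpeed or Qwen3 code: `to_bf16` is a hypothetical helper that simulates bfloat16 rounding, and the head dimension (128) and RoPE base (1,000,000) are illustrative assumptions, not values taken from this post. It compares the rotation angle `position * inv_freq` computed from full-precision inverse frequencies against the same angle computed from frequencies that were rounded to bf16.

```python
import struct


def to_bf16(x: float) -> float:
    """Simulate rounding x to bfloat16 (round to nearest, ties to even).

    Hypothetical helper for illustration; valid for finite, normal values.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # bfloat16 keeps the top 16 bits of the float32 pattern. Adding this bias
    # before dropping the low 16 bits rounds to nearest, ties to even.
    bias = 0x7FFF + ((bits >> 16) & 1)
    rounded = ((bits + bias) & 0xFFFFFFFF) & 0xFFFF0000
    return struct.unpack("<f", struct.pack("<I", rounded))[0]


def max_angle_error(seq_len: int, head_dim: int = 128,
                    base: float = 1_000_000.0) -> float:
    """Worst-case RoPE angle error (radians) at the last position.

    head_dim and base are assumed values chosen for illustration.
    """
    pos = seq_len - 1
    worst = 0.0
    for i in range(0, head_dim, 2):
        inv_freq = base ** (-i / head_dim)
        # The rounding error in inv_freq is multiplied by the position index.
        err = abs(pos * inv_freq - pos * to_bf16(inv_freq))
        worst = max(worst, err)
    return worst


if __name__ == "__main__":
    for n in (2048, 32768):
        print(n, max_angle_error(n))
```

Because the rounding error in each frequency is multiplied by the position index, the angle error grows linearly with sequence length. Under these assumptions it reaches multiple radians at 32K positions, which no later upcast to fp32 can recover.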

Hopefully this fix brings DeepSpeed close to parity with FSDP2, a gap that has been an issue for a long time.

You can still get the old behavior, but you now need to configure it manually. By default, the model's buffers will now remain in their original precision.

Please install deepspeed==0.19.2, which will do the right thing.
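If you want to confirm that an environment has picked up that release, a small standard-library check could look like the sketch below. The helper names (`parse_release`, `has_fix`, `installed_deepspeed_version`) are made up for illustration. The comparison only looks at the leading numeric part of the version string, so a pre-release such as a `.dev` build of the same number would be treated as having the fix.

```python
import re
from importlib import metadata

MIN_VERSION = "0.19.2"  # the release named in this post


def parse_release(version: str) -> tuple:
    """Return the leading numeric components of a version string as ints."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def has_fix(installed: str, minimum: str = MIN_VERSION) -> bool:
    """True if the installed version is at least the minimum version."""
    return parse_release(installed) >= parse_release(minimum)


def installed_deepspeed_version():
    """Return the installed deepspeed version string, or None if absent."""
    try:
        return metadata.version("deepspeed")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_deepspeed_version()
    if version is None:
        print("deepspeed is not installed")
    else:
        print(version, "has the fix" if has_fix(version) else "needs an upgrade")
```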

Thanks to Tunji Ruwase and Claude Opus 4.8 via Cursor for identifying and fixing the problem.