Why is it in F32?

#1
by Theli - opened

Hey! Grats on the great model, but why is this one in F32 while both SFT and Base use BF16? I couldn't find anything about it in the paper.

Hi, thank you for the question. The model was saved in fp32 during training, and we have uploaded a bf16 version here:
https://huggingface.co/ByteDance-Seed/Seed-Coder-8B-Reasoning-bf16
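For anyone wondering what the dtype difference means in practice, here is a rough back-of-the-envelope sketch of the weight storage for each format. It assumes a nominal 8e9 parameters, read off the "8B" in the model name; the exact count may differ, and the helper name `weights_size_gib` is just made up for illustration.

```python
# Bytes used to store a single parameter in each dtype.
BYTES_PER_PARAM = {"fp32": 4, "bf16": 2}

# Assumption: nominal parameter count taken from the "8B" in the model name.
# The real checkpoint may hold somewhat more or fewer parameters.
NOMINAL_PARAMS = 8_000_000_000


def weights_size_gib(num_params: int, dtype: str) -> float:
    """Approximate size of the raw weights in GiB for the given dtype."""
    return num_params * BYTES_PER_PARAM[dtype] / 2**30


if __name__ == "__main__":
    for dtype in ("fp32", "bf16"):
        size = weights_size_gib(NOMINAL_PARAMS, dtype)
        print(f"{dtype}: ~{size:.1f} GiB")
```

Under that assumption the fp32 weights come to roughly 30 GiB and the bf16 weights to roughly 15 GiB, so the bf16 upload should be about half the download. If you already have the fp32 checkpoint, passing `torch_dtype=torch.bfloat16` to `from_pretrained` in transformers should also cast the weights on load.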
