metadata
base_model: THUDM/GLM-4-32B-0414
library_name: peft
Checkpoint taken at 40% of an epoch (~40M tokens seen). It produces some interesting output, but results are inconsistent, which makes it a potential target for stabilizing RL. Saved in case later checkpoints get worse.
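Since the metadata lists `peft` as the library and `THUDM/GLM-4-32B-0414` as the base model, loading this checkpoint would presumably look something like the sketch below. This is a minimal example under stated assumptions, not a tested recipe for this repository: `ADAPTER_PATH` is a hypothetical placeholder for wherever the adapter weights live, the helper names `load_checkpoint` and `sample` are illustrative, and `device_map="auto"` assumes `accelerate` is installed and enough GPU memory is available for a 32B model.

```python
# Minimal sketch: attach a PEFT adapter checkpoint to its base model.
# Assumptions (not stated in the card): the adapter location and the dtype/device settings.

BASE_MODEL_ID = "THUDM/GLM-4-32B-0414"  # taken from the card's base_model field
ADAPTER_PATH = "path/to/this-adapter"   # hypothetical placeholder, replace with the real repo id or local dir


def load_checkpoint(adapter_path=ADAPTER_PATH, base_model_id=BASE_MODEL_ID):
    """Load the base model, then wrap it with the adapter weights."""
    # Imported lazily so the module can be imported without the heavy dependencies.
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    tokenizer = AutoTokenizer.from_pretrained(base_model_id)
    base = AutoModelForCausalLM.from_pretrained(
        base_model_id,
        torch_dtype="auto",   # use the dtype stored in the base checkpoint
        device_map="auto",    # assumption: accelerate is installed
    )
    model = PeftModel.from_pretrained(base, adapter_path)
    model.eval()
    return model, tokenizer


def sample(model, tokenizer, prompt, n=3, max_new_tokens=128):
    """Draw several sampled completions to eyeball how consistent the output is."""
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(
        **inputs,
        do_sample=True,
        max_new_tokens=max_new_tokens,
        num_return_sequences=n,
    )
    return [tokenizer.decode(o, skip_special_tokens=True) for o in outputs]
```

Calling `model, tokenizer = load_checkpoint()` followed by `sample(model, tokenizer, "some prompt")` returns a few sampled completions side by side, which is a quick way to see the inconsistency mentioned above before deciding whether to keep this checkpoint.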