Question about train data

by Michalea - opened May 15

May 15

Hello, thank you for the contribution.
I would like to ask whether you regenerated the data using GLM5.1 and then trained the EAGLE head on the regenerated data, or whether regeneration was skipped because it is a costly process.

xiaomenshen

AQ org May 16

edited May 16

For GLM-5.1, the training data was not regenerated; we directly reused the regenerated dataset from Kimi k2.5. However, Qwen3.5 397B A22B was trained on a freshly regenerated dataset.

zhangxjohn

May 26

I would like to inquire about the resources used to train GLM5.1. Could you share details such as the data size, GPU models and quantities, training duration, and whether online or offline training mode was used? Additionally, what were the number of epochs and the batch size?
Thank you for your great contribution.
