edbeeching/DeepScaler-DeepSeek-R1-Distill-Qwen-1.5B-GRPO

3.57 GB

1 contributor

History: 2 commits

Latest commit: f13bdd8 (verified), "Training in progress, step 50", by edbeeching (HF Staff), 12 months ago

.gitattributes (1.57 kB): Training in progress, step 50, 12 months ago
chat_template.jinja (2.06 kB): Training in progress, step 50, 12 months ago
config.json (709 Bytes): Training in progress, step 50, 12 months ago
model.safetensors (3.55 GB, xet): Training in progress, step 50, 12 months ago
special_tokens_map.json (485 Bytes): Training in progress, step 50, 12 months ago
tokenizer.json (11.4 MB, xet): Training in progress, step 50, 12 months ago
tokenizer_config.json (4.49 kB): Training in progress, step 50, 12 months ago
training_args.bin (10.7 kB, xet): Training in progress, step 50, 12 months ago
Detected Pickle imports (14)
- transformers.trainer_utils.SchedulerType
- transformers.trainer_utils.IntervalStrategy
- accelerate.utils.dataclasses.DeepSpeedPlugin
- accelerate.utils.dataclasses.DistributedType
- transformers.trainer_utils.HubStrategy
- torch.device
- transformers.training_args.OptimizerNames
- torch.bfloat16
- accelerate.state.PartialState
- transformers.trainer_pt_utils.AcceleratorConfig
- transformers.trainer_utils.SaveStrategy
- transformers.integrations.deepspeed.HfTrainerDeepSpeedConfig
- open_r1.configs.GRPOConfig
- transformers.integrations.deepspeed.HfDeepSpeedConfig
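
The pickle imports reported for training_args.bin are the modules and names the file would import if it were unpickled. A list like this can be approximated locally without executing the pickle. The sketch below is not the Hub's scanner; it is a best-effort illustration using only the standard library. The function names `extract_pickle` and `list_pickle_imports` are hypothetical, and the code assumes the file is either a raw pickle or a zip archive with a `data.pkl` member (the layout that recent `torch.save` versions write).

```python
import io
import pickle
import pickletools
import zipfile
from collections import Counter, OrderedDict

STRING_OPS = {"SHORT_BINUNICODE", "BINUNICODE", "BINUNICODE8", "UNICODE"}
PUT_OPS = {"PUT", "BINPUT", "LONG_BINPUT"}
GET_OPS = {"GET", "BINGET", "LONG_BINGET"}


def extract_pickle(raw: bytes) -> bytes:
    """Return the pickle payload: the data.pkl member of a zip, else raw as-is."""
    buf = io.BytesIO(raw)
    if zipfile.is_zipfile(buf):
        with zipfile.ZipFile(buf) as zf:
            for member in zf.namelist():
                if member.endswith("data.pkl"):
                    return zf.read(member)
    return raw


def list_pickle_imports(data: bytes) -> list[str]:
    """List 'module.name' globals referenced by a pickle, without unpickling it."""
    imports = set()
    memo = {}      # memo index -> string (or None for non-string values)
    strings = []   # strings pushed so far, in push order
    top = None     # best-effort guess at the string on top of the stack
    for opcode, arg, _pos in pickletools.genops(data):
        name = opcode.name
        if name == "GLOBAL":
            # Protocols 0-3: the argument is "module name" in one string.
            module, attr = arg.split(" ", 1)
            imports.add(f"{module}.{attr}")
            top = None
        elif name == "STACK_GLOBAL":
            # Protocol 4+: module and name are the two most recent strings.
            if len(strings) >= 2:
                imports.add(f"{strings[-2]}.{strings[-1]}")
            top = None
        elif name in STRING_OPS:
            top = arg
            strings.append(arg)
        elif name == "MEMOIZE":
            memo[len(memo)] = top
        elif name in PUT_OPS:
            memo[arg] = top
        elif name in GET_OPS:
            # A previously memoized string may be reused as module or name.
            top = memo.get(arg)
            if isinstance(top, str):
                strings.append(top)
        else:
            top = None
    return sorted(imports)


if __name__ == "__main__":
    payload = pickle.dumps([OrderedDict(a=1), Counter("ab")], protocol=4)
    print(list_pickle_imports(extract_pickle(payload)))
```

For a downloaded copy of the file, one would read its bytes and pass them through `extract_pickle` and then `list_pickle_imports`. Names such as open_r1.configs.GRPOConfig in the list above indicate that actually unpickling the file requires those packages to be importable.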