Starting pilot training at Thu May 14 08:57:54 2026
Command: /usr/local/bin/python3.10 training/train_grpo.py --model-name Qwen/Qwen3-1.7B --max-steps 100 --num-generations 8 --gradient-accumulation-steps 8 --max-completion-length 512 --learning-rate 1e-05 --logging-steps 5 --output-dir /data/outputs/releaseops-grpo-pilot --metrics-json /data/outputs/grpo_env_metrics_pilot.json --best-loss-dir outputs/best_by_loss --bf16
Warning: installed TRL supports environment_factory but GRPOConfig does not expose env_kwargs_keys; dataset scenario columns may not be passed into reset(**kwargs).
Generating train split: 0 examples [00:00, ? examples/s]
Generating train split: 120 examples [00:00, 6533.69 examples/s]
Skipping GRPOConfig args unsupported by installed TRL: max_prompt_length
Traceback (most recent call last):
  File "/app/training/train_grpo.py", line 877, in <module>
    main()
  File "/app/training/train_grpo.py", line 822, in main
    training_args = build_config(args)
  File "/app/training/train_grpo.py", line 558, in build_config
    return GRPOConfig(**config_kwargs)
  File "<string>", line 179, in __init__
  File "/usr/local/lib/python3.10/site-packages/trl/trainer/grpo_config.py", line 879, in __post_init__
    super().__post_init__()
  File "/usr/local/lib/python3.10/site-packages/trl/trainer/base_config.py", line 107, in __post_init__
    super().__post_init__()
  File "/usr/local/lib/python3.10/site-packages/transformers/training_args.py", line 1576, in __post_init__
    self._validate_args()
  File "/usr/local/lib/python3.10/site-packages/transformers/training_args.py", line 1738, in _validate_args
    raise ValueError(error_message)
ValueError: Your setup doesn't support bf16/gpu. You need to assign use_cpu if you want to train the model on CPU.
